publications
Iruskieta, M.; Arrieta, E.
Handwritten text recognition task over Basque exam papers (2022)
IMPACT 10th Anniversary Workshop and Writing Sprint. Alicante
Mikel Larrañaga, Itziar Aldabe, Ana Arruarte, Jon A. Elorriaga, Montse Maritxalar
A Qualitative Case Study on the Validation of Automatically Generated Multiple-Choice Questions From Science Textbooks (2022)
M. Larrañaga, I. Aldabe, A. Arruarte, J. A. Elorriaga and M. Maritxalar, "A Qualitative Case Study on the Validation of Automatically Generated Multiple-Choice Questions From Science Textbooks," in IEEE Transactions on Learning Technologies, vol. 15, no. 3, pp. 338-349, 1 June 2022, doi: 10.1109/TLT.2022.3171589.
Altuna, B., Iruskieta, M., Estarrona, A., Farwell, A., Arriola, J. M., Alkorta, J., Arregi, X.
CLARIAH-EUS: Building a Cross-border CLARIAH Node for the Basque Language (2022)
Digital Humanities Conference November 23-25. Budapest.
Eneko Agirre
Few-shot Information Extraction is Here: Pre-train, Prompt and Entail (2022)
SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Pérez-de-Viñaspre, Rodrigo Agerri
Euskararen erabilera Eusko Legebiltzarreko debateetan (2012-2020) (2022)
Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Pérez-de-Viñaspre, Rodrigo Agerri (2022). Euskararen erabilera Eusko Legebiltzarreko debateetan (2012-2020). In Mediatika, 19, 163-178.
Aitor Ormazabal, Mikel Artetxe, Manex Agirrezabal, Aitor Soroa, Eneko Agirre
PoeLM: A Meter- and Rhyme-Controllable Language Model for Unsupervised Poetry Generation (2022)
Findings of the Association for Computational Linguistics: EMNLP 2022
Itziar Glez Dios, Aitor Soroa, Hugo Laurençon, Lucile Saulnier, Thomas Wang, Christopher Akiki, Albert Villanova del Moral, Teven Le Scao, Leandro Von Werra, Chenghao Mou, Eduardo González Ponferrada, Huu Nguyen, Jörg Frohberg, Mario Šaško, Quentin Lhoest, Angelina McMillan-Major, Gérard Dupont, Stella Biderman, Anna Rogers, Loubna Ben Allal, Francesco de Toni, Giada Pistilli, Olivier Nguyen, Somaieh Nikpoor, Maraim Masoud, Pierre Colombo, Javier de la Rosa, Paulo Villegas, Tristan Thrush, et al.
The BigScience ROOTS Corpus: A 1.6 TB Composite Multilingual Dataset (2022)
2022. Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track
Gonzalez-Dios, Itziar and Altuna, Begoña
Natural Language Processing and Language Technologies for the Basque Language (2022)
Gonzalez-Dios, Itziar and Altuna, Begoña (2022). Natural Language Processing and Language Technologies for the Basque Language. In Cuadernos Europeos de Deusto. NÚMERO ESPECIAL. Linguas minoritarias e futuro de Europa. Minority Languages and the Future of Europe 26, 203-230. https://doi.org/10.18543/ced.2477 https://ced.revistas.deusto.es/issue/view/285
Mikel Iruskieta, Ainara Estarrona, Aritz Farwell, German Rigau
INTELE: promoviendo la participación en las infraestructuras: CLARIN y DARIAH (2022)
The International Congress on Libraries & Digital Humanities: projects and challenges
Itziar Gonzalez-Dios, Iker Gutiérrez-Fandiño, Oscar M. Cumbicus-Pineda, Aitor Soroa
IrekiaLF_es: a new open benchmark and baseline systems for Spanish Automatic Text Simplification (2022)
Gonzalez-Dios, I., Gutiérrez-Fandiño, I., Cumbicus-Pineda, O. M., & Soroa, A. (2022, December). IrekiaLFes: a new open benchmark and baseline systems for Spanish automatic text simplification. In Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022) (pp. 86-97).
Xabier Soto, Olatz Perez-De-Viñaspre, Gorka Labaka, Maite Oronoz
Comparing and combining tagging with different decoding algorithms for back-translation in NMT: learnings from a low resource scenario (2022)
In Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pages 31–40, Ghent, Belgium. European Association for Machine Translation.
Ona de Gibert Bonet, Iakes Goenaga, Olatz Perez-de-Viñaspre, Jordi Armengol-Estapé, Carla Parra Escartín, Marina Sanchez, Mārcis Pinnis, Gorka Labaka and Maite Melero
Unsupervised Machine Translation in Real-World Scenarios (2022)
Proceedings of the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022)
Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri
BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions (2022)
Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 3382–3390, Marseille, France. European Language Resources Association.
Mikel Iruskieta, Ainara Estarrona, Aritz Farwell, German Rigau
INTELE: promoviendo la participación en las infraestructuras ERIC CLARIN y DARIAH (2022)
Boletín ANABAD. LXXII (2022), NÚM. 2, ABRIL-JUNIO. MADRID. ISSN: 2794-0519 (USB) - 2444-7293 (Internet)
Mikel Artetxe, Itziar Aldabe, Rodrigo Agerri, Olatz Perez-de-Viñaspre, Aitor Soroa
Does Corpus Quality Really Matter for Low-Resource Languages? (2022)
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 7383–7390.
Izaskun Aldezabal, Jose Mari Arriola, Arantxa Otegi
TZOS: an Online Terminology Database Aimed at Working on Basque Academic Terminology Collaboratively (2022)
Proceedings of the 13th Language Resources and Evaluation Conference. Editors: Nicoletta Calzolari (Conference chair), Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
I. Aduriz, I. Alegria, I. Aldezabal, X. Artola, A. Díaz de Ilarraza, N. Ezeiza, K. Sarasola, R. Urizar
Euskara (batua) ingurune digitalean: bidean ikasiXa eta etorkizuneko erronkak (2022)
Arantzazutik mundu zabalera: Euskararen normatibizazioa: 1968-2018. “Euskara (batua) ingurune digitalean: bidean ikasiXa eta etorkizuneko erronkak”. Andres M. Urrutia (Arg.). Iker bilduma. 455-470. Euskaltzaindia – Iberoamericana-Vervuert. 2022
Aitor Almeida, Unai Bermejo, Aritz Bilbao, Gorka Azkune, Unai Aguilera, Mikel Emaldi, Fadi Dornaika, Ignacio Arganda-Carreras
A Comparative Analysis of Human Behavior Prediction Approaches in Intelligent Environments (2022)
Sensors, vol 22, Issue 3, pp 701
Marta Gianzo, Itziar Urizar-Arenaza, Iraia Muñoa-Hoyos, Gorka Labaka, Zaloa Larreategui, Nicolás Garrido, Jon Irazusta, Nerea Subirán
Sperm aminopeptidase N identifies the potential for high-quality blastocysts and viable embryos in oocyte-donation cycles (2022)
Human Reproduction, Volume 37, Issue 10, October 2022, Pages 2246–2254
Cristina Aceta, Izaskun Fernandez, Aitor Soroa
KIDE4I: A Generic Semantics-Based Task-Oriented Dialogue System for Human-Machine Interaction in Industry 5.0 (2022)
Applied Sciences 12, no. 3: 1192
Blanca Calvo Figueras, Montse Cuadros, Rodrigo Agerri
A Semantics-Aware Approach to Automated Claim Verification (2022)
In Proceedings of the Fifth Fact Extraction and VERification Workshop (FEVER), pages 37–48, Dublin, Ireland. Association for Computational Linguistics
Jeremy Barnes, Laura Oberlaender, Enrica Troiano, Andrey Kutuzov, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal
SemEval 2022 Task 10: Structured Sentiment Analysis (2022)
In SemEval 2022
Gorka Urbizu, Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri and Aitor Soroa
BasqueGLUE: A Natural Language Understanding Benchmark for Basque (2022)
LREC 2022
Maxime Masson, Christian Sallaberry, Rodrigo Agerri, Marie-Noelle Bessagnet, Philippe Roose, Annig Le Parc Lacayrelle
A Domain-Independent Method for Thematic Dataset Building from Social Media: The Case of Tourism on Twitter (2022)
In: Chbeir, R., Huang, H., Silvestri, F., Manolopoulos, Y., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2022. WISE 2022. Lecture Notes in Computer Science, vol 13724. Springer, Cham.
Iker Garcia-Ferrero, Rodrigo Agerri, German Rigau
Model and Data Transfer for Cross-Lingual Sequence Labelling in Zero-Resource Settings (2022)
Findings of the Association for Computational Linguistics: EMNLP 2022
María Jesús Aranzabe, Izaskun Aldezabal, Igone Zabala
Recursos y Herramientas de Lingüística de Corpus y PLN para la Monitorización e Investigación de los Usos Académicos del Euskera (2022)
III. workshop de INTELE (Infraestructura de Tecnologías del Lenguaje). Madrid, 13-14 September (poster presented at the workshop)
María Jesús Aranzabe, Antton Gurrutxaga, Igone Zabala
Compilación del corpus académico de noveles en euskera HARTAeus y su explotación para el estudio de la fraseología académica (2022)
Procesamiento del Lenguaje Natural, Revista no 69, septiembre de 2022, pp. 95-103
Margarita Alonso Ramos, Igone Zabala
HARTAes-vas: Lexical combinations for an academic writing aid tool in Spanish and Basque (2022)
SEPLN-PD 2022. Annual Conference of the Spanish Association for Natural Language Processing 2022: Projects and Demonstrations, September 21-23, 2022, A Coruña, España.
Elisa Sanchez-Bayona, Rodrigo Agerri
From Automatic Metaphor Processing in Spanish to a Multilingual Perspective: Annotation, Systems, and Evaluation (2022)
Doctoral Symposium on Natural Language Processing from the PLN.net network 2022 (RED2018-102418-T), 21-23 September 2022, A Coruña, Spain.
Elisa Sanchez-Bayona, Rodrigo Agerri
Leveraging a New Spanish Corpus for Multilingual and Crosslingual Metaphor Detection (2022)
Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL), pages 228–240, Abu Dhabi, United Arab Emirates, Association for Computational Linguistics.
Itziar Aldabe, Jane Dunne, Aritz Farwell, Owen Gallagher, Federico Gaspari, Maria Giagkou, Jan Hajic, Jens Peter Kückens, Teresa Lynn, Georg Rehm, German Rigau, Katrin Marheinecke, Stelios Piperidis, Natalia Resende, Tea Vojtěchová, Andy Way
Overview of the ELE Project (2022)
Proceedings of the 23rd Annual Conference of the European Association for Machine Translation, pp. 353–354
Itziar Aldabe, Aritz Farwell, Eva Navas, Inma Hernaez, German Rigau
ELE Project: an overview of the desk research (2022)
Proc. IberSPEECH 2022, 231-234
A Garcia Olea, I Valdelvira Vazquez, I Diez Gonzalez, A Atutxa Salazar, K Gojenola Galletebeitia, J M Ormaetxe Merodio
Prediction of new onset atrial fibrillation recurrence or persistence with artificial intelligence: first insights of the PRAFAI study (2022)
European Heart Journal - Digital Health, Volume 3, Issue 4, December 2022
Jose Mari Arriola
Nolakoa izango da komunikazioa Gizaki-Makina Aroan? (animazioak eta inkesta) (2022)
IRAKASBIL
Jose Mari Arriola
MACHINE TRANSLATION AS AN AID FOR WRITING BY COMPUTER SCIENCE UNIVERSITY STUDENTS (2022)
15th annual International Conference of Education, Research and Innovation, 7-9 November, 2022 Seville, Spain
Oscar Cumbicus-Pineda, Iker Gutiérrez-Fandiño, Itziar Gonzalez-Dios, Aitor Soroa
Noisy Channel for Automatic Text Simplification (2022)
Cumbicus-Pineda, O. M., Gutiérrez-Fandiño, I., Gonzalez-Dios, I., & Soroa, A. (2022). Noisy Channel for Automatic Text Simplification. arXiv preprint arXiv:2211.03152.
Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., Soroa, A., Gonzalez-Dios, I,... & Manica, M.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (2022)
Scao, T. L., Fan, A., Akiki, C., Pavlick, E., Ilić, S., Hesslow, D., ... & Manica, M. (2022). BLOOM: A 176B-Parameter Open-Access Multilingual Language Model. arXiv preprint arXiv:2211.05100.
Nora Hollenstein, Itziar Gonzalez-Dios, Lisa Beinborn, and Lena Jäger
Patterns of text readability in human and predicted eye movements (2022)
Nora Hollenstein, Itziar Gonzalez-Dios, Lisa Beinborn, and Lena Jäger. 2022. Patterns of Text Readability in Human and Predicted Eye Movements. In Proceedings of the Workshop on Cognitive Aspects of the Lexicon, pages 1–15, Taipei, Taiwan. Association for Computational Linguistics.
Petter Mæhlum, Andre Kåsen, Samia Touileb, and Jeremy Barnes
Annotating Norwegian language varieties on Twitter for Part-of-speech (2022)
Proceedings of the Ninth Workshop on NLP for Similar Languages, Varieties and Dialects
David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, and Erik Velldal
Direct Parsing to Sentiment Graphs (2022)
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 470–478
Mikel Osinalde, Mikel Iruskieta
Hizkuntza ikasleen testu corpus etiketatuaren analisia eta interpretazioa B2 eta C1 mailetan (2022)
eHizpide 100
Mikel Iruskieta, Mari Mar Boillos
Aproximación al género Trabajo de Fin de Grado en euskera: hacia una identificación de las características lingüístico-discursivas (2022)
In Elena Alarcón, José Sanchez-Santamaria, Purificación Cruz (Coord.) Nuevos contenidos para una nueva docencia, 283-296
Xabier Soto, Olatz Pérez-de-Viñaspre, Maite Oronoz, Gorka Labaka
Development of a Machine Translation system for promoting the use of a low resource language in the clinical domain: the case of Basque (2022)
Chapter 7 in Natural Language Processing in Healthcare: A Special Focus on Low Resource Languages. Routledge, Taylor & Francis Group.
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli
European Clinical Case Corpus (2022)
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Anne-Lyse Minard, Manuela Speranza, and Roberto Zanoli (2022). European Clinical Case Corpus. Georg Rehm ed. European Language Grid, A Language Technology Platform for Multilingual Europe. Springer, Cham, Switzerland. https://doi.org/10.1007/978-3-031-17258-8
Gildo Fabregat, Ander Cejudo, Juan Martinez-Romo, Alicia Pérez, Lourdes Araujo, Nuria Lebeña, Maite Oronoz, Arantza Casillas
Approximate Nearest Neighbour Extraction Techniques and Neural Networks for Suicide Risk Prediction in the CLPsych 2022 Shared Task (2022)
Accepted in CLPsych 2022 Shared Task, July 15th 2022
E Agirre, M Apidianaki, I Vulić
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2022)
Proceedings of Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. Association for Computational Linguistics, Dublin, Ireland
Oscar Sainz, Haoling Qiu, Oier Lopez de Lacalle, Eneko Agirre, Bonan Min
ZS4IE: A toolkit for Zero-Shot Information Extraction with simple Verbalizations (2022)
In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Demonstrations, Seattle, Washington. Association for Computational Linguistics.
Oscar Sainz, Itziar Gonzalez-Dios, Oier Lopez de Lacalle, Bonan Min, Eneko Agirre
Textual Entailment for Event Argument Extraction: Zero- and Few-Shot with Multi-Source Learning (2022)
In Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, Washington. Association for Computational Linguistics.
Jon Alkorta, Mikel Iruskieta
Adding the Basque Parliament Corpus to ParlaMint Project (2022)
ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora: 107–110
Ibarra, I. and Iruskieta, M.
Corpus lingüísticos, smartpen y whatsapp: Intervención en escritura de una madre con sus hijos (2022)
IV Congreso internacional en Inclusión Social y Educativa: CIISE
Irune Ibarra, Mikel Iruskieta
Disgrafia hobetzeko esku-hartzea idazkailu digitala erabiliz (2022)
UZTARO 121, 155-178
Mikel Iruskieta
Herramientas Digitales para las Humanidades Digitales en la e-infraestructura CLARIN (2022)
Creación de un proyecto en humanidades digitales basado en el análisis de textos: modelado y procesamiento
Harritxu Gete, Thierry Etchegoyhen, David Ponce, Gorka Labaka, Nora Aranberri, Ander Corral, Xabier Saralegi, Igor Ellakuria and Maite Martin
TANDO: A Corpus for Document-level Machine Translation (2022)
Proceedings of the 13th Edition of the Language Resources and Evaluation Conference (LREC 2022)
Aitor Ormazabal, Mikel Artetxe, Aitor Soroa, Gorka Labaka, Eneko Agirre
Principled Paraphrase Generation with Parallel Corpora (2022)
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1621-1638
Owen Trigueros, Alberto Blanco, Nuria Lebeña, Arantza Casillas, Alicia Pérez
Explainable ICD multi-label classification of EHRs in Spanish with convolutional attention (2022)
International Journal of Medical Informatics
Nuria Lebeña, Alberto Blanco, Alicia Pérez, Arantza Casillas
Preliminary exploration of topic modelling representations for Electronic Health Records coding according to the International Classification of Diseases in Spanish (2022)
Expert Systems with Applications
Alberto Blanco, Sonja Remmer, Alicia Pérez, Hercules Dalianis, Arantza Casillas
Implementation of specialised attention mechanisms: ICD-10 classification of Gastrointestinal discharge summaries in English, Spanish and Swedish (2022)
Journal of Biomedical Informatics
Itxaso Alayo, Ander Merketegi, Maite Oronoz, Arantza Casillas, Alicia Pérez, Olatz Garin, Isabel Moreira, Montse Ferrer, Jordi Alonso, Yolanda Pardo
A baseline model for the automation of the systematic review of Patient-Reported Outcomes measures: the case of the BiblioPRO virtual library (2022)
Jornada científica CIBERESP 2022 (https://jornadacientifica.ciberesp.es/). Centro de Investigación Biomédica en Red, Epidemiología y Salud Pública.
Alberto Blanco, Alicia Pérez, Arantza Casillas
Exploiting ICD Hierarchy for Classification of EHRs in Spanish Through Multi-Task Transformers (2022)
IEEE Journal of Biomedical and Health Informatics
Arantxa Otegi, Iñaki San Vicente, Xabier Saralegi, Anselmo Peñas, Borja Lozano, Eneko Agirre
Information retrieval and question answering: A case study on COVID-19 scientific literature (2022)
Knowledge-Based Systems, Volume 240.