publications

Alain García Olea, Ane García Domingo-Aldama, Marcos Merino Prado, Ignacio Díez González, Aitziber Atutxa Salazar, Josu Goikoetxea Salutregi, Koldo Gojenola Galletebeitia, Mikel Maeztu Rada, Iván Cano González, Adrián Costa Santos, Iván García Díaz, Fernando Díaz González, Irene Hernández Pérez, Uxue Millet Oyarzabal and José Miguel Ormaetxe Merodio

Rendimiento de sistemas de chat alimentados con artículos de investigación en un entorno clínico específico: la enfermedad valvular cardiaca (2024)

Revista Española de Cardiología. Rev Esp Cardiol. 2024;77 (Supl 1): 1161

Alain García Olea, Ane García Domingo-Aldama, Marcos Merino Prado, Koldo Gojenola Galletebeitia, Aitziber Atutxa Salazar, Mikel Maeztu Rada, Iván García Díaz, Adrián Costa, Iván Cano, Fernando Díaz, Irene Hernández, Uxue Millet, Ainhoa Etxenike, José Miguel Ormaetxe Merodio

Rendimiento de las expresiones regulares en el análisis de informes de alta presentes en la historia clínica electrónica: exprimiendo los datos secundarios (2024)

Revista Española de Cardiología. Rev Esp Cardiol. 2024;77 (Supl 1): 33

Iakes Goenaga, Aitziber Atutxa, Koldo Gojenola, Maite Oronoz, Rodrigo Agerri

Explanatory argument extraction of correct answers in resident medical exams (2024)

Artificial Intelligence in Medicine Volume 157, November 2024, 102985

Angelina McMillan-Major, Francesco De Toni, Zaid Alyafeai, Stella Biderman, Kimbo Chen, Gérard Dupont, Hady Elsahar, Chris Emezue, Alham Fikri Aji, Suzana Ilić, Nurulaqilla Khamis, Colin Leong, Maraim Masoud, Aitor Soroa, Pedro Ortiz Suarez, Daniel van Strien, Zeerak Talat, Yacine Jernite

Documenting Geographically and Contextually Diverse Language Data Sources (2024)

@article{mcmillan2024, author = {McMillan-Major, Angelina and De Toni, Francesco and Alyafeai, Zaid and Biderman, Stella and Chen, Kimbo and Dupont, G\'{e}rard and Elsahar, Hady and Emezue, Chris and Fikri Aji, Alham and Ili\'{c}, Suzana and Khamis, Nurulaqilla and Leong, Colin and Masoud, Maraim and Soroa, Aitor and Ortiz Suarez, Pedro and van Strien, Daniel and Talat, Zeerak and Jernite, Yacine}, title = "{Documenting Geographically and Contextually Diverse Language Data Sources}", journal = {Northern European Journal of Language Technology (NEJLT)}, volume = {10}, number = {1}, year = {2024}, issn = {2000-1533}, doi = {10.3384/nejlt.2000-1533.2024.5217}, url = {https://doi.org/10.3384/nejlt.2000-1533.2024.5217} }

Xabier Larrayoz, Arantza Casillas, Maite Oronoz, Alicia Pérez

Mental Disorder Detection in Spanish: Hands on Skewed Class Distribution to Leverage Training (2024)

Accepted. MentalRiskES at IberLEF 2023: Early Detection of Mental Disorders Risk in Spanish

Aitor García-Pablos, Naiara Perez, Montse Cuadros, Jaione Bengoetxea

EuSQuAD: Automatically Translated and Aligned SQuAD2.0 for Basque (2024)

Procesamiento del Lenguaje Natural, no. 73, September 2024, pp. 125-137

Unai Atutxa Barrenetxea, Iker de la Iglesia, Mikel Iruskieta

IGARRITZ: el predictor de palabras para el euskera basado en la inteligencia artificial y su evaluación en el entorno escolar (2024)

III Congreso Internacional de Nuevas Tecnologías y Tendencias en la Educación, pp. 181-194. Dykinson.

Mikel Iruskieta, Iker de la Iglesia, Unai Atutxa, Lierni Ortiz

IGARRITZ: euskarazko testu iragarpenerako web ingurune egokitua (2024)

Ekaia

Nuria Lebeña, Alicia Pérez, Arantza Casillas

Quantifying decision support level of explainable automatic classification of diagnoses in Spanish medical records (2024)

Nuria Lebeña, Alicia Pérez, Arantza Casillas, Quantifying decision support level of explainable automatic classification of diagnoses in Spanish medical records, Computers in Biology and Medicine, Volume 182, 2024, 109127, ISSN 0010-4825, https://doi.org/10.1016/j.compbiomed.2024.109127. (https://www.sciencedirect.com/science/article/pii/S0010482524012125)

Iñigo Alonso, Maite Oronoz, Rodrigo Agerri

MedExpQA: Multilingual benchmarking of Large Language Models for Medical Question Answering (2024)

Artificial Intelligence in Medicine, 2024.

Margot Madina, Itziar Gonzalez-Dios, Melanie Siegel

Towards Reliable E2R Texts: A Proposal for Standardized Evaluation Practices (2024)

Madina, M., Gonzalez-Dios, I., & Siegel, M. (2024, July). Towards reliable E2R texts: a proposal for standardized evaluation practices. In International Conference on Computers Helping People with Special Needs (pp. 224-231). Cham: Springer Nature Switzerland.

Francesca De Luca Fornaciari, Begoña Altuna, Itziar Gonzalez-Dios, Maite Melero

A Hard Nut to Crack: Idiom Detection with Conversational Large Language Models (2024)

Proceedings of the 4th Workshop on Figurative Language Processing (FigLang 2024), pages 35–44

Iñigo Alonso, Eneko Agirre, Mirella Lapata

PixT3: Pixel-based Table To Text generation (2024)

Proceedings of the 2024 Main Conference of the Association for Computational Linguistics (ACL 2024)

Ahmed Elhady, Khaled Elsayed, Eneko Agirre, and Mikel Artetxe

Improving Factuality in Clinical Abstractive Multi-Document Summarization by Guided Continued Pre-training (2024)

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 755–761, Mexico City, Mexico. Association for Computational Linguistics.

Mikel Zubillaga, Oscar Sainz, Ainara Estarrona, Oier Lopez de Lacalle, Eneko Agirre

Event Extraction in Basque: Typologically motivated Cross-Lingual Transfer-Learning Analysis (2024)

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, Turin, Italy

Eneko Agirre, Itziar Aldabe, Xabier Arregi, Mikel Artetxe, Unai Atutxa, Ekhi Azurmendi, Iker De la Iglesia, Julen Etxaniz, Victor García-Romillo, Inma Hernaez-Rioja, Asier Herranz, Mikel Iruskieta, Oier López de Lacalle, Eva Navas, Paula Ontalvilla, Aitor Ormazabal, Naiara Perez, German Rigau, Oscar Sainz, Jon Sanchez, Ibon Saratxaga, Aitor Soroa, Christoforos Souganidis, Jon Vadillo and Aimar Zabala

IKER-GAITU: research on language technology for Basque and other low-resource languages (2024)

-

Olia Toporkov, Rodrigo Agerri

On the Role of Morphological Information for Contextual Lemmatization (2024)

Computational Linguistics (MIT Press).

Adrián Núñez-Marcos, Ignacio Arganda-Carreras

Transformer-based fall detection in videos (2024)

Núñez-Marcos, A., & Arganda-Carreras, I. (2024). Transformer-based fall detection in videos. Engineering Applications of Artificial Intelligence, 132, 107937.

Júlia Falcão, Claudia Borg, Nora Aranberri, and Kurt Abela

COMET for Low-Resource Machine Translation Evaluation: A Case Study of English-Maltese and Spanish-Basque (2024)

In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 3553–3565, Torino, Italia. ELRA and ICCL.

Nora Aranberri

Analysis of the Annotations from a Crowd MT Evaluation Initiative: Case Study for the Spanish-Basque Pair (2024)

Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), pages 548–559 June 24-27.

Maite Oronoz, Sara Gracia, Jose Mari González, Alicia Pérez

Suizidio-zantzuak sare sozialetan: ingelesez eta gaztelaniaz hizkuntza-ezaugarriak berdinak al dira? (2024)

EKAIA: Zientzia eta Teknologia aldizkaria, 2024, issue XX.

Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri

Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques (2024)

Proceedings of the 2024 Main Conference of the Association for Computational Linguistics (ACL 2024). August 11th to 16th, 2024. Bangkok, Thailand

Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier Lopez de Lacalle, Mikel Artetxe

Do Multilingual Language Models Think Better in English? (2024)

In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 550–564, Mexico City, Mexico. Association for Computational Linguistics.

Maite Heredia, Julen Etxaniz, Muitze Zulaika, Xabier Saralegi, Jeremy Barnes, Aitor Soroa

XNLIeu: a dataset for cross-lingual NLI in Basque (2024)

In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 4177–4188, Mexico City, Mexico. Association for Computational Linguistics.

Jaione Bengoetxea, Yi-Ling Chung, Marco Guerini, Rodrigo Agerri

Basque and Spanish Counter Narrative Generation: Data Creation and Evaluation (2024)

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 2132–2141

Julen Etxaniz, Oscar Sainz, Naiara Perez Miguel, Itziar Aldabe, German Rigau, Eneko Agirre, Aitor Ormazabal, Mikel Artetxe, Aitor Soroa

Latxa: An Open Language Model and Evaluation Suite for Basque (2024)

Proceedings of the 2024 Main Conference of the Association for Computational Linguistics (ACL 2024)

Jordan Koontz, Maite Oronoz, Alicia Pérez

Ixa-Med at Discharge Me! Retrieval-Assisted Generation for Streamlining Discharge Documentation (2024)

BioNLP Discharge-Me Shared Task @ ACL

Tomaž Erjavec, Matyáš Kopp, Nikola Ljubešić, Taja Kuzman, Paul Rayson, Petya Osenova, Maciej Ogrodniczuk, Çağrı Çöltekin, Danijel Koržinek, Katja Meden, Jure Skubic, Peter Rupnik, Tommaso Agnoloni, José Aires, Starkaður Barkarson, Roberto Bartolini, Núria Bel, María Calzada, Roberts Darģis, Sascha Diwersy, Maria Gavriilidou, Ruben van Heusden, Mikel Iruskieta, Neeme Kahusk, Anna Kryvenko, Noémi Ligeti-Nagy, Carmen Magariños, Martin Mölder, Costanza Navarretta, Kiril Simov et al.

ParlaMint II: Advancing Comparable Parliamentary Corpora Across Europe (2024)

PREPRINT (Version 1) available at Research Square

Iker de la Iglesia, Unai Atutxa Barrenetxea, Lierni Ortiz Elorza, Mikel Iruskieta

IGARRITZ: predicción de textos en euskera para la escritura con la mirada (2024)

-

Iker García-Ferrero, Rodrigo Agerri, Aitziber Atutxa Salazar, Elena Cabrio, Iker de la Iglesia, Alberto Lavelli, Bernardo Magnini, Benjamin Molinet, Johana Ramirez-Romero, German Rigau, Jose Maria Villa-Gonzalez, Serena Villata, Andrea Zaninello

MedMT5: An Open-Source Multilingual Text-to-Text LLM for The Medical Domain (2024)

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Unai Atutxa Barrenetxea, Iker de la Iglesia, Lierni Ortiz Elorza, Mikel Iruskieta

Impacto de IGARRITZ en la producción de textos en euskera para personas con parálisis cerebral: Un estudio en entorno real (2024)

-

Janire Arana, Mikel Idoyaga, Maitane Urruela, Elisa Espina, Aitziber Atutxa, Koldo Gojenola

A Virtual Patient Dialogue System Based on Question-Answering on Clinical Records (2024)

The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino

Giulia Pensa, Begoña Altuna, and Itziar Gonzalez-Dios

A Multi-layered Approach to Physical Commonsense Understanding: Creation and Evaluation of an Italian Dataset (2024)

Pensa, G., Altuna, B., & Gonzalez-Dios, I. (2024, May). A Multi-layered Approach to Physical Commonsense Understanding: Creation and Evaluation of an Italian Dataset. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 819-831).

Margot Madina, Itziar Gonzalez-Dios, Melanie Siegel

A Preliminary Study of ChatGPT for Spanish E2R Text Adaptation (2024)

Madina, M., Gonzalez-Dios, I., & Siegel, M. (2024, May). A Preliminary Study of ChatGPT for Spanish E2R Text Adaptation. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 1422-1434).

Margot Madina, Itziar Gonzalez-Dios, and Melanie Siegel

LanguageTool as a CAT tool for Easy-to-Read in Spanish (2024)

Madina, M., Gonzalez-Dios, I., & Siegel, M. (2024, May). LanguageTool as a CAT tool for Easy-to-Read in Spanish. In Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI)@ LREC-COLING 2024 (pp. 93-101).

Irune Ibarra, Asunción Martínez-Arbelaiz, Jose Mari Arriola, Mikel Iruskieta

Medidas de proceso de escritura en alumnado de 2.º de Primaria en dos momentos de la vida escolar (2024)

Book title: Educación en transición: experiencias y propuestas para un mundo cambiante. Nahia Idoiaga, Noemi Serrano-Díaz and Eva Palasí (eds.). Publisher: Octaedro.

Eneko Agirre, Olatz Arbelaitz, Olatz Arregi, Gorka Azkune, Arantza Casillas, Inma Hernaez, Mikel Iruskieta, Elena Lazkano, Eva Navas, German Rigau, Roberto Santana, Aitor Soroa and Rabih Zbib

ENIA Chair in Artificial Intelligence and Language Technology (2024)

-

Maria Sierro, Begoña Altuna, Itziar Gonzalez-Dios

Automatic Detection and Labelling of Personal Data in Case Reports from the ECHR in Spanish: Evaluation of Two Different Annotation Approaches (2024)

Sierro, M., Altuna, B., & Gonzalez-Dios, I. (2024, March). Automatic Detection and Labelling of Personal Data in Case Reports from the ECHR in Spanish: Evaluation of Two Different Annotation Approaches. In Proceedings of the Workshop on Computational Approaches to Language Data Pseudonymization (CALD-pseudo 2024) (pp. 18-24).

Irune Ibarra, Asunción Martínez, Jose Maria Arriola

Análisis de las revisiones espontáneas a nivel de palabra para la mejora de la escritura a mano en segundo de Educación Primaria (2024)

Educación en la era digital. Propuestas innovadoras para los desafíos educativos del presente y del futuro. Tirant lo Blanch, chapter 3, pp. 65-77.

Nuria Lebeña, Arantza Casillas, Alicia Pérez

Temporal Name Entity Recognition and Relation Extraction in Clinical Electronic Health Records with Span-based Entity and Relation Transformer (2024)

ICBBB '24: Proceedings of the 2024 14th International Conference on Bioscience, Biochemistry and Bioinformatics; January 2024; pages 48–54

Oscar Sainz, Iker García-Ferrero, Rodrigo Agerri, Oier Lopez de Lacalle, German Rigau, Eneko Agirre

GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction (2024)

The Twelfth International Conference on Learning Representations

Suna Şeyma Uçar, Itziar Aldabe, Nora Aranberri, Ana Arruarte

Exploring Automatic Readability Assessment for Science Documents within a Multilingual Educational Context (2024)

Uçar, SŞ., Aldabe, I., Aranberri, N. et al. Exploring Automatic Readability Assessment for Science Documents within a Multilingual Educational Context. Int J Artif Intell Educ (2024). https://doi.org/10.1007/s40593-024-00393-2

Nora Aranberri, Uxoa Iñurrieta

When minoritized languages encounter MT: perceptions and expectations of the Basque community (2024)

Aranberri, N., & Iñurrieta, U. (2024). When minoritized languages encounter MT: perceptions and expectations of the Basque community. The Journal of Specialised Translation, (41), 179-205. Available at: https://www.jostrans.org/article/view/4718/4237

Gorka Azkune, Ander Salaberria, Eneko Agirre

Grounding spatial relations in text-only language models (2024)

Neural Networks. Volume 170, February 2024, Pages 215-226