publications
Itziar Gonzalez-Dios
Data statement of the Corpus of Basque Simplified Texts (2020)
Data Statements workshop
María Espinosa, Rodrigo Agerri, Roberto Centeno, Alvaro Rodrigo
DeepReading@SardiStance: Combining Textual, Social and Emotional Features. (2020)
Proceedings of the Seventh Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020). Winners of the SardiStance@Evalita2020 shared task
Rodrigo Agerri, German Rigau
Projecting Heterogeneous Annotations for Named Entity Recognition (2020)
In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020). Winner of the CAPITEL@IberLEF task on Spanish NER.
Xabier Soto, Olatz Perez-de-Viñaspre, Gorka Labaka, Maite Oronoz
Ixamed's submission description for WMT20 Biomedical shared task: benefits and limitations of using terminologies for domain adaptation (2020)
Proceedings of the Fifth Conference on Machine Translation, pp: 873--878.
Iker de la Iglesia, Mikel Martinez-Puente, Alexander Platas, Iria San Miguel, Aitziber Atutxa, Koldo Gojenola
MEDIA team at the CLEF-2020 Multilingual Information Extraction Task (2020)
Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum Thessaloniki, Greece, September 22-25, 2020.
Kepa Sarasola, Itziar Aldabe, Arantza Diaz de Ilarraza, Ainara Estarrona, Aritz Farwell, Inma Hernaez, Eva Navas; Reviewers: Annika Grützner-Zahn, Maria Giagkou; Editors: Maria Giagkou, Stelios Piperidis, Georg Rehm, Jane Dunne
Report on the Basque Language. European Language Equality (2020)
Deliverables of the Project ELE (European Language Equality). D1.4 Report on the Basque Language, https://european-language-equality.eu/deliverables/
Begoña Altuna
Análisis de estructuras temporales en euskera y creación de un corpus (2020)
Procesamiento del Lenguaje Natural, Revista no 64, marzo de 2020, pp. 131-134 URL: http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6206 ISSN: 1989-7553
Uxoa Inurrieta, Itziar Aduriz, Arantza Díaz de Ilarraza, Gorka Labaka, Kepa Sarasola
Learning about phraseology from corpora: A linguistically motivated approach for Multiword Expression identification. (2020)
PLoS ONE 15(8): e0237767. https://doi.org/10.1371/journal.pone.0237767
Ainara Estarrona, Izaskun Aldezabal, Arantza Díaz de Ilarraza
How the corpus-based Basque Verb Index lexicon was built (2020)
Language Resources and Evaluation. First Online 05 December 2018. DOI: https://doi.org/10.1007/s10579-018-9440-0. Springer Netherlands
Itziar Aduriz, Jose Mari Arriola, Xabier Artola, Zuhaitz Beloki, Nerea Ezeiza, Koldo Gojenola
Morfeus+: Word Parsing in Basque beyond Morphological Segmentation (2020)
WORD STRUCTURE 13.3, 283-315
Nora Aranberri
Can translationese features help users select an MT system for post-editing? (2020)
Revista Procesamiento del Lenguaje Natural, 64, 93-100.
Sara Santiso, Alicia Pérez, Arantza Casillas, Maite Oronoz
Neural negated entity recognition in Spanish electronic health records (2020)
Journal of Biomedical Informatics (JBI) https://doi.org/10.1016/j.jbi.2020.103419
Alberto Blanco, Alicia Pérez, Arantza Casillas
Extreme multi-label ICD classification: sensitivity to hospital service and time (2020)
IEEE Access, Volume 8, 183534-183545
Itziar Gonzalez-Dios, Kepa Bengoetxea, Amaia Aguirregoitia
LagunTest: A NLP Based Application to Enhance Reading Comprehension (2020)
1st Workshop on Tools and Resources to Empower People with REAding DIfficulties (READI2020), pages 63–69. ISBN: 979-10-95546-44-3 https://www.aclweb.org/anthology/2020.readi-1.10/ https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/READI2020book.pdf
Kepa Bengoetxea, Itziar Gonzalez-Dios, Amaia Aguirregoitia
AzterTest: Open source linguistic and stylistic analysis tool (2020)
Procesamiento del Lenguaje Natural, 64, 61-68. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6196
Alberto Blanco, Alicia Pérez, Arantza Casillas, Daniel Cobos
Extracting Cause of Death from Verbal Autopsy with Deep Learning interpretable methods (2020)
IEEE Journal of Biomedical and Health Informatics
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Applying the Closed World Assumption to SUMO-based FOL Ontologies for Effective Commonsense Reasoning (2020)
Frontiers in Artificial Intelligence and Applications. Giuseppe De Giacomo, Alejandro Catala, Bistra Dilkina, Michela Milano, Senén Barro, Alberto Bugarín, Jérôme Lang (eds.). Volume 325: ECAI 2020. Pages 585 - 592. IOS Press Ebooks
Kepa Sarasola, Iñaki Alegria, Olatz Perez de Viñaspre
Language Technology for Language Communities: An Overview based on Basque Experience 2020 (2020)
Symposiwm Academaidd Technolegau Iaith Cymru 2020-11-04 // Wales Academic Symposium on Language Technologies 2020-11-04
Itziar Aldabe, Josu Aztiria, Francho Beltrán, Myriam Bras, Klara Ceberio, Itziar Cortes, Jean-Baptiste Coyos, Benaset Dazeas, Louise Esher, Gorka Labaka, Igor Leturia, Kepa Sarasola, Aure Séguier, Jean Sibille
LINGUATEC: Development of cross-border cooperation and knowledge transfer in language technologies (2020)
Workshop "INTELE : INfraestructura de TEcnologías del LEnguaje" CLARIN DARIAH-EU. http://ixa2.si.ehu.eus/intele/?q=node/71
Camacho A., Iruskieta M., Latatu A., Lonbide P.
UEUren Online ikaskuntzarako eredu pedagogikoaren sorrera eta garapena (2020)
UZTARO 118, 5-38
Perez, N; Accuosto, P; Bravo, A; Quadres, M; Martinez-Garcia, E; Saggion, H; Rigau, G.
Cross-lingual semantic annotation of Biomedical literature: experiments in Spanish and English (2020)
Bioinformatics, 36, 6, 1872-1880. ISSN 1367-1880
Itziar Gonzalez-Dios, Javier Álvez, German Rigau
Towards modeling SUMO attributes through WordNet adjectives: a Case Study on Qualities. (2020)
Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 1–6. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf
Lima S., Pérez-Miguel N., Cuadros M. and Rigau G.
NUBes: A Corpus of Negation and Uncertainty in Spanish Clinical Texts. (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC'20). Marseille, France. 2020.
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
Dealing with dialectal variation in the construction of the Basque historical corpus (2020)
Proceedings of the 7th Workshop on NLP for similar languages, varieties and dialects (VarDial2020 at COLING 2020).
Bernardo Magnini, Begoña Altuna, Alberto Lavelli, Manuela Speranza, Roberto Zanoli
The E3C Project: Collection and Annotation of a Multilingual Corpus of Clinical Cases (2020)
In Johanna Monti, Felice Dell'Orletta and Fabio Tamburini (eds.), Proceedings of the Seventh Italian Conference on Computational Linguistics. Associazione Italiana di Linguistica Computazionale. Bologna, Italy, 2020.
Unai Atutxa, Mikel Iruskieta, Olatz Ansa
Laburpena eskolan: estrakzioaren eta abstrakzioaren arteko zubia eskolan (2020)
Hizkuntzaren eta Literaturaren Didaktika testuinguru eleaniztunetan: Hizkuntzaren eta Literaturaren Didaktikaren Nazioarteko XX. Kongresuko aktak. 57-66.
Eneko Agirre, Marianna Apidianaki, Ivan Vulić (Editors)
Proceedings of Deep Learning Inside Out (DeeLIO): The First Workshop on Knowledge Extraction and Integration for Deep Learning Architectures (2020)
In conjunction with EMNLP. Association for Computational Linguistics
Ainara Estarrona, Izaskun Etxeberria, Ricardo Etxepare, Manuel Padilla-Moyano, Ander Soraluze
Sintaktikoki etiketatutako euskarazko corpus historikoa eraikitzen (2020)
Fontes Linguae Vasconum 50 urte. Ekarpen berriak euskararen ikerketari. Nuevas aportaciones al estudio de la lengua vasca
Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre
Give your Text Representation Models some Love: the Case for Basque (2020)
Proceedings of LREC. Also available on arXiv: https://arxiv.org/pdf/2004.00033.pdf
C. Pradel, D. Sileo, A. Rodrigo, A. Peñas, E. Agirre.
Question Answering when Knowledge Bases are Incomplete? (2020)
Proceedings of Conference and Labs of the Evaluation Forum.
Piroska Lendvai, Sándor Darányi, Christian Geng, Moniek Kuijpers, Oier Lopez de Lacalle, Jean-Christophe Mensonides, Simone Rebora and Uwe Reichel
Detection of Reading Absorption in User-Generated Book Reviews: Resources Creation and Evaluation (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France
Oscar Sainz, Oier Lopez de Lacalle, Itziar Aldabe, Montse Maritxalar
Domain Adapted Distant Supervision for Pedagogically Motivated Relation Extraction (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France
Andrea Horbach, Itziar Aldabe, Marie Bexte, Oier Lopez de Lacalle and Montse Maritxalar
Linguistic Appropriateness and Pedagogic Usefulness of Reading Comprehension Questions (2020)
Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). Marseille, France
Javier Álvez, Itziar Gonzalez-Dios, German Rigau
Towards Word Sense Disambiguation by Reasoning (2020)
Vampire 2018 and Vampire 2019. The 5th and 6th Vampire Workshops. EPiC Series in Computing. Pages 19-29. ISSN: 2398-7340
Jon Ander Campos, Kyunghyun Cho, Arantxa Otegi, Aitor Soroa, Eneko Agirre, Gorka Azkune
Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning (2020)
Proceedings of the 28th International Conference on Computational Linguistics (COLING), pages 2561–2571. Outstanding Paper (Top 1%).
Arantxa Otegi, Jon Ander Campos, Gorka Azkune, Aitor Soroa, Eneko Agirre
Automatic Evaluation vs. User Preference in Neural Textual Question Answering over COVID-19 Scientific Literature (2020)
Proceedings of the 1st Workshop on NLP for COVID-19 (Part 2) at EMNLP 2020
Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre
DoQA - Accessing Domain-Specific FAQs via Conversational QA (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 7302–7314
Arantxa Otegi, Aitor Agirre, Jon Ander Campos, Aitor Soroa, Eneko Agirre
Conversational Question Answering in Low Resource Scenarios: A Dataset and Case Study for Basque (2020)
Proceedings of The 12th Language Resources and Evaluation Conference, pp. 429–435. European Language Resources Association. ISBN: 979-10-95546-34-4
Oier Lopez de Lacalle, Ander Salaberria, Aitor Soroa, Gorka Azkune and Eneko Agirre
Evaluating Multimodal Representations on Visual Semantic Textual Similarity (2020)
Proceedings of the Twenty-third European Conference on Artificial Intelligence, ECAI 2020, June 8-12, 2020, Santiago de Compostela, Spain
Gorka Urbizu, Ander Soraluze, Olatz Arregi
Sequence to Sequence Coreference Resolution (2020)
Proceedings of the 3rd Workshop on Computational Models of Reference, Anaphora and Coreference (CRAC 2020), pages 39–46, Barcelona, Spain (online), December 12, 2020.
Rachel Bawden, Giorgio Maria Di Nunzio, Cristian Grozea, Inigo Jauregi Unanue, Antonio Jimeno Yepes, Nancy Mah, David Martinez, Aurélie Névéol, Mariana Neves, Maite Oronoz, Olatz Perez-de-Viñaspre, Massimo Piccardi, Roland Roller, Amy Siu, Philippe Thomas, Federica Vezzani, Maika Vicente Navarro, Dina Wiemann and Lana Yeganova
Findings of the WMT 2020 Biomedical Translation Shared Task: Basque, Italian and Russian as New Additional Languages (2020)
Fifth Conference on Machine Translation (WMT20). Shared Task: Biomedical Translation Task
Xabier Soto, Dimitar Shterionov, Alberto Poncelas, Andy Way
Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp: 3898–3908.
Jan Deriu, Don Tuggener, Pius von Däniken, Jon Ander Campos, Alvaro Rodrigo, Thiziri Belkacem, Aitor Soroa, Eneko Agirre, Mark Cieliebak
Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems (2020)
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). (Pages 3971–3984). Honorable Mention Paper (Top 1%).
Juan J. Lastra-Díaz, Josu Goikoetxea, Mohamed Ali Hadj Taieb, Ana Garcia-Serrano, Mohamed Ben Aouicha, Eneko Agirre, David Sánchez
A large reproducible benchmark of ontology-based methods and word embeddings for word similarity (2020)
Information Systems. Online first.
Mikel Artetxe, Gorka Labaka, Eneko Agirre
Translation Artifacts in Cross-lingual Transfer Learning (2020)
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). (Pages 7674–7684).
Uxoa Iñurrieta
Identification and translation of verb+noun multiword expressions: a Spanish-Basque study (2020)
Procesamiento del Lenguaje Natural, 64, pp. 123-126.
Itziar Aduriz, Jose Mari Arriola
Testu-corpusen informazio morfosintaktikoaren etiketatze automatikoa hizkuntz ezagutzan oinarrituz: zenbait arazo, hainbat erronka (2020)
Fontes Linguae Vasconum 50 urte. Ekarpen berriak euskararen ikerketari / Nuevas aportaciones al estudio de la lengua vasca.
Alberto Blanco, Alicia Pérez, Arantza Casillas
Automatic Classification of Medical Records with Multi-label Classifiers and Similarity Match Coders (2020)
CEUR Workshop Proceedings, Vol 2696 - Working Notes of CLEF 2020 - Conference and Labs of the Evaluation Forum
Alberto Blanco, Olatz Perez de Viñaspre, Alicia Pérez, Arantza Casillas
Boosting ICD multi-label classification of health records with contextual embeddings and label-granularity (2020)
Computer Methods and Programs in Biomedicine, Volume 188, 105264
Santana, S and Pérez, A and Casillas, A
HapLap at eHealth-KD Challenge 2020 (2020)
Proceedings of the Iberian Languages Evaluation Forum co-located with 36th Conference of the Spanish Society for Natural Language Processing, IberLEF@SEPLN
Hormaetxe G., Iruskieta M.
Parekoen behaketarekin komunikazio-gaitasuna ebaluatzen: zer dute nahiago ikasleek, errubrika tradizionala ala bideo-behaketa (2020)
e-Hizpide 96
Mikel Artetxe, Gorka Labaka, Noe Casas, Eneko Agirre
Do all roads lead to Rome? Understanding the role of initialization in iterative back-translation (2020)
Knowledge-Based Systems, Volume 206 (online first). Pre-print https://arxiv.org/abs/2002.12867
Eneko Agirre
Cross-Lingual Word Embeddings (Book Review) (2020)
Computational Linguistics 46 (1), 245-248. (https://doi.org/10.1162/COLI_r_00372)
Jan Deriu, Katsiaryna Mlynchyk, Philippe Schläpfer, Alvaro Rodrigo, Dirk von Grünigen, Nicolas Kaiser, Kurt Stockinger, Eneko Agirre, Mark Cieliebak
A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 897-911.
Ivana Kvapilíková, Mikel Artetxe, Gorka Labaka, Eneko Agirre, Ondřej Bojar
Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Pages 255-262
Gorka Azkune, Aitor Almeida, Eneko Agirre
Cross-environment activity recognition using word embeddings for sensor and activity representation (2020)
Neurocomputing (available online 1 September 2020)
José Ramom Pichel, Pablo Gamallo, Marco Neves & Iñaki Alegria
Distância diacrónica automática entre variantes diatópicas do português e do espanhol (2020)
Linguamática, Vol. 12 N. 1, 117–126 ISSN: 1647–0818
Elena Zotova, Rodrigo Agerri, Manuel Nuñez and German Rigau
Multilingual Stance Detection in Tweets: The Catalonia Independence Corpus (2020)
Language Resources and Evaluation Conference (LREC 2020)
Rodrigo Agerri, German Rigau
Language independent sequence labelling for Opinion Target Extraction (2020)
International Joint Conference on Artificial Intelligence (IJCAI 2020)
Nora Aranberri
With or without you? Effects of using machine translation to write flash fiction in the foreign language (2020)
Proceedings of the 22nd Annual Conference of the European Association for Machine Translation, p. 165–174, Lisboa, Portugal, November 2020.
Adrián Nuñez-Marcos, Gorka Azkune, Eneko Agirre, Diego López-de-Ipiña, Ignacio Arganda-Carreras
Using External Knowledge to Improve Zero-shot Action Recognition in Egocentric Videos (2020)
International Conference on Image Analysis and Recognition (ICIAR)
Arantxa Otegi, Aitor Soroa, Eneko Agirre, Jon Ander Campos
Cómo gestionar la sobrecarga de información científica sobre COVID-19 (2020)
The Conversation. ISSN 2201-5639. https://theconversation.com/como-gestionar-la-sobrecarga-de-informacion-cientifica-sobre-covid-19-138651
Jose Mari Arriola, Josu Goikoetxea, Mikel Iruskieta
Hizkuntza-teknologiak hizkuntzen ikas-irakaskuntzan: zenbat aukera, hainbat erronka (2020)
e-Hizpide 95: 1-21
Thierry Declerck, Itziar Gonzalez-Dios, German Rigau (editors)
Proceedings of the LREC 2020 Workshop on Multimodal Wordnets (MMWN-2020) (2020)
European Language Resources Association (ELRA), Paris. https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf ISBN: 979-10-95546-41-2 EAN: 9791095546412
Jon Alkorta, Itziar Gonzalez-Dios
Exploring the Enrichment of Basque WordNet with a Sentiment Lexicon (2020)
Proceedings of the Workshop on Multimodal Wordnets (MMWN-2020), pages 20–24. ISBN: 979-10-95546-41-2 https://lrec2020.lrec-conf.org/media/proceedings/Workshops/Books/MMW2020book.pdf
Begoña Altuna, María Jesús Aranzabe, Arantza Díaz de Ilarraza
EusTimeML: A mark-up language for temporal information in Basque (2020)
Research in Corpus Linguistics 8: 86-104. ISSN 2243-4712. Asociación Española de Lingüística de Corpus (AELINCO) DOI 10.32714/ricl.08.01.06
Mikel Artetxe, Sebastian Ruder, Dani Yogatama
On the cross-lingual transferability of monolingual representations (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Mikel Artetxe, Sebastian Ruder, Dani Yogatama, Gorka Labaka, Eneko Agirre
A Call for More Rigor in Unsupervised Cross-lingual Learning (2020)
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Pablo Gamallo, José Ramom Pichel and Iñaki Alegria
Measuring Language Distance of Isolated European Languages (2020)
MDPI Information 2020, 11(4), 181 doi:10.3390/info11040181
Sara Santiso
Adverse Drug Reaction extraction on Electronic Health Records written in Spanish (2020)
Procesamiento del Lenguaje Natural http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/6203
Mikel Iruskieta, Amaia Arroyo-Sagasta, Abel Camacho, Montse Maritxalar
Teknologia, testuinguru digitala eta konpetentzia digitalak hezkuntzan (2020)
Euskonews 748. ISSN: 1139-3629. URL: http://www.euskonews.eus/zbk/748/teknologia-testuinguru-digitala-eta-konpetentzia-digitalak-hezkuntzan/ar-0748001002E/
Jose R. Pichel, Pablo Gamallo, Iñaki Alegria, Marco Neves
A Methodology to Measure the Diachronic Language Distance between Three Languages Based on Perplexity (2020)
Journal of Quantitative Linguistics. DOI 10.1080/09296174.2020.1732177
Rebecka Weegar, Alicia Pérez, Arantza Casillas, Maite Oronoz
Recent advances in Swedish and Spanish medical entity recognition in clinical texts using deep neural approaches (2020)
BMC Medical Informatics and Decision Making