Text Analysis

Natural language analysis tools are software modules that perform linguistic analysis on texts at different levels. They are essential components of any Natural Language Processing (NLP) software that analyzes text, and text mining software is typically built by combining basic linguistic modules into complex pipelines.
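The idea of combining basic modules into a pipeline can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `tokenize`, `split_sentences`, and `build_pipeline` names, the dictionary used as the shared document, and the toy regular-expression rules are hypothetical and do not reflect the actual HiTZ tools or their interfaces.

```python
import re
from typing import Any, Callable, Dict

# Hypothetical shared document: each module reads earlier layers and adds its own.
Document = Dict[str, Any]
Module = Callable[[Document], Document]


def tokenize(doc: Document) -> Document:
    """Toy tokenizer: splits the raw text into word and punctuation tokens."""
    doc["tokens"] = re.findall(r"\w+|[^\w\s]", doc["text"])
    return doc


def split_sentences(doc: Document) -> Document:
    """Toy sentence splitter: closes a sentence at '.', '!' or '?'."""
    sentences, current = [], []
    for token in doc["tokens"]:
        current.append(token)
        if token in {".", "!", "?"}:
            sentences.append(current)
            current = []
    if current:  # keep trailing tokens that lack final punctuation
        sentences.append(current)
    doc["sentences"] = sentences
    return doc


def build_pipeline(*modules: Module) -> Callable[[str], Document]:
    """Chains modules so that each one receives the previous module's output."""
    def run(text: str) -> Document:
        doc: Document = {"text": text}
        for module in modules:
            doc = module(doc)
        return doc
    return run


analyze = build_pipeline(tokenize, split_sentences)
result = analyze("HiTZ builds NLP tools. They cover many languages.")
print(len(result["tokens"]), "tokens in", len(result["sentences"]), "sentences")
```

Because every module shares the same input and output type, a further step (for example a tagger that reads the `tokens` layer) could be appended to `build_pipeline` without changing the existing modules.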

The HiTZ center has a long tradition of building analysis tools for many languages, ranging from basic linguistic processors such as tokenizers, Part-...

Demos

Demo of the English NLP pipeline

Paste in any English text to see which entities, events, and other annotations are added automatically. The output is represented in the NAF format.

Demo of the Spanish NLP pipeline

Paste in any Spanish text to see which entities and other annotations are added automatically. The output is represented in the NAF format.

Eustagger

Basque lemmatizer and morphosyntactic analyzer

Xuxen

Online Basque spelling corrector

Contracts

Publications

IXA pipeline: Efficient and Ready to Use Multilingual NLP tools. (2014)

Rodrigo Agerri, Josu Bermudez, German Rigau

LREC 2014: 3823-3828. ISBN 978-2-9517408-8-4

A scalable architecture for data-intensive natural language processing (2017)

Zuhaitz Beloki, Xabier Artola, Aitor Soroa

Natural Language Engineering, 1-23. doi:10.1017/S1351324917000092.

Big data for Natural Language Processing: A streaming approach (2015)

Rodrigo Agerri, Xabier Artola, Zuhaitz Beloki, German Rigau, Aitor Soroa

Knowledge-Based Systems, Vol. 79, pages 36-42. http://dx.doi.org/10.1016/j.knosys.2014.11.007

A stream computing approach towards scalable NLP (2014)

Xabier Artola, Zuhaitz Beloki, Aitor Soroa

Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14). Reykjavik, Iceland. ISBN: 978-2-9517408-8-4

Projects

Patents

MALTIXA

Resources
