MorphoDiTa (Morphological Dictionary and Tagger) is an open-source tool for morphological analysis of natural language texts. It performs morphological analysis, morphological generation, tagging, and tokenization, and is distributed as a standalone tool or a library, along with trained linguistic models. For Czech, MorphoDiTa achieves state-of-the-art results with a throughput of around 10-200K words per second. MorphoDiTa is free software under the LGPL license; the linguistic models are free for non-commercial use and distributed under the CC BY-NC-SA license, although for some models the original data used to create the model may impose additional licensing conditions.
NameTag is an open-source tool for named entity recognition (NER). It identifies proper names in text and classifies them into predefined categories, such as names of persons, locations, and organizations. NameTag is distributed as a standalone tool or a library, along with trained linguistic models. For Czech, NameTag achieves state-of-the-art performance (Straková et al. 2013). NameTag is free software under the LGPL license; the linguistic models are free for non-commercial use and distributed under the CC BY-NC-SA license, although for some models the original data used to create the model may impose additional licensing conditions.
UDPipe 2 is a POS tagger, lemmatizer and dependency parser.
Compared to UDPipe 1:
- UDPipe 2 is Python-only and tested only on Linux,
- UDPipe 2 is meant as a research tool, not as a user-friendly UDPipe 1 replacement,
- UDPipe 2 achieves much better accuracy, but requires a GPU to run at a reasonable speed,
- UDPipe 2 does not perform tokenization by itself – it uses UDPipe 1 for that.
UDPipe 2 is available in the udpipe-2 branch of the UDPipe repository at https://github.com/ufal/udpipe/tree/udpipe-2. It is free software under the Mozilla Public License 2.0 (http://www.mozilla.org/MPL/2.0/), and the models are free for non-commercial use and distributed under the CC BY-NC-SA license (http://creativecommons.org/licenses/by-nc-sa/4.0/), although for some models the original data used to create the model may impose additional licensing conditions.
UDPipe 2 is also available as a REST service running at https://lindat.mff.cuni.cz/services/udpipe. The https://github.com/ufal/udpipe/blob/udpipe-2/udpipe2_client.py script can be used to interact with it.
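For readers who prefer to call the REST service directly, the following is a minimal sketch using only the Python standard library. Only the base URL comes from the text above. The `/api/process` path, the form parameters (`data`, `model`, `tokenizer`, `tagger`, `parser`), and the `result` field of the JSON response are assumptions about the service's API and should be checked against its documentation; the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the text above.
SERVICE_URL = "https://lindat.mff.cuni.cz/services/udpipe"


def build_request(text, model=None):
    """Build a POST request for the (assumed) /api/process endpoint."""
    # Assumption: empty-valued "tokenizer", "tagger" and "parser" fields
    # enable the corresponding pipeline steps with default options.
    params = {"data": text, "tokenizer": "", "tagger": "", "parser": ""}
    if model is not None:
        # Assumption: "model" selects a model; omitting it uses the default.
        params["model"] = model
    body = urllib.parse.urlencode(params).encode("utf-8")
    # Supplying a body makes urllib send the request as a POST.
    return urllib.request.Request(SERVICE_URL + "/api/process", data=body)


def extract_result(response_bytes):
    """Return the annotated text from a JSON response body."""
    # Assumption: the response is a JSON object with a "result" field.
    payload = json.loads(response_bytes.decode("utf-8"))
    return payload["result"]


def process(text, model=None, timeout=30):
    """Send text to the service and return the annotated output."""
    request = build_request(text, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_result(response.read())
```

Calling `process("Some text to annotate.")` would then send the text to the service over the network and return the annotated output as a string. The udpipe2_client.py script mentioned above remains the client provided by the project; this sketch only illustrates the general shape of such a request.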