HMM-based tagger for Latvian texts. The tagger uses information from the SemTi-Kamols morphological analyser; the tagset is derived from the MULTEXT-East project.
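For readers unfamiliar with HMM tagging, the following is a minimal sketch of Viterbi decoding over a bigram HMM. It is not the tagger's actual implementation: the function name `viterbi`, the two toy tags `N` and `V`, and all probabilities are assumptions made up for illustration (real MULTEXT-East tags are far more fine-grained).

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t] = (log-prob of the best path ending in tag t at position i, previous tag)
    best = [{t: (lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)), None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            score, prev = max(
                (best[i - 1][p][0] + lp(trans_p[p].get(t, 0)), p) for p in tags
            )
            column[t] = (score + lp(emit_p[t].get(words[i], 0)), prev)
        best.append(column)

    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

# Toy model (hypothetical numbers, not taken from the tagger).
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"suns": 0.9, "rej": 0.1}, "V": {"suns": 0.1, "rej": 0.9}}

print(viterbi(["suns", "rej"], tags, start_p, trans_p, emit_p))  # ['N', 'V']
```

In a tagger backed by a morphological analyser, the candidate tags for each word would typically be restricted to those the analyser proposes; the sketch above omits that step.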
A standards-compliant RESTful web service based on the lexicon of the Dictionary of the Standard Latvian Language. The morphological database contains 57 613 lemmas (1 332 889 word forms).
Diachronic Corpus of Early Written Latvian Texts (16th-18th c.). More than 1 million running words (work is ongoing). The main data are ecclesiastical texts, secular texts (laws, fiction) and some of the first bilingual (Latvian-German) dictionaries. A KWIC-based concordancer, an inverse vocabulary, frequency lists and word lists are provided. Some source facsimiles are available.
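As an illustration of what such corpus tools compute, here is a minimal sketch of a keyword-in-context (KWIC) lookup, a frequency list, and an inverse vocabulary. It is not the corpus's own software: the function names and the sample tokens are invented for this example, and "inverse vocabulary" is assumed here to mean a word list sorted by word endings (reversed spelling).

```python
from collections import Counter

def kwic(tokens, keyword, width=3):
    """Return (left context, keyword, right context) for each case-insensitive match."""
    target = keyword.lower()
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

def frequency_list(tokens):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokens).most_common()

def inverse_vocabulary(tokens):
    """Return distinct words sorted by their reversed spelling (assumed definition)."""
    return sorted(set(tokens), key=lambda w: w[::-1])

# Invented sample tokens, not taken from the corpus.
tokens = "tas kungs ir mans gans un tas kungs ir labs".split()

for left, key, right in kwic(tokens, "kungs", width=2):
    print(f"{left:>10} [{key}] {right}")
print(inverse_vocabulary(tokens))
```

Sorting by reversed spelling groups words with the same ending together, which is what makes such a list useful for studying inflection and derivation in historical texts.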