The database contains audio and video material related to traditional culture: songs, folktales, legends, life stories and various collective or individual folklore-related performances. The content has been either specifically contributed to the Archives of Latvian Folklore or collected by its staff members.
Morphologically tagged and lemmatized text sample (> 16 000 running words), publicly available via the Bonito interface and at http://www.korpuss.lv/uzzinas/plans_ledus.pdf
Latvian fairytales and legends collected by Latvian folklorist Pēteris Šmits, published 1927-1938 (15 volumes). It is the largest published collection of Latvian folktales and legends.
Its aim is to digitise the collections of the National Library of Latvia and other similar organisations and to make them accessible on the Internet. The creation of the digital library lays the foundation for uniform principles of processing and storing the digitised materials and of ensuring access to them.
HMM-based tagger of Latvian texts. The tagger uses information from the SemTi-Kamols morphological analyser; the tagset is derived from the MULTEXT-East project.
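As a rough illustration of how an HMM-based tagger chooses tags, the sketch below runs Viterbi decoding over a toy bigram model. The two tags, the probabilities and the function name `viterbi` are all invented for the example; they are not the tagger's actual model, its MULTEXT-East-derived tagset, or its interface.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a bigram HMM."""
    if not words:
        return []

    def lp(p):
        # Log-probability with a small floor so unseen events do not give log(0).
        return math.log(max(p, floor))

    # best[i][t]: log-probability of the best path that ends in tag t at position i
    best = [{t: lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)) for t in tags}]
    back = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + lp(emit_p[t].get(words[i], 0))
            back[i][t] = prev

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model (hypothetical numbers): N = noun, V = verb.
TAGS = ["N", "V"]
START = {"N": 0.8, "V": 0.2}
TRANS = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
EMIT = {"N": {"suns": 0.9, "rej": 0.1}, "V": {"suns": 0.1, "rej": 0.9}}

print(viterbi(["suns", "rej"], TAGS, START, TRANS, EMIT))  # ['N', 'V']
```

In a real tagger, the set of candidate tags per word would presumably be narrowed by the morphological analyser rather than taken from a fixed list as here.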
The life story is a source for qualitative research. The most basic component of the collection is the written or recorded document of personal history, a short or lengthy story of a person's life and observations. The National Oral History Project (Nacionālās mutvārdu vēstures projekts (NMV)) has been financed by the Science Council of Latvia (Latvijas Zinātnes Padome) since 1992. Its approach is multidisciplinary, employing sociological and philosophical theories in particular.
The dictionary is based on the Lithuanian-Latvian dictionary (1995) by Jons Balkevičs, Laimute Balode, Apolonija Bojāte and Valters Subatnieks, ed. by Alberts Sarkanis. It contains ca. 60 00 lexical entries; the inclusion of morphological analysis tools allows searching by word form.
A standards-compliant RESTful web service based on the lexicon of the Dictionary of the Standard Latvian Language. The morphological database contains 57 613 lemmas (1 332 889 word forms).
Diachronic Corpus of Early Written Latvian Texts (16th-18th c.). > 1 million running words (work is ongoing). The main data are ecclesiastical texts, secular texts (laws, fiction) and some of the first bilingual (Latvian-German) dictionaries. A KWIC-based concordancer, an inverse vocabulary, frequency lists and word lists are provided. Some source facsimiles are available.
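To make the listed tools concrete, here is a minimal sketch of a KWIC (keyword in context) lookup, an inverse vocabulary (word forms sorted by their reversed spelling, so that shared endings group together) and a frequency list. The function names and the toy token list are assumptions made for the example and do not reflect the corpus's actual software or data.

```python
from collections import Counter


def kwic(tokens, keyword, width=3):
    """Return (left context, hit, right context) for each occurrence of keyword."""
    rows = []
    key = keyword.lower()
    for i, tok in enumerate(tokens):
        if tok.lower() == key:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


def inverse_vocabulary(tokens):
    """Return distinct lower-cased word forms sorted by reversed spelling."""
    return sorted({t.lower() for t in tokens}, key=lambda w: w[::-1])


def frequency_list(tokens):
    """Return (word form, count) pairs, most frequent first."""
    return Counter(t.lower() for t in tokens).most_common()


# Toy token list, for illustration only.
tokens = "tas kungs ir mans gans un tas kungs ir labs".split()

for left, hit, right in kwic(tokens, "kungs", width=2):
    print(f"{left:>10} | {hit} | {right}")
```

A production concordancer would normally run over an indexed corpus rather than a token list held in memory; the list-based version is only meant to show what each output contains.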