MBT is a memory-based tagger-generator and tagger in one. The tagger-generator part can generate a sequence tagger on the basis of a training set of tagged sequences; the tagger part can tag new sequences. MBT can, for instance, be used to generate part-of-speech taggers or chunkers for natural language processing.
A tool for comparing terminological vocabularies with textual corpora. It lets users check whether and where terms from reference vocabularies occur in a corpus.
Metod Zavoral, Abbot of the Strahov Monastery, in his study, in a segment from Československý filmový týdeník (Czechoslovak Film Weekly Newsreel) 1932, issue no. 36.
All existing Middle Frisian texts are contained in the Middle Frisian corpus. The texts are tagged and lemmatised; spelling variants have been brought together.
Migrant Stories is a corpus of 1017 short biographical narratives of migrants, supplemented with metadata such as the countries of origin and destination, the migrant's gender, and the GDP per capita of the respective countries. The corpus was compiled as teaching material for data analysis.
Footage of actress Míla Spazierová-Hezká at the Secondary School of Decorative Arts shown with her own portrait in a segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1942, issue no. 28.