A representative corpus of contemporary written Czech comprising 100 million words (MW). It was created to represent the printed language of 2010–2014 and covers a wide range of text types (fiction, professional literature, newspapers, etc.). The corpus is lemmatized and morphologically and syntactically annotated by a combination of stochastic and rule-based methods. It is provided in the (semi-XML) vertical format used as input to the Manatee query engine. The data thus correspond to the corpus available via the KonText query interface to registered users of the CNC, with one important exception: they are shuffled, i.e. divided into blocks of at most 100 words (respecting sentence boundaries), with the order of the blocks randomized within each document.
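The shuffling described above can be illustrated with a minimal Python sketch. It is not the procedure actually used by the corpus authors. It assumes that a document is already available as a list of sentences, each a list of tokens, and that blocks are filled greedily. The names `make_blocks` and `shuffle_document` are hypothetical, and a single sentence longer than the limit is simply kept as its own oversized block, which is also an assumption.

```python
import random
from typing import List, Optional

Sentence = List[str]   # one token per item
Block = List[Sentence]  # consecutive whole sentences


def make_blocks(sentences: List[Sentence], max_words: int = 100) -> List[Block]:
    """Greedily group whole sentences into blocks of at most max_words tokens.

    Assumption: a sentence longer than max_words becomes a block by itself.
    """
    blocks: List[Block] = []
    current: Block = []
    count = 0
    for sent in sentences:
        # Close the current block if adding this sentence would exceed the limit.
        if current and count + len(sent) > max_words:
            blocks.append(current)
            current, count = [], 0
        current.append(sent)
        count += len(sent)
    if current:
        blocks.append(current)
    return blocks


def shuffle_document(
    sentences: List[Sentence], max_words: int = 100, seed: Optional[int] = None
) -> List[Block]:
    """Return the document's blocks in random order (sentences stay intact)."""
    blocks = make_blocks(sentences, max_words)
    rng = random.Random(seed)  # a seed makes the shuffle reproducible
    rng.shuffle(blocks)
    return blocks
```

Because only whole blocks are reordered, word order inside each sentence and sentence order inside each block are preserved, while the original text of a document cannot be read continuously.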
The Thesaurus linguae Latinae is the first comprehensive dictionary of ancient Latin:
• it is compiled on the basis of all Latin texts surviving from antiquity (until AD 600), both literary and non-literary
• for less common words it cites every attestation, for the rest (those marked with an asterisk) an instructive and representative sample
• it records all meanings (including technical usages) and all constructions
• it documents peculiarities of inflection, spelling, and prosody
• it supplies information about the etymology of the Latin words and their survival in the Romance languages, contributed by recognised authorities in the fields of Indo-European and Romance studies
• it collects the comments of ancient sources on the word in question
The Thesaurus therefore offers for every Latin word a comprehensive, richly documented picture of its possibilities and history – not only for Latin scholars, but also for scholars of the various branches of ancient studies and for related disciplines.