EVALD 2.0 for Foreigners is a software for automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech.
EVALD 3.0 for Foreigners is a software for automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech.
EVALD 4.0 for Beginners is a software that serves for automatic evaluation of Czech texts written by non-native speakers of Czech – language beginners.
EVALD 4.0 for Foreigners is a software for automatic evaluation of surface coherence (cohesion) in Czech texts written by non-native speakers of Czech.
PAWS is a multi-lingual parallel treebank with coreference annotation. It consists of English texts from the Wall Street Journal translated into Czech, Russian and Polish. In addition, the texts are syntactically parsed and word-aligned. PAWS is based on PCEDT 2.0 and continues the tradition of multilingual treebanks with coreference annotation. PAWS offers linguistic material that can be further leveraged in cross-lingual studies, especially on coreference.
Prague Czech-English Dependency Treebank - Russian translation (PCEDT-R) is a project of translating a subset of Prague Czech-English Dependency Treebank 2.0 (PCEDT 2.0) to Russian and linguistically annotating the Russian translations with emphasis on coreference and cross-lingual alignment of coreferential expressions. Cross-lingual comparison of coreference means is currently the purpose that drives development of this corpus.
The current version 0.5 is a preliminary version, which contains (+ denotes new features):
* complete PCEDT 2.0 documents "wsj_1900"-"wsj_1949"
* Czech-English word alignment of coreferential expressions annotated manually mainly on the t-layer
+ Russian translations of the original English sentences
+ automatic tokenization, part-of-speech tagging and morphological analysis for Russian
+ automatic word alignment between all Czech and Russian words
+ manual alignment between Russian and the other two languages on possessive pronouns
The Prague Czech-English Dependency Treebank 2.0 Coref (PCEDT 2.0 Coref) is a parallel treebank building upon the original PCEDT 2.0 release and enriching it with the extended manual annotation of coreference, as well as with an improved automatic annotation of the coreferential expression alignment.
A richly annotated and genre-diversified language resource, The Prague Dependency Treebank – Consolidated 1.0 (PDT-C 1.0, or PDT-C in short in the sequel) is a consolidated release of the existing PDT-corpora of Czech data, uniformly annotated using the standard PDT scheme. PDT-corpora included in PDT-C: Prague Dependency Treebank (the original PDT contents, written newspaper and journal texts from three genres); Czech part of Prague Czech-English Dependency Treebank (translated financial texts, from English), Prague Dependency Treebank of Spoken Czech (spoken data, including audio and transcripts and multiple speech reconstruction annotation); PDT-Faust (user-generated texts). The difference from the separately published original treebanks can be briefly described as follows: it is published in one package, to allow easier data handling for all the datasets; the data is enhanced with a manual linguistic annotation at the morphological layer and new version of morphological dictionary is enclosed; a common valency lexicon for all four original parts is enclosed. Documentation provides two browsing and editing desktop tools (TrEd and MEd) and the corpus is also available online for searching using PML-TQ.