Syntactic (including deep-syntactic, i.e. tectogrammatical) annotation of noisy user-generated sentences. The annotation was made on the Czech-English and English-Czech Faust Dev/Test sets.
The English data comprise manual annotations of English reference translations of Czech source texts. These texts were translated independently by two translators. After some necessary cleaning, 1000 segments were randomly selected for manual annotation. Both reference translations were annotated, giving 2000 annotated segments in total.
The Czech data comprise manual annotations of Czech reference translations of English source texts. These texts were translated independently by three translators. After some necessary cleaning, 1000 segments were randomly selected for manual annotation. All three reference translations were annotated, giving 3000 annotated segments in total.
Faust is part of PDT-C 1.0 (http://hdl.handle.net/11234/1-3185).
The SynSemClass synonym verb lexicon version 4.0 investigates, with respect to contextually-based verb synonymy, semantic ‘equivalence’ of Czech, English, and German verb senses and their valency behavior in parallel Czech-English and German-English language resources. SynSemClass 4.0 is a multilingual event-type ontology based on classes of synonymous verb senses, complemented with semantic roles and links to existing semantic lexicons. Version 4.0 not only adds further classes; some classes have also been merged in the context of the content hierarchy. Compared to older versions of the lexicon, definitions of classes and definitions of roles are new.
Czech lexicon entries are linked to PDT-Vallex (http://hdl.handle.net/11858/00-097C-0000-0023-4338-F), Vallex (http://hdl.handle.net/11234/1-3524), and CzEngVallex (http://hdl.handle.net/11234/1-1512). The English lexicon entries are linked to EngVallex (http://hdl.handle.net/11858/00-097C-0000-0023-4337-2), CzEngVallex (http://hdl.handle.net/11234/1-1512), FrameNet (https://framenet.icsi.berkeley.edu/fndrupal/), VerbNet (https://uvi.colorado.edu/ and http://verbs.colorado.edu/verbnet/index.html), PropBank (http://propbank.github.io/), Ontonotes (http://clear.colorado.edu/compsem/index.php?page=lexicalresources&sub=ontonotes), and English Wordnet (https://wordnet.princeton.edu/). The German lexicon entries are linked to Woxikon (https://synonyme.woxikon.de), E-VALBU (https://grammis.ids-mannheim.de/verbvalenz), and GUP (http://alanakbik.github.io/multilingual.html; https://github.com/UniversalDependencies/UD_German-GSD).
The SynSemClass synonym verb lexicon version 5.0 is a multilingual resource that extends previous editions of this event-type ontology with a new language, Spanish. The existing languages, English, Czech, and German, are substantially extended with a larger number of classes. The SSC 5.0 data also contain lists (in a separate removed_cms.zip file) of class members that were originally (pre-)proposed but later rejected. Entries in all languages are organized into classes and linked to other lexical sources. In addition to the existing links, links to Spanish sources have been added.
The Spanish entries are linked to
ADESSE (http://adesse.uvigo.es/),
Spanish SenSem (http://grial.edu.es/sensem/lexico?idioma=en),
Spanish WordNet (https://adimen.si.ehu.es/cgi-bin/wei/public/wei.consult.perl),
AnCora (https://clic.ub.edu/corpus/en/ancoraverb_es), and
Spanish FrameNet (http://sfn.spanishfn.org/SFNreports.php).
The English entries are linked to
EngVallex (http://hdl.handle.net/11858/00-097C-0000-0023-4337-2),
CzEngVallex (http://hdl.handle.net/11234/1-1512),
FrameNet (https://framenet.icsi.berkeley.edu/),
VerbNet (https://uvi.colorado.edu/ and http://verbs.colorado.edu/verbnet/index.html),
PropBank (http://propbank.github.io/),
Ontonotes (http://clear.colorado.edu/compsem/index.php?page=lexicalresources&sub=ontonotes), and
English Wordnet (https://wordnet.princeton.edu/).
The Czech entries are linked to
PDT-Vallex (http://hdl.handle.net/11858/00-097C-0000-0023-4338-F),
Vallex (http://hdl.handle.net/11234/1-3524), and
CzEngVallex (http://hdl.handle.net/11234/1-1512).
The German entries are linked to
Woxikon (https://synonyme.woxikon.de),
E-VALBU (https://grammis.ids-mannheim.de/verbvalenz), and
GUP (http://alanakbik.github.io/multilingual.html and https://github.com/UniversalDependencies/UD_German-GSD).
Ministry of Education, Youth and Sports of the Czech Republic@@LM2010013@@LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data@@nationalFunds
Ministry of Education, Youth and Sports of the Czech Republic@@LM2015071@@LINDAT/CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data@@nationalFunds