The SynSemClass synonym verb lexicon version 4.0 investigates, with respect to contextually-based verb synonymy, semantic ‘equivalence’ of Czech, English, and German verb senses and their valency behavior in parallel Czech-English and German-English language resources. SynSemClass 4.0 is a multilingual event-type ontology based on classes of synonymous verb senses, complemented with semantic roles and links to existing semantic lexicons. The version 4.0 is not only enriched by an additional number of classes but in the context of content hierarchy, some classes have been merged. Compared to the older versions of the lexicon, the novelty is the definitions of classes and the definitions of roles.
Czech lexicon entries are linked to PDT-Vallex (http://hdl.handle.net/11858/00-097C-0000-0023-4338-F), Vallex (http://hdl.handle.net/11234/1-3524), and CzEngVallex (http://hdl.handle.net/11234/1-1512). The English lexicon entries are linked to EngVallex (http://hdl.handle.net/11858/00-097C-0000-0023-4337-2), CzEngVallex (http://hdl.handle.net/11234/1-1512), FrameNet (https://framenet.icsi.berkeley.edu/fndrupal/), VerbNet (https://uvi.colorado.edu/ and http://verbs.colorado.edu/verbnet/index.html), PropBank (http://propbank.github.io/), Ontonotes (http://clear.colorado.edu/compsem/index.php?page=lexicalresources&sub=ontonotes), and English Wordnet (https://wordnet.princeton.edu/). The German lexicon entries are linked to Woxikon (https://synonyme.woxikon.de), E-VALBU (https://grammis.ids-mannheim.de/verbvalenz), and GUP (http://alanakbik.github.io/multilingual.html; https://github.com/UniversalDependencies/UD_German-GSD).
The SynSemClass synonym verb lexicon version 5.0 is a multilingual resource that enriches previous editions of this event-type ontology with a new language, Spanish. The existing languages, English, Czech and German, are further substantially extended by a larger number of classes. SSC 5.0 data also contain lists (in a separate removed_cms.zip file) with originally (pre-)proposed but later rejected class members. All languages are organized into classes and have links to other lexical sources. In addition to the existing links, links to Spanish sources have been added.
The Spanish entries are linked to
ADESSE (http://adesse.uvigo.es/),
Spanish SenSem (http://grial.edu.es/sensem/lexico?idioma=en),
Spanish WordNet (https://adimen.si.ehu.es/cgi-bin/wei/public/wei.consult.perl),
AnCora (https://clic.ub.edu/corpus/en/ancoraverb_es), and
Spanish FrameNet (http://sfn.spanishfn.org/SFNreports.php).
The English entries are linked to
EngVallex (http://hdl.handle.net/11858/00-097C-0000-0023-4337-2),
CzEngVallex (http://hdl.handle.net/11234/1-1512),
FrameNet (https://framenet.icsi.berkeley.edu/)
VerbNet (https://uvi.colorado.edu/ and http://verbs.colorado.edu/verbnet/index.html),
PropBank (http://propbank.github.io/),
Ontonotes (http://clear.colorado.edu/compsem/index.php?page=lexicalresources&sub=ontonotes), and
English Wordnet (https://wordnet.princeton.edu/).
Czech entries are linked to
PDT-Vallex (http://hdl.handle.net/11858/00-097C-0000-0023-4338-F),
Vallex (http://hdl.handle.net/11234/1-3524), and
CzEngVallex (http://hdl.handle.net/11234/1-1512).
The German entries are linked to
Woxikon (https://synonyme.woxikon.de),
E-VALBU (https://grammis.ids-mannheim.de/verbvalenz), and
GUP (http://alanakbik.github.io/multilingual.html and https://github.com/UniversalDependencies/UD_German-GSD).
The SynSemClass Search Tool provides a web search tool for the SynSemClass 5.0 ontology. It includes several search options and criteria for building complex queries. The search results are rendered in a clear and user-friendly interactive representation.
A collection of pointers to teaching and learning materials on linguistics and linguistic tools, including quick starts, how-tos, technical documentation, short teaching modules (2h), and full courses. This resource is collaboratively built by its users.
Test data for the WMT 2017 Automatic post-editing task (the same used for the Sentence-level Quality Estimation task). They consist in German-English triplets (source and target) belonging to the pharmacological domain and already tokenized. Test set contains 2,000 pairs. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Test data for the WMT 2017 Automatic post-editing task (the same used for the Sentence-level Quality Estimation task). They consist in 2,000 English-German pairs (source and target) belonging to the IT domain and already tokenized. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Test data for the WMT 2018 Automatic post-editing task. They consist in English-German pairs (source and target) belonging to the information technology domain and already tokenized. Test set contains 1,023 pairs. A neural machine translation system has been used to generate the target segments. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Test data for the WMT 2018 Automatic post-editing task. They consist in English-German pairs (source and target) belonging to the information technology domain and already tokenized. Test set contains 2,000 pairs. A phrase-based machine translation system has been used to generate the target segments. This test set is sampled from the same dataset used for the 2016 and 2017 APE shared task editions. All data is provided by the EU project QT21 (http://www.qt21.eu/).
TextGrid has purchased the Zeno.org online library (literary, historical, scientific, ... texts) and successively converts it to TEI. TextGrid hat die Online-Bibliothek von Zeno.org (literarische, naturwissenschaftliche, historische, ... Texte) erworben und konvertiert diese sukzessive in ein gültiges TEI-Format.
i.a. collection of old herbal books, old cookery books and texts on the history of German language in print media; u.a. eine Sammlung von alten Kräuterbüchern, alten Kochbüchern und Texten zur Geschichte der deutschen Pressesprache