Search Results
Creator:
Droganova, Kira; Ponomareva, Maria; Smurov, Ivan; and Shavrina, Tatiana
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
linguistic data, gapping, and ellipsis
Language:
Russian
Description:
A test set that contains manually annotated sentences with gapping.
The test set was compiled from SynTagRus (v. 2015), the dependency treebank for Russian that provides comprehensive, manually corrected morphological and syntactic annotation.
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
Creator:
Popel, Martin
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
parallel corpus
Language:
Czech and English
Description:
CzEng is a sentence-parallel Czech-English corpus compiled at the Institute of Formal and Applied Linguistics (ÚFAL). While the full CzEng 2.0 is freely available for non-commercial research purposes from the project website (https://ufal.mff.cuni.cz/czeng), this release contains only the original monolingual parts of news text (csmono 53M and enmono 79M sentences) with automatic (synthetic) translations by CUBBITT.
See the attached README for additional details such as the file format.
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Subject:
monolingual corpus, annotated corpus, and POS annotation
Language:
Hungarian
Description:
written, monolingual, general, manually POS annotated reference corpus; 1,247,546 tokens; MSD tagset, XML (TEIxLite) files
Rights:
Not specified
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Subject:
monolingual corpus, annotated corpus, and POS annotation
Language:
Hungarian
Description:
written, monolingual, general, manually POS annotated reference corpus; 1,459,288 tokens; MSD tagset, XML (TEI P4) files
Rights:
Not specified
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Language:
Hungarian
Description:
82,000 sentences with shallow syntactic annotation (NP-level).
Rights:
Not specified
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Language:
Hungarian
Description:
82,000 sentences with full syntactic annotation.
Rights:
Not specified
Publisher:
Max Planck Institute for Evolutionary Anthropology
Type:
corpus
Description:
Documentation of the Taa project (DoBeS project)
Rights:
Code of conduct
Creator:
Sloot, Ko van der; Daelemans, Walter; Bosch, Antal van den; Zavrel, Jakub; Canisius, Sander; and Buchholz, Sabine
Publisher:
ILK, Tilburg University
Type:
toolService
Subject:
dependency parser
Language:
Dutch
Description:
An integrated tokenizer, tagger-lemmatizer, morphological analyzer, and dependency parser for Dutch
Rights:
Not specified
Creator:
Piasecki, Maciej; Godlewski, Grzegorz; Radziszewski, Adam; Broda, Bartosz; and Wardyński, Adam
Publisher:
Institute of Computer Science, Polish Academy of Sciences and Institute of Applied Informatics, Wrocław University of Technology
Type:
toolService
Subject:
morphosyntactic tagger
Description:
Morphosyntactic tagger that works with the IPI PAN Corpus tagset.
Rights:
Not specified
Type:
corpus
Language:
Swedish
Description:
Approx. 85,000 words (85 kW), with functional (traditional) syntactic roles (in TEI/XCES XML format)
Rights:
Not specified