Format: application/xml / Original context has metadata only: true / Rights: Not specified - LINDAT/CLARIAH-CZ Catalog Search Results

1. Alpino Treebank

Publisher:: Center for Language and Cognition
Format:: application/xml
Type:: corpus
Language:: Dutch
Description:: A database of 7,000 syntactically analyzed Dutch sentences.
Rights:: Not specified

2. Bilingual English-Lithuanian, Lithuanian-English, Czech-Lithuanian, Lithuanian-Czech corpora

Publisher:: Center of Computational Linguistics, Vytautas Magnus University
Format:: application/xml
Type:: corpus
Language:: Czech, English, and Lithuanian
Description:: A collection of parallel corpora: English-Lithuanian (2 million words), Lithuanian-English (0.06 million words), Czech-Lithuanian (0.8 million words), and Lithuanian-Czech (0.02 million words). All corpora are searchable online through a single interface at http://donelaitis.vdu.lt/main_en.php?id=4&nr=1_2. The corpus is still being updated with new texts.
Rights:: Not specified

3. Concise dictionary of Latvian

Publisher:: Institute of Mathematics and Computer Science, University of Latvia
Format:: application/xml
Type:: lexicalConceptualResource
Language:: Latvian
Description:: More than 25,000 entries.
Rights:: Not specified

4. Hungarian Historical Corpus

Publisher:: Academy of Sciences
Format:: application/xml
Type:: corpus
Language:: Hungarian
Description:: Containing 27 million running words, the Hungarian Historical Corpus provides a valuable basis for research on the history of Hungarian words from the second half of the 18th century to 2000.
Rights:: Not specified

5. Hungarian National Corpus

Publisher:: Academy of Sciences
Format:: application/xml
Type:: corpus
Subject:: synchronic corpus
Language:: Hungarian
Description:: Written, general, synchronic reference corpus; 190 million tokens; POS-annotated XML
Rights:: Not specified

6. Szeged Corpus 1.0

Publisher:: Department of Informatics, Human Language Technology Group, University of Szeged
Format:: application/xml
Type:: corpus
Subject:: monolingual corpus, annotated corpus, and POS annotation
Language:: Hungarian
Description:: Written, monolingual, general, manually POS-annotated reference corpus; 1,247,546 tokens; MSD tagset; XML (TEIxLite) files
Rights:: Not specified

7. Szeged Corpus 2.0

Publisher:: Department of Informatics, Human Language Technology Group, University of Szeged
Format:: application/xml
Type:: corpus
Subject:: monolingual corpus, annotated corpus, and POS annotation
Language:: Hungarian
Description:: Written, monolingual, general, manually POS-annotated reference corpus; 1,459,288 tokens; MSD tagset; XML (TEI P4) files
Rights:: Not specified

8. Szeged Treebank 1.0

Publisher:: Department of Informatics, Human Language Technology Group, University of Szeged
Format:: application/xml
Type:: corpus
Language:: Hungarian
Description:: 82,000 sentences with shallow syntactic annotation (NP-level).
Rights:: Not specified

9. Szeged Treebank 2.0

Publisher:: Department of Informatics, Human Language Technology Group, University of Szeged
Format:: application/xml
Type:: corpus
Language:: Hungarian
Description:: 82,000 sentences with full syntactic annotation.
Rights:: Not specified