Search Results
- Type:
- corpus
- Language:
- Bulgarian and Croatian
- Description:
- written; domain-specific (newspaper); diachronic; bilingual; comparable; ca 3,500,000 tokens (393 Kw Bulgarian; 3.1 Mw Croatian)
- Rights:
- Not specified
- Publisher:
- University of Zagreb, Faculty of Humanities and Social Sciences
- Format:
- application/octet-stream
- Type:
- corpus
- Language:
- Croatian
- Description:
- Manually tagged dependency treebank; the analytical layer follows the Prague Dependency Treebank (PDT) formalism, adapted for Croatian
- Rights:
- Not specified
- Publisher:
- University of Zagreb, Faculty of Humanities and Social Sciences
- Type:
- corpus
- Language:
- Croatian
- Description:
- written; reference corpus; general; synchronic; monolingual; 101,215,912 tokens
- Rights:
- Not specified
- Publisher:
- University of Zagreb, Faculty of Humanities and Social Sciences
- Type:
- corpus
- Language:
- Croatian and English
- Description:
- written; domain-specific (newspaper); synchronic; bilingual; parallel; unidirectional; XML; S-alignment
- Rights:
- Not specified
- Type:
- corpus
- Language:
- Croatian and French
- Description:
- written; domain-specific (fiction); diachronic (the French side); bilingual; parallel; ca 263,000 tokens (148 Kw French; 115 Kw Croatian); XML; S-alignment
- Rights:
- Not specified
- Publisher:
- Max Planck Institute for Psycholinguistics
- Type:
- corpus
- Language:
- Croatian, German, Russian, and Turkish
- Description:
- Language Acquisition corpus
- Rights:
- Not specified
- Publisher:
- University of Zagreb, Faculty of Humanities and Social Sciences
- Type:
- corpus
- Language:
- Croatian
- Description:
- written; reference corpus; general; diachronic; monolingual
- Rights:
- Not specified
- Publisher:
- University of Leipzig
- Type:
- corpus
- Language:
- Afrikaans, Albanian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, German, Hungarian, Icelandic, Indonesian, Italian, Japanese, Korean, Latin, Latvian, Lithuanian, Malay (macrolanguage), Norwegian, Occitan (post 1500), Romanian, Russian, Slovak, Slovenian, Spanish, Sundanese, Swedish, Tagalog, Turkish, Vietnamese, and Welsh
- Description:
- Collected from newspaper texts, web crawling, etc.: words (+frequency), co-occurrences (+graph), left/right neighbours, example sentences
- Rights:
- Not specified