Search Results
Creator:
Mareček, David , Yu, Zhiwei , Zeman, Daniel , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
part of speech , tagging , semi-supervised , and cross-language
Language:
Belarusian , Bosnian , Bulgarian , Czech , Serbo-Croatian , Croatian , Upper Sorbian , Macedonian , Polish , Russian , Slovak , Slovenian , Serbian , Ukrainian , Latvian , Lithuanian , Afrikaans , Danish , German , English , Faroese , Western Frisian , Swiss German , Icelandic , Limburgan , Luxembourgish , Low German , Dutch , Norwegian Nynorsk , Norwegian , Scots , Swedish , Yiddish , Aragonese , Asturian , Catalan , French , Galician , Haitian , Italian , Latin , Lombard , Neapolitan , Piemontese , Portuguese , Romanian , Spanish , Venetian , Walloon , Breton , Welsh , Scottish Gaelic , Irish , Modern Greek (1453-) , Armenian , Albanian , Dimli (individual language) , Persian , Gilaki , Kurdish , Tajik , Bengali , Bishnupriya , Gujarati , Fiji Hindi , Hindi , Marathi , Nepali (macrolanguage) , Urdu , Amharic , Arabic , Egyptian Arabic , Hebrew , Estonian , Finnish , Hungarian , Basque , Georgian , Chuvash , Azerbaijani , Turkish , Uzbek , Kazakh , Tatar , Yakut , Korean , Mongolian , Telugu , Kannada , Malayalam , Tamil , Newari , Vietnamese , Indonesian , Javanese , Malagasy , Maori , Malay (macrolanguage) , Pampanga , Sundanese , Tagalog , Waray (Philippines) , Swahili (macrolanguage) , Esperanto , Ido , Interlingua (International Auxiliary Language Association) , and Volapük
Description:
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Changes in version 1.1:
1. Universal Dependencies tagset instead of the older and smaller Google Universal POS tagset.
2. SVM classifier trained on Universal Dependencies 1.2 instead of HamleDT 2.0.
3. Balto-Slavic, Germanic, and Romance languages were each tagged by a classifier trained only on the respective language group. All other languages were tagged by a classifier trained on all available languages. The "c7" combination from version 1.0 is no longer used.
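The routing rule in item 3 can be sketched as follows. This is a minimal illustration, not the authors' code: the names `GROUP_OF` and `classifier_for`, the classifier labels, and the abbreviated group table are all assumptions made for the example.

```python
# Hypothetical, abbreviated mapping from language to language group.
# The real release covers many more languages per group.
GROUP_OF = {
    "Czech": "balto-slavic",
    "Latvian": "balto-slavic",
    "German": "germanic",
    "French": "romance",
}


def classifier_for(language: str) -> str:
    """Return the (hypothetical) label of the classifier used for a language.

    Languages in one of the three groups get the group-specific classifier;
    everything else falls back to the classifier trained on all languages.
    """
    return GROUP_OF.get(language, "all-languages")
```

For example, `classifier_for("Czech")` selects the Balto-Slavic classifier, while a language outside the three groups, such as Turkish, falls back to the all-languages classifier.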
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
Type:
text and conference proceedings
Subject:
Law , History of Czechia and Slovakia , Karel , Middle Ages , law , and Czech (Czechoslovak) proceedings and collective monographs
Language:
Czech , French , German , and Slovak
Rights:
unknown
Type:
text and conference proceedings
Subject:
Law , History of Czechia and Slovakia , Karel , Middle Ages , law , and Czech (Czechoslovak) proceedings and collective monographs
Language:
Czech , French , German , and Slovak
Rights:
unknown
Type:
text and festschrifts
Subject:
Philology , Dostálová, Růžena , classical philology , Byzantine studies , and Czech (Czechoslovak) proceedings and collective monographs
Language:
Czech , English , French , German , Modern Greek (1453-) , Latin , Slovak , and Spanish
Rights:
unknown
Publisher:
Slovenská národná knižnica
Type:
conference proceedings
Subject:
Manuscripts, incunabula, early printed books. Rare and remarkable works , early printed books , book culture , Romance literature , foreign periodicals and proceedings , modern world history (1492-1918) , and history of the book, printing, publishing
Language:
Slovak , Czech , Spanish , German , and French
Description:
Partly Czech, Spanish, French, and German text.
Rights:
unknown
Creator:
Biathová, Katarína
Type:
text and monograph
Subject:
Painting , Gothic , visual arts , Gothic painting , panel painting , Slovakia 1301-1526 , and painting, painters
Language:
Slovak , Russian , German , English , and French
Description:
Russian, German, English, and French text
Rights:
unknown
Creator:
Zeman, Daniel , Mareček, David , Mašek, Jan , Popel, Martin , Ramasamy, Loganathan , Rosa, Rudolf , Štěpánek, Jan , and Žabokrtský, Zdeněk
Publisher:
Charles University
Type:
text and corpus
Subject:
annotated corpus , morphology , syntax , dependency , treebank , harmonized annotation , and common annotation style
Language:
Arabic , Basque , Bengali , Bulgarian , Catalan , Croatian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Modern Greek (1453-) , Ancient Greek (to 1453) , Hebrew , Hindi , Hungarian , Indonesian , Irish , Italian , Japanese , Latin , Persian , Polish , Portuguese , Romanian , Russian , Slovak , Slovenian , Spanish , Swedish , Tamil , Telugu , and Turkish
Description:
HamleDT (HArmonized Multi-LanguagE Dependency Treebank) is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. This version uses Universal Dependencies as the common annotation style.
Update (November 2017): for a current collection of harmonized dependency treebanks, we recommend using Universal Dependencies (UD). All of the corpora that are distributed in HamleDT in full are also part of the UD project; only some corpora from the Patch group (where HamleDT provides only the harmonizing scripts but not the full corpus data) are available in HamleDT but not in UD.
Rights:
HamleDT 3.0 License Terms , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-3.0 , and PUB
Publisher:
Joint Research Centre of the EU
Type:
corpus
Language:
Bulgarian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Modern Greek (1453-) , Hungarian , Italian , Latvian , Maltese , Norwegian , Polish , Portuguese , Romanian , Slovak , Slovenian , Spanish , and Swedish
Description:
The largest parallel corpus; it contains the body of EU law (the Acquis Communautaire) in 22 languages.
Rights:
Not specified
Creator:
Krajina a dům, vzdálenost a blízkost, nahoře a dole... (2004 : Praha, Česko) , Fedrová, Stanislava , Hejk, Jan , and Jedličková, Alice
Publisher:
Univerzita Karlova, Pedagogická fakulta
Format:
print and 233 pp. : ill. ; 21 cm
Type:
model:monograph and TEXT
Subject:
Česká literatura (o ní) , od 1989 , česká literatura , textová analýza , literární náměty , prostor (umění) , Czech literature , 1890- , textual criticism , literary themes , space (art) , 821.162.3 , 801.73 , 82:7.04 , 7.01 , (062.534) , 11 , and 821.162.3.09
Language:
Czech , Slovak , English , French , and German
Description:
Papers from the student literary studies conference of PedF UK , prepared for publication by Stanislava Fedrová, Jan Hejk, and Alice Jedličková , Includes bibliographies and bibliographical references , and Partly Slovak text, with English, French, and German summaries
Rights:
http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
Type:
text and conference proceedings
Subject:
Law , Kristián z Koldína, Pavel , municipal law , towns, municipalities , modern world history (1492-1918) , and Czech (Czechoslovak) proceedings and collective monographs
Language:
Czech , French , German , Russian , and Slovak
Description:
Bibliography compiled by Karel Schelle
Rights:
unknown