Search Results
Creator:
Zeman, Daniel and Droganova, Kira
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
semantic dependency and universal dependencies
Language:
Afrikaans , Assyrian Neo-Aramaic , Akkadian , Amharic , Arabic , Belarusian , Breton , Bulgarian , Russia Buriat , Catalan , Czech , Church Slavic , Mandarin Chinese , Coptic , Welsh , Danish , German , Modern Greek (1453-) , English , Estonian , Basque , Faroese , Finnish , French , Irish , Gothic , Ancient Greek (to 1453) , Mbyá Guaraní , Hebrew , Hindi , Croatian , Upper Sorbian , Hungarian , Armenian , Indonesian , Italian , Japanese , Kazakh , Northern Kurdish , Korean , Komi-Zyrian , Karelian , Latin , Latvian , Lithuanian , Literary Chinese , Marathi , Erzya , Dutch , Norwegian , Old Russian , Nigerian Pidgin , Polish , Portuguese , Romanian , Russian , Sanskrit , Slovak , Slovenian , Northern Sami , Spanish , Serbian , Swedish , Tamil , Tagalog , Turkish , Ukrainian , Urdu , Vietnamese , Warlpiri , Wolof , Yoruba , Galician , Bhojpuri , Komi-Permyak , Livvi , Moksha , Scottish Gaelic , Skolt Sami , Icelandic , Albanian , Persian , Akuntsu , Apurinã , Khunsari , Manx , Mundurukú , Nayini , Soi , South Levantine Arabic , Tupinambá , Beja , Western Frisian , Urubú-Kaapor , Kangri , K'iche' , Low German , Makuráp , Western Armenian , and Central Siberian Yupik
Description:
Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3687). It contains additional deep-syntactic and semantic annotations. Each version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
Rights:
Licence Universal Dependencies v2.8 , https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.8 , and PUB
Creator:
Mareček, David , Yu, Zhiwei , Zeman, Daniel , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
part of speech , tagging , semi-supervised , and cross-language
Language:
Belarusian , Bosnian , Bulgarian , Czech , Serbo-Croatian , Croatian , Upper Sorbian , Macedonian , Polish , Russian , Slovak , Slovenian , Serbian , Ukrainian , Latvian , Lithuanian , Afrikaans , Danish , German , English , Faroese , Western Frisian , Swiss German , Icelandic , Limburgan , Luxembourgish , Low German , Dutch , Norwegian Nynorsk , Norwegian , Scots , Swedish , Yiddish , Aragonese , Asturian , Catalan , French , Galician , Haitian , Italian , Latin , Lombard , Neapolitan , Piemontese , Portuguese , Romanian , Spanish , Venetian , Walloon , Breton , Welsh , Scottish Gaelic , Irish , Modern Greek (1453-) , Armenian , Albanian , Dimli (individual language) , Persian , Gilaki , Kurdish , Tajik , Bengali , Bishnupriya , Gujarati , Fiji Hindi , Hindi , Marathi , Nepali (macrolanguage) , Urdu , Amharic , Arabic , Egyptian Arabic , Hebrew , Estonian , Finnish , Hungarian , Basque , Georgian , Chuvash , Azerbaijani , Turkish , Uzbek , Kazakh , Tatar , Yakut , Korean , Mongolian , Telugu , Kannada , Malayalam , Tamil , Newari , Vietnamese , Indonesian , Javanese , Malagasy , Maori , Malay (macrolanguage) , Pampanga , Sundanese , Tagalog , Waray (Philippines) , Swahili (macrolanguage) , Esperanto , Ido , Interlingua (International Auxiliary Language Association) , and Volapük
Description:
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
Creator:
Mareček, David , Yu, Zhiwei , Zeman, Daniel , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
part of speech , tagging , semi-supervised , and cross-language
Language:
Belarusian , Bosnian , Bulgarian , Czech , Serbo-Croatian , Croatian , Upper Sorbian , Macedonian , Polish , Russian , Slovak , Slovenian , Serbian , Ukrainian , Latvian , Lithuanian , Afrikaans , Danish , German , English , Faroese , Western Frisian , Swiss German , Icelandic , Limburgan , Luxembourgish , Low German , Dutch , Norwegian Nynorsk , Norwegian , Scots , Swedish , Yiddish , Aragonese , Asturian , Catalan , French , Galician , Haitian , Italian , Latin , Lombard , Neapolitan , Piemontese , Portuguese , Romanian , Spanish , Venetian , Walloon , Breton , Welsh , Scottish Gaelic , Irish , Modern Greek (1453-) , Armenian , Albanian , Dimli (individual language) , Persian , Gilaki , Kurdish , Tajik , Bengali , Bishnupriya , Gujarati , Fiji Hindi , Hindi , Marathi , Nepali (macrolanguage) , Urdu , Amharic , Arabic , Egyptian Arabic , Hebrew , Estonian , Finnish , Hungarian , Basque , Georgian , Chuvash , Azerbaijani , Turkish , Uzbek , Kazakh , Tatar , Yakut , Korean , Mongolian , Telugu , Kannada , Malayalam , Tamil , Newari , Vietnamese , Indonesian , Javanese , Malagasy , Maori , Malay (macrolanguage) , Pampanga , Sundanese , Tagalog , Waray (Philippines) , Swahili (macrolanguage) , Esperanto , Ido , Interlingua (International Auxiliary Language Association) , and Volapük
Description:
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Changes in version 1.1:
1. Universal Dependencies tagset instead of the older and smaller Google Universal POS tagset.
2. SVM classifier trained on Universal Dependencies 1.2 instead of HamleDT 2.0.
3. Balto-Slavic, Germanic and Romance languages were each tagged by a classifier trained only on the respective group of languages. Other languages were tagged by a classifier trained on all available languages. The "c7" combination from version 1.0 is no longer used.
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
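The classifier selection described under change 3 of version 1.1 can be sketched as a simple lookup. This is only an illustration under stated assumptions: the function name `select_classifier`, the ISO 639-3 codes, and the abbreviated group lists are hypothetical and are not taken from the corpus release or from the tagger of Yu et al. (2016).

```python
# Hypothetical, abbreviated language-group membership using ISO 639-3 codes.
# These sets are illustrative only; they are not the grouping shipped with the corpus.
LANGUAGE_GROUPS = {
    "balto-slavic": {"bel", "bul", "ces", "pol", "rus", "slk", "slv", "ukr", "lav", "lit"},
    "germanic": {"afr", "dan", "deu", "eng", "nld", "swe", "isl"},
    "romance": {"cat", "fra", "ita", "por", "ron", "spa"},
}


def select_classifier(lang_code: str) -> str:
    """Return the name of the training set whose classifier should tag this language.

    Languages in one of the three groups get the group-specific classifier;
    every other language falls back to the classifier trained on all languages.
    """
    for group, members in LANGUAGE_GROUPS.items():
        if lang_code in members:
            return group
    return "all"


if __name__ == "__main__":
    for code in ("ces", "deu", "spa", "fin"):
        print(code, "->", select_classifier(code))
```

Keeping the fallback as an explicit `"all"` label mirrors the record's statement that languages outside the three groups were tagged by a classifier trained on all available languages.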
Creator:
Rábik, Vladimír
Type:
text , sources , and diplomataria
Subject:
Historical science. Auxiliary historical sciences. Archival science , Hungarian charters , diplomatics , editions , noble families , nobility, bourgeoisie, burghers, entrepreneurs , Slovakia 1301-1526 , and diplomatics, editions
Language:
Slovak , Latin , and English
Rights:
unknown
Creator:
Rábik, Vladimír
Type:
text , sources , and diplomataria
Subject:
Historical science. Auxiliary historical sciences. Archival science , Hungarian charters , diplomatics , editions , noble families , nobility, bourgeoisie, burghers, entrepreneurs , diplomatics, editions , and Slovakia 1301-1526
Language:
Slovak , Latin , and English
Description:
Includes indexes
Rights:
unknown
Creator:
Hunčaga, Gabriel Peter
Publisher:
Towarzystwo Słowaków w Polsce and Chronos
Type:
monograph
Subject:
Christian associations, societies and organizations. Religious orders , order, Dominicans , church history , education , schools of religious orders , libraries of religious orders , world history of the Middle Ages (to 1492) , Slovakia 1197-1301 , Czech lands 1197-1306 , religious orders and congregations, religious brotherhoods, monasteries , and history of science, art, culture and technology, cultural relations
Language:
Slovak and Latin
Description:
"Published by the Centre for the Study of Christianity at the Chronos publishing house in cooperation with the Society of Slovaks in Poland and the Slovak Commission for Comparative Church History"--P. [6]
Rights:
unknown
Type:
text and jubilee volumes
Subject:
Philology , Dostálová, Růžena , classical philology , Byzantine studies , and Czech (Czechoslovak) collected volumes and collective monographs
Language:
Czech , English , French , German , Modern Greek (1453-) , Latin , Slovak , and Spanish
Rights:
unknown
Format:
print
Type:
model:supplement and TEXT
Language:
Czech , Latin , Slovak , and French
Description:
Contents of volume LXX (2022)
Rights:
http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
Type:
text and conference proceedings
Subject:
Christian churches, sects, denominations , ecumenism , Christianity , church and religious history , Czech (Czechoslovak) collected volumes and collective monographs , and Czech lands 1848-1914
Language:
Czech , Latin , and Slovak
Description:
Published to mark the 100th anniversary of the First Unionist Congress at Velehrad in 1907
Rights:
unknown
Creator:
Bernolák, Anton
Type:
text , studies , and sources
Subject:
Slovak literature , editions , linguistics , national revival , language, script , Slovakia 1711-1780 , Slovakia 1780-1847 , and history of literature, language and the book
Language:
Slovak and Latin
Description:
Translated from Latin
Rights:
unknown