Search Results
Creator:
Mareček, David , Yu, Zhiwei , Zeman, Daniel , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
part of speech , tagging , semi-supervised , and cross-language
Language:
Belarusian , Bosnian , Bulgarian , Czech , Serbo-Croatian , Croatian , Upper Sorbian , Macedonian , Polish , Russian , Slovak , Slovenian , Serbian , Ukrainian , Latvian , Lithuanian , Afrikaans , Danish , German , English , Faroese , Western Frisian , Swiss German , Icelandic , Limburgan , Luxembourgish , Low German , Dutch , Norwegian Nynorsk , Norwegian , Scots , Swedish , Yiddish , Aragonese , Asturian , Catalan , French , Galician , Haitian , Italian , Latin , Lombard , Neapolitan , Piemontese , Portuguese , Romanian , Spanish , Venetian , Walloon , Breton , Welsh , Scottish Gaelic , Irish , Modern Greek (1453-) , Armenian , Albanian , Dimli (individual language) , Persian , Gilaki , Kurdish , Tajik , Bengali , Bishnupriya , Gujarati , Fiji Hindi , Hindi , Marathi , Nepali (macrolanguage) , Urdu , Amharic , Arabic , Egyptian Arabic , Hebrew , Estonian , Finnish , Hungarian , Basque , Georgian , Chuvash , Azerbaijani , Turkish , Uzbek , Kazakh , Tatar , Yakut , Korean , Mongolian , Telugu , Kannada , Malayalam , Tamil , Newari , Vietnamese , Indonesian , Javanese , Malagasy , Maori , Malay (macrolanguage) , Pampanga , Sundanese , Tagalog , Waray (Philippines) , Swahili (macrolanguage) , Esperanto , Ido , Interlingua (International Auxiliary Language Association) , and Volapük
Description:
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
Creator:
Mareček, David , Yu, Zhiwei , Zeman, Daniel , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
part of speech , tagging , semi-supervised , and cross-language
Language:
Belarusian , Bosnian , Bulgarian , Czech , Serbo-Croatian , Croatian , Upper Sorbian , Macedonian , Polish , Russian , Slovak , Slovenian , Serbian , Ukrainian , Latvian , Lithuanian , Afrikaans , Danish , German , English , Faroese , Western Frisian , Swiss German , Icelandic , Limburgan , Luxembourgish , Low German , Dutch , Norwegian Nynorsk , Norwegian , Scots , Swedish , Yiddish , Aragonese , Asturian , Catalan , French , Galician , Haitian , Italian , Latin , Lombard , Neapolitan , Piemontese , Portuguese , Romanian , Spanish , Venetian , Walloon , Breton , Welsh , Scottish Gaelic , Irish , Modern Greek (1453-) , Armenian , Albanian , Dimli (individual language) , Persian , Gilaki , Kurdish , Tajik , Bengali , Bishnupriya , Gujarati , Fiji Hindi , Hindi , Marathi , Nepali (macrolanguage) , Urdu , Amharic , Arabic , Egyptian Arabic , Hebrew , Estonian , Finnish , Hungarian , Basque , Georgian , Chuvash , Azerbaijani , Turkish , Uzbek , Kazakh , Tatar , Yakut , Korean , Mongolian , Telugu , Kannada , Malayalam , Tamil , Newari , Vietnamese , Indonesian , Javanese , Malagasy , Maori , Malay (macrolanguage) , Pampanga , Sundanese , Tagalog , Waray (Philippines) , Swahili (macrolanguage) , Esperanto , Ido , Interlingua (International Auxiliary Language Association) , and Volapük
Description:
Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Changes in version 1.1:
1. Universal Dependencies tagset instead of the older and smaller Google Universal POS tagset.
2. SVM classifier trained on Universal Dependencies 1.2 instead of HamleDT 2.0.
3. Balto-Slavic, Germanic and Romance languages were each tagged by a classifier trained only on the respective group of languages. Other languages were tagged by a classifier trained on all available languages. The "c7" combination from version 1.0 is no longer used.
Rights:
Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) , http://creativecommons.org/licenses/by-sa/4.0/ , and PUB
Creator:
Zawistowska, Renata
Subject:
Hungarians in Slovakia , Hungarians , and Czechoslovakia 1918-1992
Language:
Polish
Rights:
unknown
Creator:
Kłodnicki, Zygmunt
Type:
text and study
Subject:
Cultural anthropology. Ethnology. Ethnography , demonology , ethnology , methodology , Poland , general surveys of world history (chronological) , church and religious history , and foreign ethnography
Language:
Polish
Description:
Proposals for a systematic classification of folk demonology in the Polish Ethnographic Atlas in Cieszyn.
Rights:
unknown
Creator:
Ptaśnik, Jan
Type:
text and monograph
Subject:
History of Central European countries , Polish history , Peter's Pence , papacy , papal revenues , Poland , papacy, church policy , and world history of the Middle Ages (to 1492)
Language:
Polish
Rights:
unknown
Creator:
Dymowski, Arkadiusz
Type:
text and articles
Subject:
Sculpture, ceramics, porcelain, artistic metalwork , Augustus , ancient coins , coins, finds , ancient Greece, Crete , coin finds , and Poland
Language:
Polish
Description:
Roman Republican denarii and denarii of Emperor Augustus found in Poland.
Rights:
unknown
Creator:
Gumowski, Marian
Type:
text and study
Subject:
Sculpture, ceramics, porcelain, artistic metalwork , Vojtěch , coinage , coins , coins, denarii , individual coins , Czech lands 895/906-1197 , and finance
Language:
Polish
Rights:
unknown
Type:
text and conference proceedings
Subject:
Literature (theory) , Holocaust , literary themes , foreign periodicals and collected volumes , Czechoslovakia 1989-1992 , Czech lands from 1993 to the present , world history from 1945 to the present , literature, writers , and art history, patronage
Language:
German , Czech , English , and Polish
Rights:
unknown
Creator:
Schmilewski, Ulrich
Publisher:
Verein für Geschichte Schlesiens e. V.
Subject:
Silesian nobility , world history of the Middle Ages (to 1492) , Poland , nobility, bourgeoisie, burghers, entrepreneurs , and Czech lands 1197-1306
Language:
Polish
Rights:
unknown
Creator:
Jan Patočka
Publisher:
Neue Zeitschrift für systematische Theologie und Religionsphilosophie 15 (Berlin 1973), no. 3, pp. 291–303. Article, in German.
Type:
Text
Subject:
1973 , 1977/12 , 1978/7 , 1979/16 , 1979/31 , 1985/1 , 1986/2 , 1987/31 , 1990/4 , 1992/11 , 2004/1 , 2004/2 , 2004/7-8 , AS/UF-2 , AS/UF-4 , cs , de , fr , pl , SS-4/UČ-I , SS-5/UČ-II , and Article, in German
Language:
German , Czech , French , and Polish
Rights:
open access and Rights holder: Archiv Jana Patočky, z.s.