Search Results
2. Acta onomastica
- Type:
- text and collected volumes
- Subject:
- Serial publications. Periodicals, onomastics, toponomastics, and Czech periodicals
- Language:
- Czech, Russian, English, Slovak, and Polish
- Rights:
- unknown
3. Acta onomastica
- Type:
- text and journals
- Subject:
- Serial publications. Periodicals, onomastics, toponomastics, and Czech periodicals
- Language:
- Czech, Russian, English, Slovak, and Polish
- Rights:
- unknown
4. Aktuální otázky slovanské filologie a Šafaříkův vědecký odkaz /
- Type:
- text and collected volumes
- Subject:
- Philology, Šafařík, Pavel Josef, Slavic studies, Slavists, Slavic philology, Czech (Czechoslovak) collected volumes and collective monographs, Czech lands 1792-1918, and history of Slavic studies
- Language:
- Czech, English, German, Italian, Polish, Russian, and Slovak
- Description:
- Special offprint of the journal Slavia 65 (1996), issue 1, pp. 1-162
- Rights:
- unknown
5. Bibliografický přehled českých národních písní: seznam studií, starších sbírek rukopisných, sbírek tištěných, překladů s vybranými ukázkami a podrobný abecední ukazatel písní, v knize uvedených i vůbec písní tiskem uveřejněných
- Creator:
- Čeněk Zíbrt and Česká akademie císaře Františka Josefa pro vědy, slovesnost a umění
- Publisher:
- Published at the expense of the Česká akademie císaře Františka Josefa pro vědy, slovesnost a umění
- Format:
- print, volume, and 326 pages
- Type:
- model:monograph and TEXT
- Subject:
- Vocal music, Bibliographies. Catalogs, Czech folk songs, historical sources, Czechia, 784.4(=162.3), (016), (437.3), 9, 12, 784, and 01
- Language:
- Czech, English, French, German, Italian, Latin, Polish, and Russian
- Description:
- Compiled by Čeněk Zíbrt; includes indexes; partly parallel English, French, German, Italian, Latin, Polish, and Russian text; published by Class III of the Česká akademie císaře Františka Josefa pro vědy, slovesnost a umění in Prague
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
6. C4Corpus (CC BY-NC part)
- Creator:
- Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
- Publisher:
- Technische Universität Darmstadt
- Type:
- text and corpus
- Subject:
- CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
- Language:
- Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Panjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Vietnamese, and Chinese
- Description:
- A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
- Rights:
- Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB
7. C4Corpus (CC BY-NC-ND part)
- Creator:
- Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
- Publisher:
- Technische Universität Darmstadt
- Type:
- text and corpus
- Subject:
- CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
- Language:
- Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
- Description:
- A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
- Rights:
- Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), http://creativecommons.org/licenses/by-nc-nd/4.0/, and PUB
8. C4Corpus (CC BY-NC-SA part)
- Creator:
- Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
- Publisher:
- Technische Universität Darmstadt
- Type:
- text and corpus
- Subject:
- CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
- Language:
- Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
- Description:
- A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
9. C4Corpus (CC BY-ND part)
- Creator:
- Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
- Publisher:
- Technische Universität Darmstadt
- Type:
- text and corpus
- Subject:
- CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
- Language:
- Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malayalam, Macedonian, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Vietnamese, and Chinese
- Description:
- A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
- Rights:
- Creative Commons - Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0), http://creativecommons.org/licenses/by-nd/4.0/, and PUB
10. C4Corpus (CC BY-SA part)
- Creator:
- Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
- Publisher:
- Technische Universität Darmstadt
- Type:
- text and corpus
- Subject:
- CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
- Language:
- Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Panjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
- Description:
- A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
- Rights:
- Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
11. C4Corpus (CC-BY part)
- Creator:
- Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
- Publisher:
- Technische Universität Darmstadt
- Type:
- text and corpus
- Subject:
- CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
- Language:
- Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Panjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
- Description:
- A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
- Rights:
- Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
12. C4Corpus (publicdomain part)
- Creator:
- Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
- Publisher:
- Technische Universität Darmstadt
- Type:
- text and corpus
- Subject:
- CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
- Language:
- Afrikaans, Arabic, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Dutch, Norwegian, Polish, Portuguese, Russian, Slovenian, Somali, Spanish, Swahili (macrolanguage), Swedish, Tagalog, Thai, Turkish, Ukrainian, Undetermined, and Vietnamese
- Description:
- A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
- Rights:
- Public Domain Mark (PD), http://creativecommons.org/publicdomain/mark/1.0/, and PUB
13. Česká a slovenská slavistická komparatistika a wollmanovská tradice /
- Type:
- text and collective monographs
- Subject:
- Philology, Wollman, Frank, Slavists, Slavic studies, comparative linguistics, comparative literature, and Czech (Czechoslovak) collected volumes and collective monographs
- Language:
- Czech, English, Polish, Russian, Slovak, and Ukrainian
- Description:
- Published in cooperation with the Středoevropské centrum slovanských studií, the Slavistická společnost Franka Wollmana, and the Ústav slavistiky FF MU; published by Jan Sojnek - Galium
- Rights:
- unknown
14. Československá zahraniční politika v roce 1943.
- Type:
- text, documents, and editions
- Subject:
- International relations, world politics, foreign policy, international relations, government in exile, Second World War (1939-1945), second (anti-fascist) resistance, Czechoslovakia 1938-1945, and foreign policy, international relations
- Language:
- Czech, English, French, Polish, Russian, and Slovak
- Description:
- Authentic documents revealing the political and diplomatic relations of the Czechoslovak political leadership with the great powers and other states from the beginning of August to the end of December 1943.
- Rights:
- unknown
15. CoNLL 2017 and 2018 Shared Task Blind and Preprocessed Test Data
- Creator:
- Zeman, Daniel and Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- tokenization, word segmentation, morphology, tagging, syntax, parsing, and universal dependencies
- Language:
- Afrikaans, Arabic, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Persian, Finnish, French, Old French (842-ca. 1400), Irish, Galician, Gothic, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Latin, Latvian, Dutch, Norwegian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Thai, Turkish, Uighur, Ukrainian, Urdu, Vietnamese, and Chinese
- Description:
- CoNLL 2017 and 2018 shared tasks: Multilingual Parsing from Raw Text to Universal Dependencies. This package contains the test data in the form in which they were presented to the participating systems: raw text files and files preprocessed by UDPipe. The metadata.json files contain lists of files to process and to output; README files in the respective folders describe the syntax of metadata.json. For full training, development and gold standard test data, see Universal Dependencies 2.0 (CoNLL 2017) and Universal Dependencies 2.2 (CoNLL 2018); the download links are at http://universaldependencies.org/. For more information on the shared tasks, see http://universaldependencies.org/conll17/ and http://universaldependencies.org/conll18/. Contents: conll17-ud-test-2017-05-09 (CoNLL 2017 test data); conll18-ud-test-2018-05-06 (CoNLL 2018 test data); conll18-ud-test-2018-05-06-for-conll17 (CoNLL 2018 test data with metadata and filenames modified so that it is digestible by the 2017 systems).
- Rights:
- Licence Universal Dependencies v2.2, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.2, and PUB
16. CoNLL 2017 Shared Task System Outputs
- Creator:
- Zeman, Daniel, Potthast, Martin, Straka, Milan, Popel, Martin, Dozat, Timothy, Qi, Peng, Manning, Christopher, Shi, Tianze, Wu, Felix G., Chen, Xilun, Cheng, Yao, Björkelund, Anders, Falenska, Agnieszka, Yu, Xiang, Kuhn, Jonas, Che, Wanxiang, Guo, Jiang, Wang, Yuxuan, Zheng, Bo, Zhao, Huaipeng, Liu, Yang, Teng, Dechuan, Liu, Ting, Lim, Kyungtae, Poibeau, Thierry, Sato, Motoki, Manabe, Hitoshi, Noji, Hiroshi, Matsumoto, Yuji, Kırnap, Ömer, Önder, Berkay Furkan, Yuret, Deniz, Straková, Jana, Vania, Clara, Zhang, Xingxing, Lopez, Adam, Heinecke, Johannes, Asadullah, Munshi, Kanerva, Jenna, Luotolahti, Juhani, Ginter, Filip, Kuan, Yu, Sofroniev, Pavel, Schill, Erik, Hinrichs, Erhard, Nguyen, Dat Quoc, Dras, Mark, Johnson, Mark, Qian, Xian, Vilares, David, Gómez-Rodríguez, Carlos, Aufrant, Lauriane, Wisniewski, Guillaume, Yvon, François, Dumitrescu, Stefan Daniel, Boroş, Tiberiu, Tufiş, Dan, Das, Ayan, Zaffar, Affan, Sarkar, Sudeshna, Wang, Hao, Zhao, Hai, Zhang, Zhisong, Hornby, Ryan, Taylor, Clark, Park, Jungyeul, de Lhoneux, Miryam, Shao, Yan, Basirat, Ali, Kiperwasser, Eliyahu, Stymne, Sara, Goldberg, Yoav, Nivre, Joakim, Akkuş, Burak Kerim, Azizoglu, Heval, Cakici, Ruket, Moor, Christophe, Merlo, Paola, Henderson, James, Wang, Haozhou, Ji, Tao, Wu, Yuanbin, Lan, Man, de la Clergerie, Eric, Sagot, Benoît, Seddah, Djamé, More, Amir, Tsarfaty, Reut, Kanayama, Hiroshi, Muraoka, Masayasu, Yoshikawa, Katsumasa, Garcia, Marcos, and Gamallo, Pablo
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- dependency parser and parsebank
- Language:
- Arabic, Bulgarian, Russia Buriat, Czech, Catalan, Church Slavic, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, French, Irish, Galician, Gothic, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Latin, Latvian, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Northern Sami, Swedish, Turkish, Uighur, Ukrainian, Urdu, Vietnamese, and Chinese
- Description:
- This package contains the system outputs from the CoNLL 2017 Shared Task in Multilingual Parsing from Raw Text to Universal Dependencies.
- Rights:
- Licence Universal Dependencies v2.0, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.0, and PUB
17. CoNLL 2018 Shared Task System Outputs
- Creator:
- Zeman, Daniel, Potthast, Martin, Duthoo, Elie, Mesnard, Olivier, Rybak, Piotr, Wróblewska, Alina, Che, Wanxiang, Liu, Yijia, Wang, Yuxuan, Zheng, Bo, Liu, Ting, Li, Zuchao, He, Shexia, Zhang, Zhuosheng, Zhao, Hai, Wu, Yingting, Tong, Jia-Jun, Nguyen, Dat Quoc, Verspoor, Karin, Wan, Hui, Naseem, Tahira, Lee, Young-Suk, Castelli, Vittorio, Ballesteros, Miguel, Hershcovich, Daniel, Abend, Omri, Rappoport, Ari, Smith, Aaron, Bohnet, Bernd, de Lhoneux, Miryam, Nivre, Joakim, Shao, Yan, Stymne, Sara, Kırnap, Ömer, Dayanık, Erenay, Yuret, Deniz, Kanerva, Jenna, Ginter, Filip, Miekka, Niko, Leino, Akseli, Salakoski, Tapio, Lim, KyungTae, Park, Cheoneum, Lee, Changki, Poibeau, Thierry, Bhat, Riyaz Ahmad, Bhat, Irshad, Bangalore, Srinivas, Qi, Peng, Dozat, Timothy, Zhang, Yuhao, Manning, Christopher, Boroș, Tiberiu, Dumitrescu, Stefan Daniel, Burtica, Ruxandra, Arakelyan, Gor, Hambardzumyan, Karen, Khachatrian, Hrant, Rosa, Rudolf, Mareček, David, Straka, Milan, Seker, Amit, More, Amir, Tsarfaty, Reut, Önder, Berkay Furkan, Gümeli, Can, Jawahar, Ganesh, Muller, Benjamin, Fethi, Amal, Martin, Louis, Villemonte de la Clergerie, Eric, Sagot, Benoît, Seddah, Djamé, Özateş, Şaziye Betül, Özgür, Arzucan, Gungor, Tunga, Öztürk, Balkız, Ji, Tao, Liu, Yufang, Wang, Yijun, Wu, Yuanbin, Lan, Man, Chen, Danlu, Lin, Mengxiao, Hu, Zhifeng, and Qiu, Xipeng
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- parsed data, conllu, and universal dependencies
- Language:
- Afrikaans, Arabic, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Persian, Finnish, French, Old French (842-ca. 1400), Irish, Galician, Gothic, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Latin, Latvian, Dutch, Norwegian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Thai, Turkish, Uighur, Ukrainian, Urdu, Vietnamese, and Chinese
- Description:
- Test data parsed by systems submitted to the CoNLL 2018 UD parsing shared task.
- Rights:
- Licence Universal Dependencies v2.2, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.2, and PUB
18. Coreference in Universal Dependencies 0.1 (CorefUD 0.1)
- Creator:
- Nedoluzhko, Anna, Novák, Michal, Popel, Martin, Žabokrtský, Zdeněk, and Zeman, Daniel
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- dependency, treebank, coreference, bridging relations, and harmonized annotation
- Language:
- Catalan, Czech, Dutch, English, French, German, Hungarian, Lithuanian, Polish, Russian, and Spanish
- Description:
- CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version 0.1 consists of 17 datasets for 11 languages. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference- and bridging-specific information captured by attribute-value pairs located in the MISC column. The collection is divided into a public edition and a non-public (ÚFAL-internal) edition. The publicly available edition is distributed via LINDAT-CLARIAH-CZ and contains 13 datasets for 10 languages (1 dataset for Catalan, 2 for Czech, 2 for English, 1 for French, 2 for German, 1 for Hungarian, 1 for Lithuanian, 1 for Polish, 1 for Russian, and 1 for Spanish), excluding the test data. The non-public edition is available internally to ÚFAL members and contains additional 4 datasets for 2 languages (1 dataset for Dutch, and 3 for English), which we are not allowed to distribute due to their original license limitations. It also contains the test data portions for all datasets. When using any of the harmonized datasets, please get acquainted with its license (placed in the same directory as the data) and cite the original data resource too.
  References to original resources whose harmonized versions are contained in the public edition of CorefUD 0.1:
  - Catalan-AnCora: Recasens, M. and Martí, M. A. (2010). AnCora-CO: Coreferentially Annotated Corpora for Spanish and Catalan. Language Resources and Evaluation, 44(4):315–345.
  - Czech-PCEDT: Nedoluzhko, A., Novák, M., Cinková, S., Mikulová, M., and Mírovský, J. (2016). Coreference in Prague Czech-English Dependency Treebank. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 169–176, Portorož, Slovenia. European Language Resources Association.
  - Czech-PDT: Hajič, J., Bejček, E., Hlaváčová, J., Mikulová, M., Straka, M., Štěpánek, J., and Štěpánková, B. (2020). Prague Dependency Treebank - Consolidated 1.0. In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), pages 5208–5218, Marseille, France. European Language Resources Association.
  - English-GUM: Zeldes, A. (2017). The GUM Corpus: Creating Multilayer Resources in the Classroom. Language Resources and Evaluation, 51(3):581–612.
  - English-ParCorFull: Lapshinova-Koltunski, E., Hardmeier, C., and Krielke, P. (2018). ParCorFull: a Parallel Corpus Annotated with Full Coreference. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
  - French-Democrat: Landragin, F. (2016). Description, modélisation et détection automatique des chaînes de référence (DEMOCRAT). Bulletin de l’Association Française pour l’Intelligence Artificielle, (92):11–15.
  - German-ParCorFull: Lapshinova-Koltunski, E., Hardmeier, C., and Krielke, P. (2018). ParCorFull: a Parallel Corpus Annotated with Full Coreference. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
  - German-PotsdamCC: Bourgonje, P. and Stede, M. (2020). The Potsdam Commentary Corpus 2.2: Extending annotations for shallow discourse parsing. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 1061–1066, Marseille, France. European Language Resources Association.
  - Hungarian-SzegedKoref: Vincze, V., Hegedűs, K., Sliz-Nagy, A., and Farkas, R. (2018). SzegedKoref: A Hungarian Coreference Corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
  - Lithuanian-LCC: Žitkus, V. and Butkienė, R. (2018). Coreference Annotation Scheme and Corpus for Lithuanian Language. In Fifth International Conference on Social Networks Analysis, Management and Security, SNAMS 2018, Valencia, Spain, October 15-18, 2018, pages 243–250. IEEE.
  - Polish-PCC: Ogrodniczuk, M., Glowińska, K., Kopeć, M., Savary, A., and Zawisławska, M. (2013). Polish coreference corpus. In Human Language Technology. Challenges for Computer Science and Linguistics - 6th Language and Technology Conference, LTC 2013, Poznań, Poland, December 7-9, 2013. Revised Selected Papers, volume 9561 of Lecture Notes in Computer Science, pages 215–226. Springer.
  - Russian-RuCor: Toldova, S., Roytberg, A., Ladygina, A. A., Vasilyeva, M. D., Azerkovich, I. L., Kurzukov, M., Sim, G., Gorshkov, D. V., Ivanova, A., Nedoluzhko, A., and Grishina, Y. (2014). Evaluating Anaphora and Coreference Resolution for Russian. In Komp’juternaja lingvistika i intellektual’nye tehnologii. Po materialam ezhegodnoj Mezhdunarodnoj konferencii Dialog, pages 681–695.
  - Spanish-AnCora: Recasens, M. and Martí, M. A. (2010). AnCora-CO: Coreferentially Annotated Corpora for Spanish and Catalan. Language Resources and Evaluation, 44(4):315–345.
  References to original resources whose harmonized versions are contained in the ÚFAL-internal edition of CorefUD 0.1:
  - Dutch-COREA: Hendrickx, I., Bouma, G., Coppens, F., Daelemans, W., Hoste, V., Kloosterman, G., Mineur, A.-M., Van Der Vloet, J., and Verschelde, J.-L. (2008). A coreference corpus and resolution system for Dutch. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco. European Language Resources Association.
  - English-ARRAU: Uryupina, O., Artstein, R., Bristot, A., Cavicchio, F., Delogu, F., Rodriguez, K. J., and Poesio, M. (2020). Annotating a broad range of anaphoric phenomena, in a variety of genres: the ARRAU Corpus. Natural Language Engineering, 26(1):95–128.
  - English-OntoNotes: Weischedel, R., Hovy, E., Marcus, M., Palmer, M., Belvin, R., Pradhan, S., Ramshaw, L., and Xue, N. (2011). Ontonotes: A large training corpus for enhanced processing. In Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation, pages 54–63, New York. Springer-Verlag.
  - English-PCEDT: Nedoluzhko, A., Novák, M., Cinková, S., Mikulová, M., and Mírovský, J. (2016). Coreference in Prague Czech-English Dependency Treebank. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 169–176, Portorož, Slovenia. European Language Resources Association.
- Rights:
- Licence CorefUD v0.1, https://lindat.mff.cuni.cz/repository/xmlui/page/license-corefud-0.1, and PUB
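The CorefUD description above says that coreference and bridging information is stored as attribute-value pairs in the MISC column of CoNLL-U files. The following is a minimal Python sketch of how such pairs could be read, assuming only the general CoNLL-U conventions (ten tab-separated columns, MISC as the last one, pairs separated by `|`, `_` for an empty field). The function names and the attribute `ExampleAttr` are hypothetical and chosen for illustration; the attribute names actually used by CorefUD are documented with the data and are not reproduced here.

```python
def parse_misc(misc_field):
    """Split a CoNLL-U MISC field into a dict of attribute-value pairs."""
    if misc_field == "_":  # "_" marks an empty field in CoNLL-U
        return {}
    pairs = {}
    for item in misc_field.split("|"):
        key, sep, value = item.partition("=")
        # Attributes without "=" are kept with a value of None.
        pairs[key] = value if sep else None
    return pairs


def misc_of_token_line(line):
    """Return the parsed MISC column of one CoNLL-U token line."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 10:
        raise ValueError("expected 10 tab-separated columns, got %d" % len(columns))
    return parse_misc(columns[9])


# Illustrative token line; "ExampleAttr" is a made-up attribute name.
example = "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\tExampleAttr=value1|SpaceAfter=No"
print(misc_of_token_line(example))
```

Comment lines (starting with `#`) and blank sentence separators in a CoNLL-U file would need to be skipped before calling `misc_of_token_line`.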
19. Coreference in Universal Dependencies 0.2 (CorefUD 0.2)
- Creator:
- Nedoluzhko, Anna, Novák, Michal, Popel, Martin, Žabokrtský, Zdeněk, and Zeman, Daniel
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- dependency, treebank, coreference, bridging relations, and harmonized annotation
- Language:
- Catalan, Czech, Dutch, English, French, German, Hungarian, Lithuanian, Polish, Russian, and Spanish
- Description:
- CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version 0.2 consists of 17 datasets for 11 languages. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference- and bridging-specific information captured by attribute-value pairs located in the MISC column. The collection is divided into a public edition and a non-public (ÚFAL-internal) edition. The publicly available edition is distributed via LINDAT-CLARIAH-CZ and contains 13 datasets for 10 languages (1 dataset for Catalan, 2 for Czech, 2 for English, 1 for French, 2 for German, 1 for Hungarian, 1 for Lithuanian, 1 for Polish, 1 for Russian, and 1 for Spanish), excluding the test data. The non-public edition is available internally to ÚFAL members and contains additional 4 datasets for 2 languages (1 dataset for Dutch, and 3 for English), which we are not allowed to distribute due to their original license limitations. It also contains the test data portions for all datasets. When using any of the harmonized datasets, please get acquainted with its license (placed in the same directory as the data) and cite the original data resource too. Version 0.2 consists of exactly the same datasets as version 0.1. All automatically parsed datasets were re-parsed for v0.2 using UDPipe 2 with models trained on UD 2.6. Catalan-AnCora, Spanish-AnCora and English-GUM have been updated to match their UD 2.9 versions.
- Rights:
- Licence CorefUD v0.2, https://lindat.mff.cuni.cz/repository/xmlui/page/license-corefud-0.2, and PUB
20. Coreference in Universal Dependencies 1.0 (CorefUD 1.0)
- Creator:
- Nedoluzhko, Anna, Novák, Michal, Popel, Martin, Žabokrtský, Zdeněk, Zeldes, Amir, Zeman, Daniel, Bourgonje, Peter, Cinková, Silvie, Hajič, Jan, Hardmeier, Christian, Krielke, Pauline, Landragin, Frédéric, Lapshinova-Koltunski, Ekaterina, Martí, M. Antònia, Mikulová, Marie, Ogrodniczuk, Maciej, Recasens, Marta, Stede, Manfred, Straka, Milan, Toldova, Svetlana, Vincze, Veronika, and Žitkus, Voldemaras
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- dependency, treebank, coreference, bridging relations, and harmonized annotation
- Language:
- Catalan, Czech, Dutch, English, French, German, Hungarian, Lithuanian, Polish, Russian, and Spanish
- Description:
- CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version 1.0 consists of 17 datasets for 11 languages. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference- and bridging-specific information captured by attribute-value pairs located in the MISC column. The collection is divided into a public edition and a non-public (ÚFAL-internal) edition. The publicly available edition is distributed via LINDAT-CLARIAH-CZ and contains 13 datasets for 10 languages (1 dataset for Catalan, 2 for Czech, 2 for English, 1 for French, 2 for German, 1 for Hungarian, 1 for Lithuanian, 1 for Polish, 1 for Russian, and 1 for Spanish), excluding the test data. The non-public edition is available internally to ÚFAL members and contains additional 4 datasets for 2 languages (1 dataset for Dutch, and 3 for English), which we are not allowed to distribute due to their original license limitations. It also contains the test data portions for all datasets. When using any of the harmonized datasets, please get acquainted with its license (placed in the same directory as the data) and cite the original data resource too. Version 1.0 consists of the same corpora and languages as the previous version 0.2; however, the English GUM dataset has been updated to a newer and larger version, and in the Czech/English PCEDT dataset, the train-dev-test split has been changed to be compatible with OntoNotes. Nevertheless, the main change is in the file format (the MISC attributes have new form and interpretation).
- Rights:
- Licence CorefUD v0.2, https://lindat.mff.cuni.cz/repository/xmlui/page/license-corefud-0.2, and PUB
21. Coreference in Universal Dependencies 1.1 (CorefUD 1.1)
- Creator:
- Novák, Michal, Popel, Martin, Žabokrtský, Zdeněk, Zeman, Daniel, Nedoluzhko, Anna, Acar, Kutay, Bourgonje, Peter, Cinková, Silvie, Cebiroğlu Eryiğit, Gülşen, Hajič, Jan, Hardmeier, Christian, Haug, Dag, Jørgensen, Tollef, Kåsen, Andre, Krielke, Pauline, Landragin, Frédéric, Lapshinova-Koltunski, Ekaterina, Mæhlum, Petter, Martí, M. Antònia, Mikulová, Marie, Nøklestad, Anders, Ogrodniczuk, Maciej, Øvrelid, Lilja, Pamay Arslan, Tuğba, Recasens, Marta, Solberg, Per Erik, Stede, Manfred, Straka, Milan, Toldova, Svetlana, Vadász, Noémi, Velldal, Erik, Vincze, Veronika, Zeldes, Amir, and Žitkus, Voldemaras
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- dependency, treebank, coreference, bridging relations, and harmonized annotation
- Language:
- Catalan, Czech, English, French, German, Hungarian, Lithuanian, Norwegian, Polish, Russian, Spanish, and Turkish
- Description:
- CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version 1.1 consists of 21 datasets for 13 languages. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference- and bridging-specific information captured by attribute-value pairs located in the MISC column. The collection is divided into a public edition and a non-public (ÚFAL-internal) edition. The publicly available edition is distributed via LINDAT-CLARIAH-CZ and contains 17 datasets for 12 languages (1 dataset for Catalan, 2 for Czech, 2 for English, 1 for French, 2 for German, 2 for Hungarian, 1 for Lithuanian, 2 for Norwegian, 1 for Polish, 1 for Russian, 1 for Spanish, and 1 for Turkish), excluding the test data. The non-public edition is available internally to ÚFAL members and contains additional 4 datasets for 2 languages (1 dataset for Dutch, and 3 for English), which we are not allowed to distribute due to their original license limitations. It also contains the test data portions for all datasets. When using any of the harmonized datasets, please get acquainted with its license (placed in the same directory as the data) and cite the original data resource too. Compared to the previous version 1.0, the version 1.1 comprises new languages and corpora, namely Hungarian-KorKor, Norwegian-BokmaalNARC, Norwegian-NynorskNARC, and Turkish-ITCC. In addition, the English GUM dataset has been updated to a newer and larger version, and the conversion pipelines for most datasets have been refined (a list of all changes in each dataset can be found in the corresponding README file).
- Rights:
- Licence CorefUD v1.1, https://lindat.mff.cuni.cz/repository/xmlui/page/license-corefud-1.1, and PUB
22. Coreference in Universal Dependencies 1.2 (CorefUD 1.2)
- Creator:
- Popel, Martin, Novák, Michal, Žabokrtský, Zdeněk, Zeman, Daniel, Nedoluzhko, Anna, Acar, Kutay, Bamman, David, Bourgonje, Peter, Cinková, Silvie, Eckhoff, Hanne, Cebiroğlu Eryiğit, Gülşen, Hajič, Jan, Hardmeier, Christian, Haug, Dag, Jørgensen, Tollef, Kåsen, Andre, Krielke, Pauline, Landragin, Frédéric, Lapshinova-Koltunski, Ekaterina, Mæhlum, Petter, Martí, M. Antònia, Mikulová, Marie, Nøklestad, Anders, Ogrodniczuk, Maciej, Øvrelid, Lilja, Pamay Arslan, Tuğba, Recasens, Marta, Solberg, Per Erik, Stede, Manfred, Straka, Milan, Swanson, Daniel, Toldova, Svetlana, Vadász, Noémi, Velldal, Erik, Vincze, Veronika, Zeldes, Amir, and Žitkus, Voldemaras
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- coreference, bridging relations, harmonized annotation, dependency, and treebank
- Language:
- Ancient Greek (to 1453), Ancient Hebrew, Catalan, Czech, English, French, German, Hungarian, Lithuanian, Norwegian, Church Slavic, Polish, Russian, Spanish, and Turkish
- Description:
- CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version 1.2 consists of 25 datasets for 16 languages. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference- and bridging-specific information captured by attribute-value pairs located in the MISC column. The collection is divided into a public edition and a non-public (ÚFAL-internal) edition. The publicly available edition is distributed via LINDAT-CLARIAH-CZ and contains 21 datasets for 15 languages (1 dataset for Ancient Greek, 1 for Ancient Hebrew, 1 for Catalan, 2 for Czech, 3 for English, 1 for French, 2 for German, 2 for Hungarian, 1 for Lithuanian, 2 for Norwegian, 1 for Old Church Slavonic, 1 for Polish, 1 for Russian, 1 for Spanish, and 1 for Turkish), excluding the test data. The non-public edition is available internally to ÚFAL members and contains 4 additional datasets for 2 languages (1 dataset for Dutch and 3 for English), which we are not allowed to distribute due to their original license limitations. It also contains the test data portions for all datasets. When using any of the harmonized datasets, please get acquainted with its license (placed in the same directory as the data) and cite the original data resource too. Compared to the previous version 1.1, version 1.2 comprises new languages and corpora, namely Ancient_Greek-PROIEL, Ancient_Hebrew-PTNK, English-LitBank, and Old_Church_Slavonic-PROIEL. In addition, English-GUM and Turkish-ITCC have been updated to newer versions, conversion of zeros in Polish-PCC has been improved, and the conversion pipelines for multiple other datasets have been refined (a list of all changes in each dataset can be found in the corresponding README file).
- Rights:
- Licence CorefUD v1.2, https://lindat.mff.cuni.cz/repository/xmlui/page/license-corefud-1.2, and PUB
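As a rough illustration of the storage convention described in the CorefUD record above (CoNLL-U files, with coreference information held as attribute-value pairs in the MISC column), the sketch below splits the MISC field of one token line into a dictionary. The sample line, including the `Entity=(e1)` value, is made up for illustration and is an assumption, not taken from the data; the README files of the individual datasets describe the attributes actually used.

```python
def parse_misc(misc):
    """Split a CoNLL-U MISC field into a dict of attribute-value pairs.

    An underscore marks an empty field; pairs are separated by '|'.
    """
    if misc == "_":
        return {}
    pairs = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        # An attribute without '=' is kept with an empty value.
        pairs[key] = value if sep else ""
    return pairs


def misc_of_line(line):
    """Return the parsed MISC column (10th field) of one token line."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 10:
        raise ValueError("a CoNLL-U token line has exactly 10 columns")
    return parse_misc(columns[9])


# Hypothetical token line; the Entity value is an illustrative assumption.
SAMPLE = "1\tMary\tMary\tPROPN\t_\t_\t2\tnsubj\t_\tEntity=(e1)|SpaceAfter=No"

if __name__ == "__main__":
    print(misc_of_line(SAMPLE))
```

Comment lines and blank sentence separators in a real file would need to be skipped before calling `misc_of_line`.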
23. CorPipe 23 multilingual CorefUD 1.1 model (corpipe23-corefud1.1-231206)
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- coreference resolution, CorPipe, and CorefUD
- Language:
- Catalan, Czech, German, English, Spanish, French, Hungarian, Lithuanian, Norwegian Bokmål, Norwegian Nynorsk, Polish, Russian, and Turkish
- Description:
- The `corpipe23-corefud1.1-231206` is a `mT5-large`-based multilingual model for coreference resolution usable in CorPipe 23 (https://github.com/ufal/crac2023-corpipe). It is released under the CC BY-NC-SA 4.0 license. The model is language agnostic (no _corpus id_ on input), so it can be used to predict coreference in any `mT5` language (for zero-shot evaluation, see the paper). Note, however, that empty nodes must already be present in the input; they are not predicted (the same setting as in the CRAC23 shared task).
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
24. DaMuEL 1.0: A Large Multilingual Dataset for Entity Linking
- Creator:
- Kubeša, David and Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- entity linking, NEL, NER, dataset, and knowledge base
- Language:
- Afrikaans, Arabic, Armenian, Basque, Belarusian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Korean, Latin, Latvian, Lithuanian, Maltese, Marathi, Modern Greek (1453-), Northern Sami, Norwegian Nynorsk, Persian, Polish, Portuguese, Romanian, Russian, Scottish Gaelic, Serbian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, Uighur, Ukrainian, Urdu, Vietnamese, and Wolof
- Description:
- We present DaMuEL, a large Multilingual Dataset for Entity Linking containing data in 53 languages. DaMuEL consists of two components: a knowledge base that contains language-agnostic information about entities, including their claims from Wikidata and named entity types (PER, ORG, LOC, EVENT, BRAND, WORK_OF_ART, MANUFACTURED); and Wikipedia texts with entity mentions linked to the knowledge base, along with language-specific text from Wikidata such as labels, aliases, and descriptions, stored separately for each language. The Wikidata QID is used as a persistent, language-agnostic identifier, enabling the combination of the knowledge base with language-specific texts and information for each entity. Wikipedia documents deliberately annotate only a single mention for every entity present; we further automatically detect all mentions of named entities linked from each document. The dataset contains 27.9M named entities in the knowledge base and 12.3G tokens from Wikipedia texts. The dataset is published under the CC BY-SA licence.
- Rights:
- Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
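The DaMuEL record above says that the Wikidata QID serves as a language-agnostic key for combining the knowledge base with language-specific texts. A minimal sketch of such a join follows; the record layout (field names `type`, `label`, `aliases`) and the sample values are hypothetical and do not reflect the dataset's actual file format.

```python
def combine_by_qid(knowledge_base, language_texts):
    """Merge language-agnostic and language-specific records by QID.

    Only QIDs present in both inputs are returned.
    """
    shared = knowledge_base.keys() & language_texts.keys()
    return {qid: {**knowledge_base[qid], **language_texts[qid]}
            for qid in shared}


# Hypothetical records keyed by QID; field names are assumptions.
KB = {
    "Q1085": {"type": "LOC"},
    "Q42": {"type": "PER"},
}
TEXTS_CS = {
    "Q1085": {"label": "Praha", "aliases": ["Hlavní město Praha"]},
}

if __name__ == "__main__":
    print(combine_by_qid(KB, TEXTS_CS))
```

Because the key does not depend on the language, the same knowledge-base record can be merged with the text records of any other language in the same way.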
25. Deep Universal Dependencies 2.4
- Creator:
- Zeman, Daniel and Droganova, Kira
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- semantic dependency and universal dependencies
- Language:
- Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, and Galician
- Description:
- Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-2988). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
- Rights:
- Licence Universal Dependencies v2.4, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.4, and PUB
26. Deep Universal Dependencies 2.5
- Creator:
- Zeman, Daniel and Droganova, Kira
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- semantic dependency and universal dependencies
- Language:
- Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, and Skolt Sami
- Description:
- Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3105). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
- Rights:
- Licence Universal Dependencies v2.5, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.5, and PUB
27. Deep Universal Dependencies 2.6
- Creator:
- Zeman, Daniel and Droganova, Kira
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- semantic dependency and universal dependencies
- Language:
- Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, Skolt Sami, Icelandic, Albanian, and Persian
- Description:
- Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3226). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
- Rights:
- Licence Universal Dependencies v2.6, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.6, and PUB
28. Deep Universal Dependencies 2.7
- Creator:
- Zeman, Daniel and Droganova, Kira
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- semantic dependency and universal dependencies
- Language:
- Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, Skolt Sami, Icelandic, Albanian, Persian, Akuntsu, Apurinã, Khunsari, Manx, Mundurukú, Nayini, Soi, South Levantine Arabic, and Tupinambá
- Description:
- Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3424). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
- Rights:
- Licence Universal Dependencies v2.7, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.7, and PUB
29. Deep Universal Dependencies 2.8
- Creator:
- Zeman, Daniel and Droganova, Kira
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- semantic dependency and universal dependencies
- Language:
- Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, Skolt Sami, Icelandic, Albanian, Persian, Akuntsu, Apurinã, Khunsari, Manx, Mundurukú, Nayini, Soi, South Levantine Arabic, Tupinambá, Beja, Western Frisian, Urubú-Kaapor, Kangri, K'iche', Low German, Makuráp, Western Armenian, and Central Siberian Yupik
- Description:
- Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3687). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
- Rights:
- Licence Universal Dependencies v2.8, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.8, and PUB
30. Deltacorpus
- Creator:
- Mareček, David, Yu, Zhiwei, Zeman, Daniel, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- part of speech, tagging, semi-supervised, and cross-language
- Language:
- Belarusian, Bosnian, Bulgarian, Czech, Serbo-Croatian, Croatian, Upper Sorbian, Macedonian, Polish, Russian, Slovak, Slovenian, Serbian, Ukrainian, Latvian, Lithuanian, Afrikaans, Danish, German, English, Faroese, Western Frisian, Swiss German, Icelandic, Limburgan, Luxembourgish, Low German, Dutch, Norwegian Nynorsk, Norwegian, Scots, Swedish, Yiddish, Aragonese, Asturian, Catalan, French, Galician, Haitian, Italian, Latin, Lombard, Neapolitan, Piemontese, Portuguese, Romanian, Spanish, Venetian, Walloon, Breton, Welsh, Scottish Gaelic, Irish, Modern Greek (1453-), Armenian, Albanian, Dimli (individual language), Persian, Gilaki, Kurdish, Tajik, Bengali, Bishnupriya, Gujarati, Fiji Hindi, Hindi, Marathi, Nepali (macrolanguage), Urdu, Amharic, Arabic, Egyptian Arabic, Hebrew, Estonian, Finnish, Hungarian, Basque, Georgian, Chuvash, Azerbaijani, Turkish, Uzbek, Kazakh, Tatar, Yakut, Korean, Mongolian, Telugu, Kannada, Malayalam, Tamil, Newari, Vietnamese, Indonesian, Javanese, Malagasy, Maori, Malay (macrolanguage), Pampanga, Sundanese, Tagalog, Waray (Philippines), Swahili (macrolanguage), Esperanto, Ido, Interlingua (International Auxiliary Language Association), and Volapük
- Description:
- Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
- Rights:
- Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
31. Deltacorpus 1.1
- Creator:
- Mareček, David, Yu, Zhiwei, Zeman, Daniel, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- part of speech, tagging, semi-supervised, and cross-language
- Language:
- Belarusian, Bosnian, Bulgarian, Czech, Serbo-Croatian, Croatian, Upper Sorbian, Macedonian, Polish, Russian, Slovak, Slovenian, Serbian, Ukrainian, Latvian, Lithuanian, Afrikaans, Danish, German, English, Faroese, Western Frisian, Swiss German, Icelandic, Limburgan, Luxembourgish, Low German, Dutch, Norwegian Nynorsk, Norwegian, Scots, Swedish, Yiddish, Aragonese, Asturian, Catalan, French, Galician, Haitian, Italian, Latin, Lombard, Neapolitan, Piemontese, Portuguese, Romanian, Spanish, Venetian, Walloon, Breton, Welsh, Scottish Gaelic, Irish, Modern Greek (1453-), Armenian, Albanian, Dimli (individual language), Persian, Gilaki, Kurdish, Tajik, Bengali, Bishnupriya, Gujarati, Fiji Hindi, Hindi, Marathi, Nepali (macrolanguage), Urdu, Amharic, Arabic, Egyptian Arabic, Hebrew, Estonian, Finnish, Hungarian, Basque, Georgian, Chuvash, Azerbaijani, Turkish, Uzbek, Kazakh, Tatar, Yakut, Korean, Mongolian, Telugu, Kannada, Malayalam, Tamil, Newari, Vietnamese, Indonesian, Javanese, Malagasy, Maori, Malay (macrolanguage), Pampanga, Sundanese, Tagalog, Waray (Philippines), Swahili (macrolanguage), Esperanto, Ido, Interlingua (International Auxiliary Language Association), and Volapük
- Description:
- Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia). Changes in version 1.1: 1. Universal Dependencies tagset instead of the older and smaller Google Universal POS tagset. 2. SVM classifier trained on Universal Dependencies 1.2 instead of HamleDT 2.0. 3. Balto-Slavic languages, Germanic languages and Romance languages were tagged by a classifier trained only on the respective group of languages. Other languages were tagged by a classifier trained on all available languages. The "c7" combination from version 1.0 is no longer used.
- Rights:
- Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
32. Evropa a evropské dědictví do konce 19. století
- Creator:
- Jan Patočka
- Publisher:
- Pp. 132–159. Essay. [Dedicated to F. Fajfr on his 80th birthday in 1972 and to B. Komárková on her 70th birthday in 1973.]
- Type:
- Text
- Subject:
- 1975, 1979/25, 1981/6, 1981/7, 1988/28, 1988/31, 1988/32, 1988/33, 1988/34, 1994/7, 1996/4, 1996/7, 1998/3, 1999/8, 2001/9, 2002/21, 2006/1, 2007/1, 2008/3, be, bg, cs, de, en, es, fr, fulltext, hu, I/1979, it, lt, no, pl, ru, SS-3/PD-III, sv, and uk
- Language:
- Czech, English, Bulgarian, French, Italian, Lithuanian, Hungarian, German, Norwegian, Polish, Russian, Belarusian, Spanish, Swedish, and Ukrainian
- Rights:
- open access and Rights holder: Archiv Jana Patočky, z.s.
33. Grammaticvs :
- Type:
- text and sborníky jubilejní
- Subject:
- Lingvistika. Jazyky, Bibliografie. Katalogy, Erhart, Adolf,, slavisté, filologové, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, English, German, Polish, Russian, and Slovak
- Rights:
- unknown
34. HamleDT 3.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University
- Type:
- text and corpus
- Subject:
- annotated corpus, morphology, syntax, dependency, treebank, harmonized annotation, and common annotation style
- Language:
- Arabic, Basque, Bengali, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Ancient Greek (to 1453), Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT (HArmonized Multi-LanguagE Dependency Treebank) is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. This version uses Universal Dependencies as the common annotation style. Update (November 2017): for a current collection of harmonized dependency treebanks, we recommend using the Universal Dependencies (UD). All of the corpora that are distributed in HamleDT in full are also part of the UD project; only some corpora from the Patch group (where HamleDT provides only the harmonizing scripts but not the full corpus data) are available in HamleDT but not in UD.
- Rights:
- HamleDT 3.0 License Terms, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-3.0, and PUB
35. Henryk Sienkiewicz :
- Publisher:
- Ostravská univerzita, Filozofická fakulta,
- Type:
- sborníky konferenční
- Subject:
- Polská literatura (o ní), Sienkiewicz, Henryk,, spisovatelé polští, literatura polská, Polsko, světové dějiny 1789-1918, and literatura, spisovatelé
- Language:
- Polish, Czech, Russian, and Ukrainian
- Description:
- Partly Czech, Ukrainian, and Russian text.
- Rights:
- unknown
36. IWPT 2020 Shared Task Data and System Outputs
- Creator:
- Zeman, Daniel, Bouma, Gosse, and Seddah, Djamé
- Publisher:
- Universal Dependencies Consortium
- Type:
- text and corpus
- Subject:
- treebank, dependency, syntax, enhanced universal dependencies, shared task, and parsing
- Language:
- Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, French, Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish, Tamil, and Ukrainian
- Description:
- This package contains data used in the IWPT 2020 shared task. It contains training, development and test (evaluation) datasets. The data is based on a subset of Universal Dependencies release 2.5 (http://hdl.handle.net/11234/1-3105) but some treebanks contain additional enhanced annotations. Moreover, not all of these additions became part of Universal Dependencies release 2.6 (http://hdl.handle.net/11234/1-3226), which makes the shared task data unique and worth a separate release to enable later comparison with new parsing algorithms. The package also contains a number of Perl and Python scripts that have been used to process the data during preparation and during the shared task. Finally, the package includes the official primary submission of each team participating in the shared task.
- Rights:
- Licence Universal Dependencies v2.5, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.5, and PUB
37. IWPT 2021 Shared Task Data and System Outputs
- Creator:
- Zeman, Daniel, Bouma, Gosse, and Seddah, Djamé
- Publisher:
- Universal Dependencies Consortium
- Type:
- text and corpus
- Subject:
- treebank, dependency, syntax, enhanced universal dependencies, shared task, and parsing
- Language:
- Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, French, Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish, Tamil, and Ukrainian
- Description:
- This package contains data used in the IWPT 2021 shared task. It contains training, development and test (evaluation) datasets. The data is based on a subset of Universal Dependencies release 2.7 (http://hdl.handle.net/11234/1-3424) but some treebanks contain additional enhanced annotations. Moreover, not all of these additions became part of Universal Dependencies release 2.8 (http://hdl.handle.net/11234/1-3687), which makes the shared task data unique and worth a separate release to enable later comparison with new parsing algorithms. The package also contains a number of Perl and Python scripts that have been used to process the data during preparation and during the shared task. Finally, the package includes the official primary submission of each team participating in the shared task.
- Rights:
- Licence Universal Dependencies v2.7, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.7, and PUB
38. Jan Hus 1415 a 600 let poté :
- Type:
- text and sborníky konferenční
- Subject:
- Dějiny křesťanské církve, Hus, Jan,, dějiny církevní, reformátoři, teologie, české (československé) sborníky a kolektivní monografie, české země 1306-1419, přehledná zpracování dějin českých zemí (chronologicky), and církevní a náboženské dějiny
- Language:
- Czech, English, Polish, Russian, and Slovak
- Description:
- 1000 copies
- Rights:
- unknown
39. Jana Amosa Komeńskiego przebudowa świata (prolegomena) =
- Creator:
- Šimonik, Vaclav
- Type:
- text and studie
- Subject:
- Věda. Všeobecnosti. Základy vědy a kultury. Vědecká práce, Komenský, Jan Amos,, komeniologie, myšlení filozofické, myšlení pedagogické, české země 1526-1792, and dějiny vědy, umění, kultury a techniky, kulturní vztahy
- Language:
- Polish and Russian
- Description:
- Russian summary
- Rights:
- unknown
40. Jazykové právo a slovanské jazyky /
- Publisher:
- Filozofická fakulta Univerzity Karlovy,
- Type:
- monografie kolektivní
- Subject:
- Ústavní právo. Správní právo, právo jazykové, jazyky slovanské, země slovanské, české (československé) sborníky a kolektivní monografie, světové dějiny od r. 1945 do současnosti, ústavní a právní dějiny, and jazyk, písmo
- Language:
- Czech, Belarusian, Bulgarian, Croatian, Macedonian, Polish, Russian, Slovak, Slovenian, Serbian, and Ukrainian
- Rights:
- unknown
41. Je technická civilizace úpadková a proč?
- Creator:
- Jan Patočka
- Publisher:
- Pp. 160–203. Essay.
- Type:
- Text
- Subject:
- 1975, 1979/25, 1980/27, 1981/6, 1981/7, 1988/28, 1988/31, 1988/32, 1988/34, 1994/7, 1996/4, 1996/7, 1997/7, 1998/3, 1999/8, 2001/9, 2002/21, 2006/1, 2007/1, 2008/3, be, bg, cs, de, en, es, fr, fulltext, hu, it, lt, no, pl, ru, sl, sr, SS-3/PD-III, sv, and uk
- Language:
- Czech, English, Bulgarian, French, Italian, Lithuanian, Hungarian, German, Norwegian, Polish, Russian, Belarusian, Slovenian, Serbian, Spanish, Swedish, and Ukrainian
- Rights:
- open access and Rights holder: Archiv Jana Patočky, z.s.
42. Josef Dobrovský 1753-1829 :
- Type:
- text and sborníky
- Subject:
- Dějiny Česka a Slovenska, Dobrovský, Josef,, sborníky, výročí, obrozenci, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, German, Polish, Russian, and Serbian
- Description:
- Published by the Slavonic Seminar of Charles University in Prague
- Rights:
- unknown
43. Kapitoly z dejín Podkarpatskej Rusi 1919-1945 /
- Type:
- text and monografie kolektivní
- Subject:
- Dějiny zemí východní Evropy, dějiny zemí, Československo 1918-1945, and politické dějiny, politici
- Language:
- Slovak, Russian, Czech, and Polish
- Description:
- Dedicated to the 80th birthday of Dr.h.c. prof. PhDr. Michal Danilák, CSc.
- Rights:
- unknown
44. Komparatistika, genologie, translatologie. Krystyna Kardyni-Pelikánová.
- Publisher:
- Masarykova univerzita,
- Subject:
- Kardyni-Pelikánová, Krystyna,, sborníky jubilejní, slavisté, slavistika, bohemistika, polonistika, literatura, přehledná zpracování světových dějin (chronologicky), Polsko, české (československé) sborníky a kolektivní monografie, přehledná zpracování dějin českých zemí (chronologicky), and dějiny slavistiky
- Language:
- Polish, Czech, and Russian
- Rights:
- unknown
45. Korespondence T. G. Masaryk - Slované.
- Creator:
- Masaryk, Tomáš Garrigue,
- Type:
- text, korespondence, and edice
- Subject:
- Politika, Biografie, Masaryk, Tomáš Garrigue,, politici čeští, prezidenti českoslovenští, vztahy mezinárodní, Slované, Rusové, and Ukrajinci
- Language:
- Czech, English, French, German, Polish, Russian, and Ukrainian
- Description:
- Title in imprint: Korespondence TGM :, At head of title: TGM - MZ, Contains supplements to volume 1, a list of correspondence and annotated documents, and a name index, At head of title: TGM - AVA, and T. G. Masaryk and The Slavs
- Rights:
- unknown
46. Korespondence T. G. Masaryk - Slované.
- Creator:
- Masaryk, Tomáš Garrigue,
- Type:
- text, korespondence, and edice
- Subject:
- Politika, Biografie, Masaryk, Tomáš Garrigue,, politici čeští, prezidenti českoslovenští, vztahy mezinárodní, Slované, Poláci, Rusové, and Ukrajinci
- Language:
- Czech, English, French, German, Polish, Russian, and Ukrainian
- Description:
- Title in imprint: Korespondence TGM :, At head of title: TGM - MZ, and T. G. Masaryk and The Slavs
- Rights:
- unknown
47. Křižovatky Slovanů /
- Type:
- text and monografie kolektivní
- Subject:
- Filologie, slavistika, Slované, české (československé) sborníky a kolektivní monografie, české země od r. 1993 do současnosti, světové dějiny od r. 1945 do současnosti, and dějiny slavistiky
- Language:
- Czech, English, Croatian, Polish, Russian, Slovak, and Ukrainian
- Description:
- Conference, Prague, 6-7 November 2014
- Rights:
- unknown
48. Mají dějiny smysl?
- Creator:
- Jan Patočka
- Publisher:
- Pp. 89–131. Essay. [The essay also includes the text To platí též..., see 1988/25H.]
- Type:
- Text
- Subject:
- 1975, 1979/25, 1981/6, 1981/7, 1988/25H, 1988/28, 1988/31, 1988/32, 1988/34, 1994/7, 1996/4, 1996/7, 1998/3, 1999/8, 2, 2001/9, 2002/21, 2002/7, 2006/1, 2007/1, 2008/3, bg, cs, de, en, es, fr, fulltext, hu, it, jp, lt, no, pl, ru, SS-3/PD-III, sv, uk, and v
- Language:
- Czech, English, Bulgarian, French, Italian, Lithuanian, Hungarian, German, Norwegian, Polish, Russian, Spanish, Swedish, and Ukrainian
- Rights:
- open access and Rights holder: Archiv Jana Patočky, z.s.
49. Na křižovatce umění :
- Type:
- text and sborníky jubilejní
- Subject:
- Literatura různých forem a žánrů (o ní), Závodský, Artur,, věda literární, literatura, film a filmy, divadlo, média, slavistika, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, French, German, Polish, Russian, and Slovak
- Rights:
- unknown
50. Napoleonské války a historická paměť :
- Type:
- text and sborníky konferenční
- Subject:
- Historická věda. Pomocné vědy historické. Archivnictví, sborníky konferenční, války napoleonské (1803-1815), vědomí historické, vojenské operace, války, bitvy, české země 1792-1847, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, German, Polish, and Russian
- Rights:
- unknown
51. Napoleonské války a historická paměť :
- Type:
- text and sborníky konferenční
- Subject:
- Historická věda. Pomocné vědy historické. Archivnictví, války napoleonské (1803-1815), vědomí historické, vojenské operace, války, bitvy, české (československé) sborníky a kolektivní monografie, and české země 1792-1847
- Language:
- Czech, German, Polish, and Russian
- Rights:
- unknown
52. Národy - města - lidé - traumata /
- Creator:
- Soukupová, Blanka,
- Type:
- text and monografie kolektivní
- Subject:
- Globální společnosti. Sociální struktura. Sociální skupiny, vědomí historické, obyvatelstvo městské, paměť historická, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, English, Polish, Russian, and Slovak
- Description:
- Date on cover: 2018 and Nations - Cities - People - Trauma.
- Rights:
- unknown
53. Národy - města - lidé - traumata /
- Creator:
- Soukupová, Blanka,
- Type:
- text and monografie kolektivní
- Subject:
- Globální společnosti. Sociální struktura. Sociální skupiny, vědomí historické, obyvatelstvo městské, paměť historická, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, English, Polish, Russian, and Slovak
- Description:
- Date on cover: 2018 and Nations - Cities - People - Trauma.
- Rights:
- unknown
54. OmegaWiki
- Publisher:
- Universität Bamberg, World Language Documentation Centre
- Format:
- application/octet-stream
- Type:
- lexicalConceptualResource
- Language:
- Afrikaans, Arabic, Basque, Bulgarian, Catalan, Chinese, Czech, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, Galician, Georgian, Modern Greek (1453-), Hebrew, Hungarian, Icelandic, Indonesian, Interlingua (International Auxiliary Language Association), Irish, Italian, Japanese, Khmer, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swedish, Turkish, Ukrainian, and Welsh
- Rights:
- GFDL or CC and http://www.omegawiki.org/Licensing
55. Paměť - národ - menšiny - marginalizace - identity.
- Type:
- text and monografie kolektivní
- Subject:
- Sociální interakce. Sociální komunikace, paměť sociální, menšiny národnostní, identity, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, English, Polish, Russian, and Slovak
- Rights:
- unknown
56. Paměť - národ - menšiny - marginalizace - identity.
- Type:
- text and monografie kolektivní
- Subject:
- Sociální interakce. Sociální komunikace, paměť sociální, menšiny národnostní, identity, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, English, Polish, Russian, and Slovak
- Rights:
- unknown
57. ParaCrawl Corpus version 1.0
- Creator:
- Koehn, Philipp, Heafield, Kenneth, Forcada, Mikel L., Esplà-Gomis, Miquel, Ortiz-Rojas, Sergio, Sánchez, Gema Ramírez, Cartagena, Víctor M. Sánchez, Haddow, Barry, Bañón, Marta, Střelec, Marek, Samiotou, Anna, and Kamran, Amir
- Publisher:
- ParaCrawl
- Type:
- text and corpus
- Subject:
- ParaCrawl, parallel corpus, CommonCrawl, machine translation, and text corpora
- Language:
- English, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Czech, Romanian, Finnish, Latvian, Russian, and Estonian
- Description:
- The January 2018 release of the ParaCrawl is the first version of the corpus. It contains parallel corpora for 11 languages paired with English, crawled from a large number of web sites. The selection of websites is based on CommonCrawl, but ParaCrawl is extracted from a brand new crawl which has much higher coverage of these selected websites than CommonCrawl. Since the data is fairly raw, it is released with two quality metrics that can be used for corpus filtering. An official "clean" version of each corpus uses one of the metrics. For more details and raw data download please visit: http://paracrawl.eu/releases.html
- Rights:
- Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB
58. PAWS
- Creator:
- Nedoluzhko, Anna, Novák, Michal, and Ogrodniczuk, Maciej
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- multilingual, parallel corpus, coreference, and tectogrammatics
- Language:
- English, Czech, Russian, and Polish
- Description:
- PAWS is a multi-lingual parallel treebank with coreference annotation. It consists of English texts from the Wall Street Journal translated into Czech, Russian and Polish. In addition, the texts are syntactically parsed and word-aligned. PAWS is based on PCEDT 2.0 and continues the tradition of multilingual treebanks with coreference annotation. PAWS offers linguistic material that can be further leveraged in cross-lingual studies, especially on coreference.
- Rights:
- PAWS License, https://lindat.mff.cuni.cz/repository/xmlui/page/license-PAWS, and RES
59. Plaintext Wikipedia dump 2018
- Creator:
- Rosa, Rudolf
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- Wikipedia, text corpora, and monolingual corpus
- Language:
- Abkhazian, Achinese, Adyghe, Afrikaans, Akan, Tosk Albanian, Amharic, Old English (ca. 450-1100), Arabic, Official Aramaic (700-300 BCE), Aragonese, Egyptian Arabic, Assamese, Asturian, Atikamekw, Avaric, Aymara, South Azerbaijani, Azerbaijani, Bashkir, Bambara, Bavarian, Central Bikol, Belarusian, Bengali, Bislama, Banjar, Tibetan, Bosnian, Bishnupriya, Breton, Buginese, Bulgarian, Russia Buriat, Catalan, Min Dong Chinese, Cebuano, Czech, Chamorro, Chechen, Cherokee, Church Slavic, Chuvash, Cheyenne, Central Kurdish, Cornish, Corsican, Cree, Crimean Tatar, Kashubian, Welsh, Danish, German, Dinka, Dimli (individual language), Dhivehi, Lower Sorbian, Dzongkha, Modern Greek (1453-), English, Esperanto, Estonian, Basque, Ewe, Extremaduran, Faroese, Persian, Fijian, Finnish, French, Arpitan, Northern Frisian, Western Frisian, Fulah, Friulian, Gagauz, Gan Chinese, Scottish Gaelic, Irish, Galician, Gilaki, Manx, Goan Konkani, Gothic, Guarani, Gujarati, Hakka Chinese, Haitian, Hausa, Hawaiian, Serbo-Croatian, Hebrew, Herero, Fiji Hindi, Hindi, Hiri Motu, Croatian, Upper Sorbian, Hungarian, Armenian, Igbo, Ido, Inuktitut, Interlingue, Iloko, Interlingua (International Auxiliary Language Association), Indonesian, Inupiaq, Icelandic, Italian, Jamaican Creole English, Javanese, Lojban, Japanese, Kara-Kalpak, Kabyle, Kalaallisut, Kannada, Kashmiri, Georgian, Kanuri, Kazakh, Kabardian, Kabiyè, Khmer, Kikuyu, Kinyarwanda, Kirghiz, Komi-Permyak, Komi, Kongo, Korean, Karachay-Balkar, Kölsch, Kurdish, Ladino, Lao, Latin, Latvian, Lak, Lezghian, Ligurian, Limburgan, Lingala, Lithuanian, Lombard, Northern Luri, Latgalian, Luxembourgish, Ganda, Literary Chinese, Marshallese, Maithili, Malayalam, Marathi, Moksha, Eastern Mari, Minangkabau, Macedonian, Malagasy, Maltese, Mongolian, Maori, Western Mari, Malay (macrolanguage), Creek, Mirandese, Burmese, Erzya, Mazanderani, Min Nan Chinese, Neapolitan, Nauru, Navajo, Ndonga, Low German, Nepali (macrolanguage), Newari, Dutch, Norwegian Nynorsk, Norwegian, Novial, Pedi, Nyanja, Occitan (post 1500), Livvi, Oriya (macrolanguage), Oromo, Ossetian, Pangasinan, Pampanga, Panjabi, Papiamento, Picard, Pennsylvania German, Pfaelzisch, Pitcairn-Norfolk, Pali, Piemontese, Western Panjabi, Pontic, Polish, Portuguese, Pushto, Quechua, Vlax Romani, Romansh, Romanian, Rusyn, Rundi, Macedo-Romanian, Russian, Sango, Yakut, Sanskrit, Sicilian, Scots, Samogitian, Sinhala, Slovak, Slovenian, Northern Sami, Samoan, Shona, Sindhi, Somali, Southern Sotho, Spanish, Albanian, Sardinian, Sranan Tongo, Serbian, Swati, Saterfriesisch, Sundanese, Swahili (macrolanguage), Swedish, Silesian, Tahitian, Tamil, Tatar, Tulu, Telugu, Tama (Colombia), Tetum, Tajik, Tagalog, Thai, Tigrinya, Tonga (Tonga Islands), Tok Pisin, Tswana, Tsonga, Turkmen, Tumbuka, Turkish, Twi, Tuvinian, Udmurt, Uighur, Ukrainian, Urdu, Uzbek, Venetian, Venda, Veps, Vietnamese, Vlaams, Volapük, Võro, Waray (Philippines), Walloon, Wolof, Wu Chinese, Kalmyk, Xhosa, Mingrelian, Yiddish, Yoruba, Yue Chinese, Zeeuws, Zhuang, Chinese, Zulu, and Dotyali
- Description:
- Wikipedia plain text data obtained from Wikipedia dumps with WikiExtractor in February 2018. The data come from all Wikipedias for which dumps could be downloaded at [https://dumps.wikimedia.org/]. This amounts to 297 Wikipedias, usually corresponding to individual languages and identified by their ISO codes. Several special Wikipedias are included, most notably "simple" (Simple English Wikipedia) and "incubator" (tiny hatching Wikipedias in various languages). For a list of all the Wikipedias, see [https://meta.wikimedia.org/wiki/List_of_Wikipedias]. The script which can be used to get new version of the data is included, but note that Wikipedia limits the download speed for downloading a lot of the dumps, so it takes a few days to download all of them (but one or a few can be downloaded fast). Also, the format of the dumps changes time to time, so the script will probably eventually stop working one day. The WikiExtractor tool [http://medialab.di.unipi.it/wiki/Wikipedia_Extractor] used to extract text from the Wikipedia dumps is not mine, I only modified it slightly to produce plaintext outputs [https://github.com/ptakopysk/wikiextractor].
- Rights:
- Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0), http://creativecommons.org/licenses/by-sa/3.0/, and PUB
60. Počátek dějin
- Creator:
- Jan Patočka
- Publisher:
- Pp. 46–88. Essay.
- Type:
- Text
- Subject:
- 1975, 1979/25, 1981/6, 1981/7, 1988/28, 1988/31, 1988/32, 1988/34, 1994/7, 1996/4, 1996/7, 1998/3, 1999/8, 2, 2001/9, 2002/21, 2002/6, 2006/1, 2007/1, 2008/3, bg, cs, de, en, es, fr, fulltext, hu, it, lt, no, pl, ru, SS-3/PD-III, sv, uk, and v
- Language:
- Czech, English, Bulgarian, French, Italian, Lithuanian, Hungarian, German, Norwegian, Polish, Russian, Spanish, Swedish, and Ukrainian
- Rights:
- open access and Rights holder: Archiv Jana Patočky, z.s.
61. Pre-historické úvahy
- Creator:
- Jan Patočka
- Publisher:
- Pp. 1–45. Essay.
- Type:
- Text
- Subject:
- 1975, 1979/25, 1981/6, 1981/7, 1988/28, 1988/31, 1988/32, 1988/34, 1994/7, 1996/4, 1996/7, 1998/3, 1999/8, 2001/9, 2002/1, 2002/21, 2002/5, 2006/1, 2007/1, 2008/3, bg, cs, de, en, es, fr, fulltext, hu, it, lt, no, pl, ru, SS-3/PD-III, sv, and uk
- Language:
- Czech, English, Bulgarian, French, Italian, Lithuanian, Hungarian, German, Norwegian, Polish, Russian, Spanish, Swedish, and Ukrainian
- Rights:
- open access and Rights holder: Archiv Jana Patočky, z.s.
62. Prolínání slovanských prostředí /
- Type:
- text and sborníky konferenční
- Subject:
- Filologie, slavistika, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, Belarusian, Croatian, Macedonian, Polish, Russian, Slovak, and Ukrainian
- Description:
- Collection of studies based on papers from the 7th annual Konference mladých slavistů (Conference of Young Slavists), held in 2011 at FF UK (Faculty of Arts, Charles University) in Prague and Conference year given on the spine of the publication.
- Rights:
- unknown
63. Prolínání slovanských prostředí /
- Type:
- text and sborníky konferenční
- Subject:
- Filologie, slavistika, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, Belarusian, Croatian, Macedonian, Polish, Russian, Slovak, and Ukrainian
- Description:
- Collection of studies based on papers from the 7th annual Konference mladých slavistů (Conference of Young Slavists), held in 2011 at FF UK (Faculty of Arts, Charles University) in Prague and Conference year given on the spine of the publication.
- Rights:
- unknown
64. Publikace o Československu v cizích jazycích =
- Type:
- text and bibliografie
- Subject:
- Bibliografie. Katalogy, bohemika cizojazyčná, dějiny československé, and všeobecné bibliografie
- Language:
- Czech, Bosnian, English, French, German, Italian, Polish, Russian, and Romanian
- Rights:
- unknown
65. Rukopisy královédvorský a zelenohorský v kultuře a umění /
- Type:
- text and monografie kolektivní
- Subject:
- Česká literatura (o ní), Rukopis královédvorský, Rukopis zelenohorský, hnutí národní, české, identita národní, dějiny umění, dějiny kultury, falza, rukopisy, české země 1792-1918, Československo 1918-1992, dějiny literatury, jazyka a knihy, and národnosti, vztahy mezi národnostmi a národní hnutí
- Language:
- Czech, English, Finnish, German, Italian, Polish, Russian, Swedish, and Ukrainian
- Description:
- The Manuscripts of Dvůr Králové and Zelená Hora.
- Rights:
- unknown
66. Rukopisy královédvorský a zelenohorský v kultuře a umění /
- Type:
- text and monografie kolektivní
- Subject:
- Česká literatura (o ní), Rukopis královédvorský, Rukopis zelenohorský, hnutí národní, české, identita národní, dějiny umění, dějiny kultury, falza, rukopisy, české země 1792-1918, Československo 1918-1992, dějiny literatury, jazyka a knihy, and národnosti, vztahy mezi národnostmi a národní hnutí
- Language:
- Czech, English, Finnish, German, Italian, Polish, Russian, Swedish, and Ukrainian
- Description:
- The Manuscripts of Dvůr Králové and Zelená Hora.
- Rights:
- unknown
67. Rusko a slovanský svět :
- Type:
- text and sborníky konferenční
- Subject:
- Dějiny zemí východní Evropy, Mezinárodní vztahy, světová politika, vztahy mezinárodní, vztahy kulturní, Slované, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, Slovak, Russian, and Polish
- Description:
- Spine marking: 2020 and The studies examine relations between Russians and other Slavic nations in the nineteenth and twentieth centuries from the perspective of history, culture, language, and literature.
- Rights:
- unknown
68. Rusko a slovanský svět :
- Type:
- text and sborníky konferenční
- Subject:
- Dějiny zemí východní Evropy, Mezinárodní vztahy, světová politika, vztahy mezinárodní, vztahy kulturní, Slované, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, Slovak, Russian, and Polish
- Description:
- Spine marking: 2020 and The studies examine relations between Russians and other Slavic nations in the nineteenth and twentieth centuries from the perspective of history, culture, language, and literature.
- Rights:
- unknown
69. Russkija posol'stva v Čechii v XVI veke :
- Creator:
- Francev, Vladimir Andrejevič,
- Type:
- text and studie
- Subject:
- Filologie, vztahy česko-ruské, české země 1526-1792, zahraniční politika, mezinárodní vztahy, Rusko, and světové dějiny novověku (1492-1918)
- Language:
- Undetermined, Czech, Polish, and Russian
- Rights:
- unknown
70. Scholars of Bohemian, Czech and Czechoslovak history studies
- Creator:
- Pánek, Jaroslav,
- Type:
- text and slovníky biografické
- Subject:
- Historická věda. Pomocné vědy historické. Archivnictví, historici, bohemisté, bibliografie personální, biografické slovníky, and personální bibliografie
- Language:
- English, Polish, and Russian
- Description:
- Text partly in English, Czech, French, German, Polish, Russian, and Slovak
- Rights:
- unknown
71. Sebrané spisy Václava Machka /
- Creator:
- Machek, Václav,
- Type:
- text, spisy, and studie
- Subject:
- Filologie, Machek, Václav,, etymologie, etymologové, lingvisté, spisy sebrané, Československo 1918-1992, and jazyk, písmo
- Language:
- Czech, Bulgarian, English, French, German, Latin, Polish, Russian, and Slovak
- Description:
- Published in 2 volumes for technical reasons and Also contains: a list of lectures and seminars given by Václav Machek at the Faculty of Arts in Brno
- Rights:
- unknown
72. Septuaginta Paolo Spunar oblata :
- Publisher:
- KLP,
- Type:
- sborníky jubilejní
- Subject:
- Dějiny civilizace. Kulturní dějiny, Spunar, Pavel,, jubilea životní, historici čeští, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, German, French, English, Russian, and Polish
- Description:
- Published by KLP-Koniasch Latin Press in cooperation with the Ústav pro klasická studia AV ČR (Institute for Classical Studies, Czech Academy of Sciences), In addition to the regular print run, 50 hand-bound copies were issued, Czech, French, English, Russian, and Polish text, and English, French, and German summaries
- Rights:
- unknown
73. Septuaginta Paolo Spunar oblata :
- Type:
- text and sborníky jubilejní
- Subject:
- Dějiny civilizace. Kulturní dějiny, Spunar, Pavel,, historici čeští, jubilea životní, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, German, French, English, Russian, and Polish
- Description:
- Published by KLP-Koniasch Latin Press in cooperation with the Ústav pro klasická studia AV ČR (Institute for Classical Studies, Czech Academy of Sciences) and In addition to the regular print run, 50 hand-bound copies were issued
- Rights:
- unknown
74. Slavia :
- Type:
- text
- Subject:
- Seriálové publikace. Periodika, slavistika, filologie slovanská, and česká periodika
- Language:
- Czech, Polish, Serbian, Russian, Bulgarian, Croatian, and Slovak
- Rights:
- unknown
75. Slavia :
- Type:
- text and časopisy
- Subject:
- Seriálové publikace. Periodika, slavistika, filologie slovanská, and česká periodika
- Language:
- Czech, Polish, Sorbian, Russian, Bulgarian, Croatian, Serbian, and Slovak
- Rights:
- unknown
76. Slavia :
- Type:
- text and časopisy
- Subject:
- Seriálové publikace. Periodika, slavistika, filologie slovanská, and česká periodika
- Language:
- Czech, Slovak, Polish, Russian, and Italian
- Rights:
- unknown
77. Slavia :
- Type:
- text and časopisy
- Subject:
- Seriálové publikace. Periodika, slavistika, filologie slovanská, and česká periodika
- Language:
- Czech, Polish, Sorbian, Russian, Bulgarian, Croatian, Serbian, and Slovak
- Description:
- Publisher: Euroslavica
- Rights:
- unknown
78. Slavia. Obsah ročníku
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
79. Slavia. Obsah ročníku
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
80. Slavia. Obsah ročníku
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
81. Slavia. Obsah ročníku
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
82. Slavia. Obsah ročníku
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
83. Slavia. Obsah ročníku
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
84. Slavia. Obsah ročníku
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
85. Slavia. Obsah ročníku 89
- Type:
- model:supplement and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- Contents of volume 89
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
86. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 1
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
87. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 1
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
88. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 2
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
89. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 3
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
90. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 1
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
91. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 3
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
92. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 2
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
93. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 1
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
94. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 2
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
95. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 2
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
96. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 4
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
97. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 4
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
98. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 4
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
99. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 1
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
100. Slavia: časopis pro slovanskou filologii
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, German, Polish, Russian, and Slovak
- Description:
- 4
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public