Number of results to display per page
Search Results
22. Křižovatky Slovanů /
- Type:
- text and monografie kolektivní
- Subject:
- Filologie, slavistika, Slované, české (československé) sborníky a kolektivní monografie, české země od r. 1993 do současnosti, světové dějiny od r. 1945 do současnosti, and dějiny slavistiky
- Language:
- Czech, English, Croatian, Polish, Russian, Slovak, and Ukrainian
- Description:
- Konference Praha, 6-7.11.2014
- Rights:
- unknown
23. Matija Mesić i češki i slovački suvremenici :
- Creator:
- Šabić, Marijan,
- Type:
- text and monografie
- Subject:
- Historická věda. Pomocné vědy historické. Archivnictví, Biografie, Mesić, Matija,, korespondence, historici chorvatští, vztahy chorvatsko-české, vztahy slovensko-chorvatské, Chorvatsko, světové dějiny 1789-1918, dějepisectví, historické vědy, historici, české země 1792-1918, and Slovensko 1780-1918
- Language:
- Croatian, Czech, and Slovak
- Description:
- 400 výtisků
- Rights:
- unknown
24. Plaintext Wikipedia dump 2018
- Creator:
- Rosa, Rudolf
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- Wikipedia, text corpora, and monolingual corpus
- Language:
- Abkhazian, Achinese, Adyghe, Afrikaans, Akan, Tosk Albanian, Amharic, Old English (ca. 450-1100), Arabic, Official Aramaic (700-300 BCE), Aragonese, Egyptian Arabic, Assamese, Asturian, Atikamekw, Avaric, Aymara, South Azerbaijani, Azerbaijani, Bashkir, Bambara, Bavarian, Central Bikol, Belarusian, Bengali, Bislama, Banjar, Tibetan, Bosnian, Bishnupriya, Breton, Buginese, Bulgarian, Russia Buriat, Catalan, Min Dong Chinese, Cebuano, Czech, Chamorro, Chechen, Cherokee, Church Slavic, Chuvash, Cheyenne, Central Kurdish, Cornish, Corsican, Cree, Crimean Tatar, Kashubian, Welsh, Danish, German, Dinka, Dimli (individual language), Dhivehi, Lower Sorbian, Dzongkha, Modern Greek (1453-), English, Esperanto, Estonian, Basque, Ewe, Extremaduran, Faroese, Persian, Fijian, Finnish, French, Arpitan, Northern Frisian, Western Frisian, Fulah, Friulian, Gagauz, Gan Chinese, Scottish Gaelic, Irish, Galician, Gilaki, Manx, Goan Konkani, Gothic, Guarani, Gujarati, Hakka Chinese, Haitian, Hausa, Hawaiian, Serbo-Croatian, Hebrew, Herero, Fiji Hindi, Hindi, Hiri Motu, Croatian, Upper Sorbian, Hungarian, Armenian, Igbo, Ido, Inuktitut, Interlingue, Iloko, Interlingua (International Auxiliary Language Association), Indonesian, Inupiaq, Icelandic, Italian, Jamaican Creole English, Javanese, Lojban, Japanese, Kara-Kalpak, Kabyle, Kalaallisut, Kannada, Kashmiri, Georgian, Kanuri, Kazakh, Kabardian, Kabiyè, Khmer, Kikuyu, Kinyarwanda, Kirghiz, Komi-Permyak, Komi, Kongo, Korean, Karachay-Balkar, Kölsch, Kurdish, Ladino, Lao, Latin, Latvian, Lak, Lezghian, Ligurian, Limburgan, Lingala, Lithuanian, Lombard, Northern Luri, Latgalian, Luxembourgish, Ganda, Literary Chinese, Marshallese, Maithili, Malayalam, Marathi, Moksha, Eastern Mari, Minangkabau, Macedonian, Malagasy, Maltese, Mongolian, Maori, Western Mari, Malay (macrolanguage), Creek, Mirandese, Burmese, Erzya, Mazanderani, Min Nan Chinese, Neapolitan, Nauru, Navajo, Ndonga, Low German, Nepali (macrolanguage), Newari, Dutch, Norwegian Nynorsk, Norwegian, Novial, Pedi, Nyanja, Occitan (post 1500), Livvi, Oriya (macrolanguage), Oromo, Ossetian, Pangasinan, Pampanga, Panjabi, Papiamento, Picard, Pennsylvania German, Pfaelzisch, Pitcairn-Norfolk, Pali, Piemontese, Western Panjabi, Pontic, Polish, Portuguese, Pushto, Quechua, Vlax Romani, Romansh, Romanian, Rusyn, Rundi, Macedo-Romanian, Russian, Sango, Yakut, Sanskrit, Sicilian, Scots, Samogitian, Sinhala, Slovak, Slovenian, Northern Sami, Samoan, Shona, Sindhi, Somali, Southern Sotho, Spanish, Albanian, Sardinian, Sranan Tongo, Serbian, Swati, Saterfriesisch, Sundanese, Swahili (macrolanguage), Swedish, Silesian, Tahitian, Tamil, Tatar, Tulu, Telugu, Tama (Colombia), Tetum, Tajik, Tagalog, Thai, Tigrinya, Tonga (Tonga Islands), Tok Pisin, Tswana, Tsonga, Turkmen, Tumbuka, Turkish, Twi, Tuvinian, Udmurt, Uighur, Ukrainian, Urdu, Uzbek, Venetian, Venda, Veps, Vietnamese, Vlaams, Volapük, Võro, Waray (Philippines), Walloon, Wolof, Wu Chinese, Kalmyk, Xhosa, Mingrelian, Yiddish, Yoruba, Yue Chinese, Zeeuws, Zhuang, Chinese, Zulu, and Dotyali
- Description:
- Wikipedia plain text data obtained from Wikipedia dumps with WikiExtractor in February 2018. The data come from all Wikipedias for which dumps could be downloaded at [https://dumps.wikimedia.org/]. This amounts to 297 Wikipedias, usually corresponding to individual languages and identified by their ISO codes. Several special Wikipedias are included, most notably "simple" (Simple English Wikipedia) and "incubator" (tiny hatching Wikipedias in various languages). For a list of all the Wikipedias, see [https://meta.wikimedia.org/wiki/List_of_Wikipedias]. The script which can be used to get new version of the data is included, but note that Wikipedia limits the download speed for downloading a lot of the dumps, so it takes a few days to download all of them (but one or a few can be downloaded fast). Also, the format of the dumps changes time to time, so the script will probably eventually stop working one day. The WikiExtractor tool [http://medialab.di.unipi.it/wiki/Wikipedia_Extractor] used to extract text from the Wikipedia dumps is not mine, I only modified it slightly to produce plaintext outputs [https://github.com/ptakopysk/wikiextractor].
- Rights:
- Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0), http://creativecommons.org/licenses/by-sa/3.0/, and PUB
25. Prolínání slovanských prostředí /
- Type:
- text and sborníky konferenční
- Subject:
- Filologie, slavistika, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, Belarusian, Croatian, Macedonian, Polish, Russian, Slovak, and Ukrainian
- Description:
- Sborník studií vycházejících z referátů 7. roč. Konference mladých slavistů konané v r. 2011 na FF UK v Praze and Rok konání konference uveden na hřbetu pub.
- Rights:
- unknown
26. Prolínání slovanských prostředí /
- Type:
- text and sborníky konferenční
- Subject:
- Filologie, slavistika, and české (československé) sborníky a kolektivní monografie
- Language:
- Czech, Belarusian, Croatian, Macedonian, Polish, Russian, Slovak, and Ukrainian
- Description:
- Sborník studií vycházejících z referátů 7. roč. Konference mladých slavistů konané v r. 2011 na FF UK v Praze and Rok konání konference uveden na hřbetu pub.
- Rights:
- unknown
27. Slavia :
- Type:
- text
- Subject:
- Seriálové publikace. Periodika, slavistika, filologie slovanská, and česká periodika
- Language:
- Czech, Polish, Serbian, Russian, Bulgarian, Croatian, and Slovak
- Rights:
- unknown
28. Slavic Forest, Norwegian Wood (models)
- Creator:
- Rosa, Rudolf, Zeman, Daniel, Mareček, David, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- other and toolService
- Subject:
- parsing, dependency parser, cross-lingual parsing, and universal dependencies
- Language:
- Slovak, Croatian, and Norwegian
- Description:
- Trained models for UDPipe used to produce our final submission to the Vardial 2017 CLP shared task (https://bitbucket.org/hy-crossNLP/vardial2017). The SK model was trained on CS data, the HR model on SL data, and the SV model on a concatenation of DA and NO data. The scripts and commands used to create the models are part of separate submission (http://hdl.handle.net/11234/1-1970). The models were trained with UDPipe version 3e65d69 from 3rd Jan 2017, obtained from https://github.com/ufal/udpipe -- their functionality with newer or older versions of UDPipe is not guaranteed. We list here the Bash command sequences that can be used to reproduce our results submitted to VarDial 2017. The input files must be in CoNLLU format. The models only use the form, UPOS, and Universal Features fields (SK only uses the form). You must have UDPipe installed. The feats2FEAT.py script, which prunes the universal features, is bundled with this submission. SK -- tag and parse with the model: udpipe --tag --parse sk-translex.v2.norm.feats07.w2v.trainonpred.udpipe sk-ud-predPoS-test.conllu A slightly better after-deadline model (sk-translex.v2.norm.Case-feats07.w2v.trainonpred.udpipe), which we mention in the accompanying paper, is also included. It is applied in the same way (udpipe --tag --parse sk-translex.v2.norm.Case-feats07.w2v.trainonpred.udpipe sk-ud-predPoS-test.conllu). HR -- prune the Features to keep only Case and parse with the model: python3 feats2FEAT.py Case < hr-ud-predPoS-test.conllu | udpipe --parse hr-translex.v2.norm.Case.w2v.trainonpred.udpipe NO -- put the UPOS annotation aside, tag Features with the model, merge with the left-aside UPOS annotation, and parse with the model (this hassle is because UDPipe cannot be told to keep UPOS and only change Features): cut -f1-4 no-ud-predPoS-test.conllu > tmp udpipe --tag no-translex.v2.norm.tgttagupos.srctagfeats.Case.w2v.udpipe no-ud-predPoS-test.conllu | cut -f5- | paste tmp - | sed 's/^\t$//' | udpipe --parse no-translex.v2.norm.tgttagupos.srctagfeats.Case.w2v.udpipe
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
29. Slavic Forest, Norwegian Wood (scripts)
- Creator:
- Rosa, Rudolf, Zeman, Daniel, Mareček, David, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- suiteOfTools and toolService
- Subject:
- parsing, dependency parser, universal dependencies, and cross-lingual parsing
- Language:
- Czech, Slovak, Slovenian, Croatian, Danish, Swedish, and Norwegian
- Description:
- Tools and scripts used to create the cross-lingual parsing models submitted to VarDial 2017 shared task (https://bitbucket.org/hy-crossNLP/vardial2017), as described in the linked paper. The trained UDPipe models themselves are published in a separate submission (https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-1971). For each source (SS, e.g. sl) and target (TT, e.g. hr) language, you need to add the following into this directory: - treebanks (Universal Dependencies v1.4): SS-ud-train.conllu TT-ud-predPoS-dev.conllu - parallel data (OpenSubtitles from Opus): OpenSubtitles2016.SS-TT.SS OpenSubtitles2016.SS-TT.TT !!! If they are originally called ...TT-SS... instead of ...SS-TT..., you need to symlink them (or move, or copy) !!! - target tagging model TT.tagger.udpipe All of these can be obtained from https://bitbucket.org/hy-crossNLP/vardial2017 You also need to have: - Bash - Perl 5 - Python 3 - word2vec (https://code.google.com/archive/p/word2vec/); we used rev 41 from 15th Sep 2014 - udpipe (https://github.com/ufal/udpipe); we used commit 3e65d69 from 3rd Jan 2017 - Treex (https://github.com/ufal/treex); we used commit d27ee8a from 21st Dec 2016 The most basic setup is the sl-hr one (train_sl-hr.sh): - normalization of deprels - 1:1 word-alignment of parallel data with Monolingual Greedy Aligner - simple word-by-word translation of source treebank - pre-training of target word embeddings - simplification of morpho feats (use only Case) - and finally, training and evaluating the parser Both da+sv-no (train_ds-no.sh) and cs-sk (train_cs-sk.sh) add some cross-tagging, which seems to be useful only in specific cases (see paper for details). Moreover, cs-sk also adds more morpho features, selecting those that seem to be very often shared in parallel data. The whole pipeline takes tens of hours to run, and uses several GB of RAM, so make sure to use a powerful computer.
- Rights:
- GNU General Public License 2 or later (GPL-2.0), http://opensource.org/licenses/GPL-2.0, and PUB
30. Střední, jižní a východní Evropa a rok 1918 :
- Type:
- text, monografie kolektivní, and sborníky konferenční
- Subject:
- Dějiny Evropy, válka první světová (1914-1918), dějiny kultury, slavistika historická, české (československé) sborníky a kolektivní monografie, světové dějiny 1918-1945, and dějiny vědy, umění, kultury a techniky, kulturní vztahy
- Language:
- Czech, Bulgarian, Croatian, Polish, Russian, Slovak, and Ukrainian
- Description:
- Na hřbetu označení: 2018 and Texty vycházejí z referátů přednesených na 14. ročníku Konference mladých slavistů konané ve dnech 1. a 2. listopadu 2018
- Rights:
- unknown