Subject: Czech - LINDAT/CLARIAH-CZ Catalog Search Results


21. Khresmoi Query Translation Test Data 2.0

Creator:: Pecina, Pavel, Dušek, Ondřej, Hajič, Jan, Libovický, Jindřich, and Urešová, Zdeňka
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: corpus, test data, medical, health, machine translation, Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
Language:: Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
Description:: This package contains data sets for development and testing of machine translation of medical queries between Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish. The queries come from the general public and medical experts. Version 2.0 extends the previous version by adding Hungarian, Polish, Spanish, and Swedish translations.
Rights:: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB

22. Khresmoi Summary Translation Test Data 1.1

Creator:: Dušek, Ondřej, Hajič, Jan, Hlaváčová, Jaroslava, Pecina, Pavel, Tamchyna, Aleš, and Urešová, Zdeňka
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: corpus, test data, medical, health, machine translation, Czech, French, German, and English
Language:: English, Czech, French, and German
Description:: This package contains data sets for development and testing of machine translation of sentences from summaries of medical articles between Czech, English, French, and German. This work was supported by the EU FP7 project Khresmoi (European Commission contract No. 257528). The language resources are distributed by the LINDAT/Clarin project of the Ministry of Education, Youth and Sports of the Czech Republic (project no. LM2010013). We thank all the data providers and copyright holders for providing the source data and anonymous experts for translating the sentences.
Rights:: Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0), http://creativecommons.org/licenses/by-nc/3.0/, and PUB

23. Khresmoi Summary Translation Test Data 2.0

Creator:: Dušek, Ondřej, Hajič, Jan, Hlaváčová, Jaroslava, Libovický, Jindřich, Pecina, Pavel, Tamchyna, Aleš, and Urešová, Zdeňka
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: corpus, test data, medical, health, machine translation, Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
Language:: Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
Description:: This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech, English, French, German, Hungarian, Polish, Spanish and Swedish. Version 2.0 extends the previous version by adding Hungarian, Polish, Spanish, and Swedish translations.
Rights:: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB

24. Large Corpus of Czech Parliament Plenary Hearings

Creator:: Kratochvíl, Jonáš, Polák, Peter, and Bojar, Ondřej
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: audio and corpus
Subject:: ASR and Czech
Language:: Czech
Description:: We present a large corpus of Czech parliament plenary sessions. The corpus consists of approximately 444 hours of speech data and corresponding text transcriptions. The whole corpus has been segmented into short audio snippets, making it suitable for both training and evaluation of automatic speech recognition (ASR) systems. The source language of the corpus is Czech, which makes it a valuable resource for future research, as only a few public datasets are available for the Czech language.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

25. MorfFlex CZ 160310

Creator:: Hajič, Jan and Hlaváčová, Jaroslava
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicalConceptualResource, and computationalLexicon
Subject:: morphological dictionary, morphology, and Czech
Language:: Czech
Description:: Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. Currently it contains full morphological information for each covered wordform, as well as some derivational, semantic and named entity information.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB

26. MorfFlex CZ 161115

Creator:: Hajič, Jan and Hlaváčová, Jaroslava
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicalConceptualResource, and computationalLexicon
Subject:: morphological dictionary, morphology, and Czech
Language:: Czech
Description:: Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. Currently it contains full morphological information for each covered wordform, as well as some derivational, semantic and named entity information.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

27. MorfFlex CZ 2.0

Creator:: Hajič, Jan, Hlaváčová, Jaroslava, Mikulová, Marie, Straka, Milan, and Štěpánková, Barbora
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicalConceptualResource, and computationalLexicon
Subject:: morphological dictionary, morphology, and Czech
Language:: Czech
Description:: MorfFlex CZ 2.0 is the Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. MorfFlex is a flat list of lemma-tag-wordform triples. For each wordform, full inflectional information is coded in a positional tag. Wordforms are organized into entries (paradigm instances, or paradigms for short) according to their formal morphological behavior. Each paradigm (a set of wordforms) is identified by a unique lemma. Apart from traditional morphological categories, the description also contains some semantic, stylistic and derivational information. For more details, see the comprehensive specification of the Czech morphological annotation: http://ufal.mff.cuni.cz/techrep/tr64.pdf
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
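
The MorfFlex CZ 2.0 description above mentions a flat list of lemma-tag-wordform triples in which each paradigm is identified by a unique lemma. A minimal Python sketch of grouping such triples into paradigms follows. It assumes tab-separated fields in lemma, tag, wordform order, which the record does not confirm; the function name `group_paradigms` and the tag strings (`TAG_A` etc.) are hypothetical placeholders, not real MorfFlex positional tags.

```python
from collections import defaultdict


def group_paradigms(lines):
    """Group lemma-tag-wordform triples into paradigms keyed by lemma.

    Assumes one triple per line, tab-separated, in lemma/tag/wordform
    order (an assumption; check the distributed files for the real layout).
    """
    paradigms = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        lemma, tag, wordform = line.split("\t")
        paradigms[lemma].append((tag, wordform))
    return dict(paradigms)


# Illustrative data only: the tags are placeholders, not real positional tags.
sample = [
    "žena\tTAG_A\tžena",
    "žena\tTAG_B\tženy",
    "hrad\tTAG_C\thrad",
]
paradigms = group_paradigms(sample)
for lemma, forms in paradigms.items():
    print(lemma, forms)
```

Keying the dictionary by lemma mirrors the description's statement that a paradigm is identified by its lemma; a real loader would likely stream the file rather than hold all entries in memory.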

28. Multiple wh-fronting in Czech and wh-in-situ in Korean

Creator:: Inchon, Kim
Format:: unmediated and volume
Type:: model:article and TEXT
Subject:: Czech and Korean
Language:: English
Description:: This paper examines the significant phenomena of multiple wh-fronting (MWF) in Czech and wh-in-situ in Korean. It explores the general question of what kinds of syntactic mechanisms can give rise to the analysis of MWF in Czech, as well as what licenses the formation of wh-questions in Korean. In particular, I focus on the question of where Czech wh-items can be projected within syntactic trees with respect to clitic placement in second position. For the question of what licenses the formation of wh-questions in Korean, I propose that the sentence-final markers (henceforth SFMs) determine the scope of wh-operators, without any apparent wh-movement. Additionally, I argue that the non-overt movement of wh-items in Korean can also be supported by ambiguous readings.
Rights:: http://creativecommons.org/publicdomain/mark/1.0/ and policy:public

29. Neuhochdeutsche Lehnwörter im Tschechischen: eine Frequenzuntersuchung zwecks der Tendenzermittlung

Creator:: Tikhonov, Aleksej
Format:: unmediated and volume
Type:: model:article and TEXT
Subject:: germanisms, synchrony, diachrony, corpus, Czech, German, frequency, and loanwords
Language:: Czech
Description:: This article deals with germanisms in Czech. The frequencies of 26 different New High German loanwords and of their competing Czech synonyms were analyzed in the Czech National Corpus. This comparison is used to study whether native speakers of Czech prefer germanisms or their Czech equivalents. New High German loanwords were deliberately selected for this analysis in order to verify the topicality of the subject. Most of the study, however, takes a diachronic view: it shows not only the current situation but, in most cases, the frequency of the selected loanwords throughout their existence. The average frequency is calculated for each century (since 1650) and for the recent modern period (from 1947 to 2008).
Rights:: http://creativecommons.org/publicdomain/mark/1.0/ and policy:public

30. NomVallex 2.0

Creator:: Kolářová, Veronika, Vernerová, Anna, and Klímová, Jana
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, machineReadableDictionary, and lexicalConceptualResource
Subject:: valency, Czech, lexicon, syntax, semantics, deverbal nouns, adjectival valency, deadjectival nouns, deverbal adjectives, primary adjectives, denominal adjectives, deadjectival adjectives, and noun valency
Language:: Czech
Description:: NomVallex 2.0 is a manually annotated valency lexicon of Czech nouns and adjectives, created in the theoretical framework of the Functional Generative Description and based on corpus data (the SYN series of corpora from the Czech National Corpus and the Araneum Bohemicum Maximum corpus). In total, NomVallex comprises 1027 lexical units contained in 570 lexemes, covering the following parts of speech and derivational categories: deverbal or deadjectival nouns, and deverbal, denominal, deadjectival or primary adjectives. Valency properties of a lexical unit are captured in a valency frame (modeled as a sequence of valency slots, each supplemented with a list of morphemic forms) and documented by corpus examples. In order to make it possible to study the relationship between the valency behavior of base words and their derivatives, lexical units of nouns and adjectives in NomVallex are linked to their respective base lexical units (contained either in NomVallex itself or, in the case of verbs, in the VALLEX lexicon), linking up to three parts of speech (i.e., noun – verb, adjective – verb, noun – adjective, and noun – adjective – verb). In order to facilitate comparison, this submission also contains abbreviated entries of the base verbs of these nouns and adjectives from the VALLEX lexicon and simplified entries of the covered nouns and adjectives from the PDT-Vallex lexicon.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

