Language: Czech / Subject: Czech - LINDAT/CLARIAH-CZ Catalog Search Results

1. Češi a slovenština

Creator:: Tom Dickins
Format:: print, unmediated, and volume
Type:: article, articles, journal articles, model:article, and TEXT
Subject:: Social sciences, public opinion research, public opinion polls, Czech, Slovak, perceptions, attitudes, bilingualism, semi-communication, dialects, 18, and 3
Language:: Czech
Description:: This article uses empirical data to contextualize and summarize Czechs' attitudes toward Slovak and their perceptions of their own knowledge of Slovak. It further aims to shed light on the changes that have occurred since 1989 and to contribute, more generally, to existing knowledge of Czech-Slovak linguistic relations. It also seeks to highlight the difficulty of defining the status of two geographically adjacent contact languages whose identity speakers define in equal measure through shared political and historical experience (particularly in the twentieth century) and through their ethnic, cultural and linguistic differences. The evidence is drawn primarily from two nationwide surveys conducted for the author at the Centrum pro výzkum veřejného mínění (Public Opinion Research Centre) of the Sociologický ústav AV ČR, v.v.i.: „Postoje českých mluvčích k lexikálním výpůjčkám“ (Czech speakers' attitudes toward lexical borrowings; hereafter „Postoje“) and „Češi a slovenština“ (Czechs and Slovak). The content and methodology of these surveys are based on a varied range of diachronic and synchronic data, in particular a 1971 study at the Institut pro výzkum veřejného mínění (Institute for Public Opinion Research, the predecessor of CVVM) and three large-scale European Union surveys., This study employs a range of up-to-date statistical information, including the findings of two nationwide surveys conducted on the author’s behalf, to evaluate current perceptions of Slovak in the Czech Republic. Where appropriate, the results are compared with the evidence of other questionnaires (including Tejnor: 1971)., and Tom Dickins.
Rights:: http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public

2. CoCzeFLA Chroma 2023.04

Creator:: Chromá, Anna, Matiasovitsová, Klára, Sláma, Jakub, and Treichelová, Jolana
Publisher:: Charles University, Faculty of Arts
Type:: text and corpus
Subject:: first language acquisition, typical development, longitudinal corpus, and Czech
Language:: Czech
Description:: A new version of the previously published corpus Chroma. Version 2023.04 includes six children. Two transcripts (Julie20221, Klara30424) were removed because they did not meet the criteria for dialogical format. The transcripts were revised (eliminating typing errors and inconsistencies in the transcription format) and morphologically annotated with the automatic tool MorphoDiTa. The annotation of children's utterances was checked manually in detail; the annotation of adult data has not been checked yet. Files are in plain text with UTF-8 encoding. Each file represents one recording session of one of the target children and is named with the alias of the child and their age at the given session in the form YMMDD. Transcription rules and other details can be found on the homepage coczefla.ff.cuni.cz.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

3. CoCzeFLA Chroma 2023.07

Creator:: Chromá, Anna, Sláma, Jakub, Matiasovitsová, Klára, and Kohoutková, Jolana
Publisher:: Charles University, Faculty of Arts
Type:: text and corpus
Subject:: first language acquisition, typical development, longitudinal corpus, and Czech
Language:: Czech
Description:: A new version of the previously published corpus Chroma with morphological annotation. Version 2023.07 differs from 2023.04 in that it includes all seven children and has undergone an additional careful check of consistency and conformity to the CHAT transcription principles. Two transcripts (Julie20221, Klara30424) from the previous versions (2022.07, 2019.07) were removed because they did not meet our criteria for dialogical format. All transcripts of recordings made during one day were combined into one file. Thus, version 2023.07 consists of 183 files/transcripts. The number of utterances and tokens given here in LINDAT corresponds to children's lines only. Files are in plain text with UTF-8 encoding. Each file represents one recording session of one of the target children and is named with the alias of the child and their age at the given session in the form YMMDD. Transcription rules and other details can be found on the homepage coczefla.ff.cuni.cz.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
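
The file-naming convention described above (child alias followed by the age as YMMDD) can be illustrated with a minimal Python sketch. It assumes that the alias consists of letters only and that the age is exactly five digits (one for years, two for months, two for days), as in the transcript names Julie20221 and Klara30424 quoted in the description. The names `SessionName` and `parse_session_name` are hypothetical and are not part of the corpus distribution; any file extension is assumed to have been stripped beforehand.

```python
import re
from typing import NamedTuple


class SessionName(NamedTuple):
    """Alias and age decoded from a transcript name (hypothetical helper type)."""
    alias: str
    years: int
    months: int
    days: int


# Assumption: letters-only alias followed by exactly five digits (Y MM DD).
_NAME_RE = re.compile(r"^(?P<alias>[^\W\d_]+)(?P<y>\d)(?P<mm>\d{2})(?P<dd>\d{2})$")


def parse_session_name(stem: str) -> SessionName:
    """Split a name such as 'Julie20221' into the alias and the age parts."""
    match = _NAME_RE.match(stem)
    if match is None:
        raise ValueError(f"name does not follow the alias+YMMDD pattern: {stem!r}")
    return SessionName(
        alias=match["alias"],
        years=int(match["y"]),
        months=int(match["mm"]),
        days=int(match["dd"]),
    )


if __name__ == "__main__":
    for stem in ("Julie20221", "Klara30424"):
        print(parse_session_name(stem))
```

Under these assumptions, Julie20221 is read as the child Julie at 2 years, 2 months and 21 days. Names that deviate from the pattern raise an error rather than being guessed at.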

4. CoNLL-based Extended Czech Named Entity Corpus 1.0

Creator:: Konkol, Michal, Konopík, Miloslav, Ševčíková, Magda, Žabokrtský, Zdeněk, and Straková, Jana
Publisher:: University of West Bohemia
Type:: text and corpus
Subject:: named entity recognition, Czech, and conll
Language:: Czech
Description:: This is a Czech Named Entity Corpus 1.0 transformed into the CoNLL format. The original corpus can be downloaded from: http://hdl.handle.net/11858/00-097C-0000-0023-1B04-C. The CoNLL transformation is described in this publication: https://link.springer.com/chapter/10.1007/978-3-642-40585-3_20.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB

5. CoNLL-based Extended Czech Named Entity Corpus 2.0

Creator:: Konkol, Michal, Konopík, Miloslav, Ševčíková, Magda, Žabokrtský, Zdeněk, Straková, Jana, and Straka, Milan
Publisher:: University of West Bohemia
Type:: text and corpus
Subject:: named entity recognition and Czech
Language:: Czech
Description:: This is a Czech Named Entity Corpus 2.0 transformed into the CoNLL format. The original corpus can be downloaded from: http://hdl.handle.net/11858/00-097C-0000-0023-1B22-8. The CoNLL transformation is described in this publication: https://link.springer.com/chapter/10.1007/978-3-642-40585-3_20.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB
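
As a rough illustration of what working with a CoNLL-style file involves, the following Python sketch reads text in the generic CoNLL layout: one token per line with whitespace-separated columns, and a blank line between sentences. The exact columns and tag set of these corpora are defined in the publication linked above and are not verified here; the function name `read_conll` and the two-column sample with its tags are invented for this example.

```python
from typing import List

Sentence = List[List[str]]  # one list of column values per token


def read_conll(text: str) -> List[Sentence]:
    """Group token lines into sentences; blank lines act as sentence breaks."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        current.append(line.split())
    if current:
        # The last sentence may not be followed by a blank line.
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    # Hypothetical sample; the tags do not come from the corpus itself.
    sample = "Jana B-P\nbydlí O\nv O\nPraze B-G\n\nAhoj O\n"
    for sentence in read_conll(sample):
        print(sentence)
```

Keeping every column as a plain string avoids committing to a particular column layout, so the same reader should work whatever the actual number of columns is.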

6. CWC2011

Creator:: Spoustová, Johanka and Spousta, Miroslav
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: corpus, Czech, and web
Language:: Czech
Description:: Web corpus of Czech, created in 2011. Contains newspapers+magazines, discussions, blogs. See http://www.lrec-conf.org/proceedings/lrec2012/summaries/120.html for details. and GA405/09/0278
Rights:: Creative Commons - Attribution 3.0 Unported (CC BY 3.0), http://creativecommons.org/licenses/by/3.0/, and PUB

7. Czech HS Contracts Dataset (CHSC) 1.0

Creator:: Szabó, Adam and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: Czech, document classification, contracts, and Hlídač státu
Language:: Czech
Description:: Czech Contracts dataset was created as a part of the thesis Low-resource Text Classification (2021), A. Szabó, MFF UK. Contracts are obtained from the Hlídač Státu web portal. Labels in the development and training set are automatically classified on the basis of the keyword method according to the thesis Automatická klasifikace smluv pro portál HlidacSmluv.cz, J. Maroušek (2020), MFF UK. For this reason, the goal in the classification is not to achieve 100% on the development set, as the classification contains a certain amount of noise. The test set is manually annotated. The dataset contains a total of 97493 contracts.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), PUB, and http://creativecommons.org/licenses/by-nc-sa/4.0/

8. Czech Legal Text Treebank

Creator:: Kríž, Vincent, Hladká, Barbora, and Urešová, Zdeňka
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: treebank, corpus, Czech, legal texts, and legal domain
Language:: Czech
Description:: The Czech Legal Text Treebank (CLTT) is a collection of 1133 manually annotated dependency trees. CLTT consists of two legal documents: The Accounting Act (563/1991 Coll., as amended) and Decree on Double-entry Accounting for undertakers (500/2002 Coll., as amended).
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

9. Czech Lexico-Semantic Database 0.1

Creator:: Tichy, Ondrej, Obstova, Zora, and Klegr, Ales
Publisher:: Charles University, Faculty of Arts
Type:: text, thesaurus, and lexicalConceptualResource
Subject:: onomasiological lexicography, thesaurus, lexico-semantic database, digitization, and Czech
Language:: Czech
Description:: A lexicographical project, whose aim is to digitize and align two Czech onomasiological dictionaries (Haller 1969–77; Klégr 2007) in order to create an integrated digital multi-purpose lexico-semantic database of Czech.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

10. Czech Models (CNEC) for NameTag

Creator:: Straka, Milan and Straková, Jana
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, languageDescription, and mlmodel
Subject:: NameTag, Czech, and named entity recognition
Language:: Czech
Description:: Czech models for NameTag, providing recognition of named entities. The models are trained on Czech Named Entity Corpus 2.0 and 1.1. and This work has been using language resources developed and/or stored and/or distributed by the LINDAT/CLARIN project of the Ministry of Education of the Czech Republic (project LM2010013). Czech models are trained on Czech Named Entity Corpus, which was created by Magda Ševčíková, Zdeněk Žabokrtský, Jana Straková and Milan Straka. The recognizer research was supported by the projects MSM0021620838 and LC536 of Ministry of Education, Youth and Sports of the Czech Republic, 1ET101120503 of Academy of Sciences of the Czech Republic, LINDAT/CLARIN project of the Ministry of Education of the Czech Republic (project LM2010013), and partially by SVV project number 267 314. The research was performed by Jana Straková, Zdeněk Žabokrtský and Milan Straka. Czech models use MorphoDiTa as a tagger and lemmatizer, therefore MorphoDiTa Acknowledgements (http://ufal.mff.cuni.cz/morphodita#morphodita_acknowledgements) and Czech MorphoDiTa Model Acknowledgements (http://ufal.mff.cuni.cz/morphodita/users-manual#czech-morfflex-pdt_acknowledgements) apply.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB
