Creator: Straka, Milan and Žabokrtský, Zdeněk / Harvested from: LINDAT/CLARIAH-CZ repository / Rights: http://creativecommons.org/licenses/by-nc-sa/3.0/


1. CoNLL-based Extended Czech Named Entity Corpus 2.0

Creator:: Konkol, Michal, Konopík, Miloslav, Ševčíková, Magda, Žabokrtský, Zdeněk, Straková, Jana, and Straka, Milan
Publisher:: University of West Bohemia
Type:: text and corpus
Subject:: named entity recognition and Czech
Language:: Czech
Description:: This is the Czech Named Entity Corpus 2.0 converted into the CoNLL format. The original corpus can be downloaded from http://hdl.handle.net/11858/00-097C-0000-0023-1B22-8. The CoNLL conversion is described in this publication: https://link.springer.com/chapter/10.1007/978-3-642-40585-3_20.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB
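
For readers unfamiliar with CoNLL-style files, the following is a minimal sketch of how such data is commonly read. It assumes one token per line, whitespace-separated columns with the token first and the entity label last, and blank lines between sentences. The exact column layout of this corpus is not stated in the record, and the sample tokens and labels below are hypothetical, so check the corpus documentation before relying on it.

```python
from typing import Iterable, List, Tuple

# A sentence is a list of (token, label) pairs.
Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group token lines into sentences separated by blank lines.

    Assumption: the first column is the token and the last column is
    the entity label. This layout is illustrative, not confirmed for
    the corpus described above.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split()
        current.append((fields[0], fields[-1]))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample; the labels are made up for illustration.
sample = """Václav B-P
Havel I-P
navštívil O
Brno B-G

Praha B-G
""".splitlines()

sentences = read_conll(sample)
```

Here `sentences` holds two sentences: the first with four (token, label) pairs, the second with one.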

2. Czech Named Entity Corpus 1.1

Creator:: Ševčíková, Magda, Žabokrtský, Zdeněk, Straková, Jana, and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: named entity recognition and corpus
Language:: Czech
Description:: Czech Named Entity Corpus 1.1 fixes several issues in Czech Named Entity Corpus 1.0: misannotated entities are corrected, all formats contain the same data, the tmt format is replaced with the treex format, and all formats include a split into training, development, and testing portions. Associated projects: SVV 267 314 (Theoretical Foundations of Computer Science and Computational Linguistics), LM2010013 (LINDAT-CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data), GPP406/12/P175 (Selected Derivational Relations for Automatic Processing of Czech), and PRVOUK.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB

3. Czech Named Entity Corpus 2.0

Creator:: Ševčíková, Magda, Žabokrtský, Zdeněk, Straková, Jana, and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: named entity recognition
Language:: Czech
Description:: Czech Named Entity Corpus 2.0 is a corpus of 8,993 Czech sentences with 35,220 manually annotated Czech named entities, classified according to a two-level hierarchy of 46 named entity types. Associated projects: SVV 267 314 (Theoretical Foundations of Computer Science and Computational Linguistics), LM2010013 (LINDAT-CLARIN: Institute for Analysis, Processing and Distribution of Linguistic Data), GPP406/12/P175 (Selected Derivational Relations for Automatic Processing of Czech), and PRVOUK.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB

4. DeriNet 1.0

Creator:: Vidra, Jonáš, Žabokrtský, Zdeněk, Ševčíková, Magda, and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, wordnet, and lexicalConceptualResource
Subject:: derivation, DeriNet, lexical network, and MorfFlex
Language:: Czech
Description:: DeriNet is a lexical network that models derivational relations in Czech as a directed graph. Nodes correspond to Czech lexemes and edges represent derivations between them. A lexeme is a single lemma, possibly with only a subset of its senses; homonyms may have different derivations and are thus represented by several lexemes. DeriNet 1.0 contains 968,967 lexemes with 965,535 unique lemmas, connected by 715,729 derivational links. Lexemes in DeriNet 1.0 are sampled from the MorfFlex dictionary.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB

5. DeriNet 1.2

Creator:: Vidra, Jonáš, Žabokrtský, Zdeněk, Ševčíková, Magda, and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, wordnet, and lexicalConceptualResource
Subject:: derivation, DeriNet, lexical network, and MorfFlex
Language:: Czech
Description:: DeriNet is a lexical network which models derivational relations in the lexicon of Czech. Nodes of the network correspond to Czech lexemes (i.e. single lemmas, possibly with only a subset of their senses); edges represent derivational relations between a derived word and its base word. The present version, DeriNet 1.2, contains 1,003,590 lexemes (sampled from the MorfFlex dictionary) with 1,001,394 unique lemmas, connected by 740,750 derivational links. Both technical and linguistic changes were made relative to the previous version of the data: for example, a new version of the MorfFlex dictionary was used, and derived words that contain a consonant and/or vowel alternation (e.g. boží) were connected with their base words (e.g. bůh).
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB
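
The graph structure described in the DeriNet records can be pictured with a small sketch. This is not the DeriNet file format or any official API. The class and method names are hypothetical, lexemes are identified by plain lemma strings (which ignores homonyms), and each derived lexeme is assumed to have at most one base. Only the pair boží / bůh comes from the description above; the other words are illustrative Czech derivations, not entries checked against the data.

```python
from collections import defaultdict
from typing import Dict, List


class DerivationNetwork:
    """Toy directed graph: one edge from each derived lexeme to its base."""

    def __init__(self) -> None:
        # derived lexeme -> its base lexeme (assumed: at most one base)
        self.base_of: Dict[str, str] = {}
        # base lexeme -> lexemes derived directly from it
        self.derived_from: Dict[str, List[str]] = defaultdict(list)

    def add_derivation(self, derived: str, base: str) -> None:
        if derived in self.base_of:
            raise ValueError(f"{derived!r} already has a base word")
        self.base_of[derived] = base
        self.derived_from[base].append(derived)

    def root(self, lexeme: str) -> str:
        """Follow base links until reaching a lexeme with no base."""
        seen = {lexeme}
        while lexeme in self.base_of:
            lexeme = self.base_of[lexeme]
            if lexeme in seen:
                raise ValueError("cycle in derivation links")
            seen.add(lexeme)
        return lexeme


net = DerivationNetwork()
net.add_derivation("boží", "bůh")        # example from the description
net.add_derivation("učitel", "učit")     # illustrative, not from the data
net.add_derivation("učitelka", "učitel")  # illustrative, not from the data
```

With these links, `net.root("učitelka")` walks učitelka, učitel, učit and returns the underived word at the top of the chain.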

6. DeriNet 1.5

Creator:: Vidra, Jonáš, Žabokrtský, Zdeněk, Ševčíková, Magda, Kalužová, Adéla, Mediankin, Nikita, and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, wordnet, and lexicalConceptualResource
Subject:: DeriNet, derivation, derivational morphology, lexical network, and MorfFlex
Language:: Czech
Description:: DeriNet is a lexical network which models derivational relations in the lexicon of Czech. Nodes of the network correspond to Czech lexemes, while edges represent derivational relations between a derived word and its base word. The present version, DeriNet 1.5, contains 1,011,965 lexemes (sampled from the MorfFlex dictionary) connected by 785,543 derivational links. Besides several rather conservative updates (such as newly identified prefix and suffix verb-to-verb derivations, as well as noun-to-adjective derivations marked by the most frequent adjectival suffixes), DeriNet 1.5 is the first version that contains annotations related to compounding: compound words are distinguished by a special mark in their part-of-speech labels.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB
