1 - 5 of 5
Number of results to display per page
Search Results
2. Prague Dependency Treebank - Consolidated 1.0 (PDT-C 1.0)
- Creator:
- Hajič, Jan, Bejček, Eduard, Bémová, Alevtina, Buráňová, Eva, Fučíková, Eva, Hajičová, Eva, Havelka, Jiří, Hlaváčová, Jaroslava, Homola, Petr, Ircing, Pavel, Kárník, Jiří, Kettnerová, Václava, Klyueva, Natalia, Kolářová, Veronika, Kučová, Lucie, Lopatková, Markéta, Mareček, David, Mikulová, Marie, Mírovský, Jiří, Nedoluzhko, Anna, Novák, Michal, Pajas, Petr, Panevová, Jarmila, Peterek, Nino, Poláková, Lucie, Popel, Martin, Popelka, Jan, Romportl, Jan, Rysová, Magdaléna, Semecký, Jiří, Sgall, Petr, Spoustová, Johanka, Straka, Milan, Straňák, Pavel, Synková, Pavlína, Ševčíková, Magda, Šindlerová, Jana, Štěpánek, Jan, Štěpánková, Barbora, Toman, Josef, Urešová, Zdeňka, Vidová Hladká, Barbora, Zeman, Daniel, Zikánová, Šárka, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, dependency, tectogrammatics, topic-focus articulation, multiword expressions, coreference, bridging relations, discourse, morphology, syntax, tokenization, lemmatization, semantic relations, lexical semantics, lexicon, valency, speech reconstruction, clauses, speech recognition, and spoken corpus
- Language:
- Czech
- Description:
- A richly annotated and genre-diversified language resource, The Prague Dependency Treebank – Consolidated 1.0 (PDT-C 1.0, or PDT-C in short in the sequel) is a consolidated release of the existing PDT-corpora of Czech data, uniformly annotated using the standard PDT scheme. PDT-corpora included in PDT-C: Prague Dependency Treebank (the original PDT contents, written newspaper and journal texts from three genres); Czech part of Prague Czech-English Dependency Treebank (translated financial texts, from English), Prague Dependency Treebank of Spoken Czech (spoken data, including audio and transcripts and multiple speech reconstruction annotation); PDT-Faust (user-generated texts). The difference from the separately published original treebanks can be briefly described as follows: it is published in one package, to allow easier data handling for all the datasets; the data is enhanced with a manual linguistic annotation at the morphological layer and new version of morphological dictionary is enclosed; a common valency lexicon for all four original parts is enclosed. Documentation provides two browsing and editing desktop tools (TrEd and MEd) and the corpus is also available online for searching using PML-TQ.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
3. Prague Dependency Treebank 3.5
- Creator:
- Hajič, Jan, Bejček, Eduard, Bémová, Alevtina, Buráňová, Eva, Hajičová, Eva, Havelka, Jiří, Homola, Petr, Kárník, Jiří, Kettnerová, Václava, Klyueva, Natalia, Kolářová, Veronika, Kučová, Lucie, Lopatková, Markéta, Mikulová, Marie, Mírovský, Jiří, Nedoluzhko, Anna, Pajas, Petr, Panevová, Jarmila, Poláková, Lucie, Rysová, Magdaléna, Sgall, Petr, Spoustová, Johanka, Straňák, Pavel, Synková, Pavlína, Ševčíková, Magda, Štěpánek, Jan, Urešová, Zdeňka, Vidová Hladká, Barbora, Zeman, Daniel, Zikánová, Šárka, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, dependency, tectogrammatics, topic-focus articulation, multiword expressions, coreference, bridging relations, discourse, morphology, syntax, tokenization, lemmatization, clauses, semantics, semantic relations, lexical semantics, and lexicon
- Language:
- Czech
- Description:
- The Prague Dependency Treebank 3.5 is the 2018 edition of the core Prague Dependency Treebank (PDT). It contains all PDT annotation made at the Institute of Formal and Applied Linguistics under various projects between 1996 and 2018 on the original texts, i.e., all annotation from PDT 1.0, PDT 2.0, PDT 2.5, PDT 3.0, PDiT 1.0 and PDiT 2.0, plus corrections, new structure of basic documentation and new list of authors covering all previous editions. The Prague Dependency Treebank 3.5 (PDT 3.5) contains the same texts as the previous versions since 2.0; there are 49,431 annotated sentences (832,823 words) on all layers, from tectogrammatical annotation to syntax to morphology. There are additional annotated sentences for syntax and morphology; the totals for the lower layers of annotation are: 87,913 sentences with 1,502,976 words at the analytical layer (surface dependency syntax) and 115,844 sentences with 1,956,693 words at the morphological layer of annotation (these totals include the annotation with the higher layers annotated as well). Closely linked to the tectogrammatical layer is the annotation of sentence information structure, multiword expressions, coreference, bridging relations and discourse relations.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
4. Prague Discourse Treebank 2.0
- Creator:
- Rysová, Magdaléna, Synková, Pavlína, Mírovský, Jiří, Hajičová, Eva, Nedoluzhko, Anna, Ocelák, Radek, Pergler, Jiří, Poláková, Lucie, Scheller, Veronika, Zdeňková, Jana, and Zikánová, Šárka
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- discourse, bridging relations, coreference, topic-focus articulation, treebank, dependency, tectogrammatics, and PDT
- Language:
- Czech
- Description:
- PDiT 2.0 is a new version of the Prague Discourse Treebank. It contains a complex annotation of discourse phenomena enriched by the annotation of secondary connectives.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
5. Prague Discourse Treebank 3.0
- Creator:
- Synková, Pavlína, Rysová, Magdaléna, Mírovský, Jiří, Poláková, Lucie, Sheller, Veronika, Zdeňková, Jana, Zikánová, Šárka, and Hajičová, Eva
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- discourse, discourse annotation, treebank, PDT, and tectogrammatics
- Language:
- Czech
- Description:
- The Prague Discourse Treebank 3.0 (PDiT 3.0) is a new version of annotation of discourse relations marked by primary and secondary discourse connectives in the data of the Prague Dependency Treebank. With respect to the previous versions, PDiT 3.0 brings a largely revised annotation of discourse relations and offers the data also in the Penn Discourse Treebank 3.0 (PDTB 3.0) format and sense taxonomy.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB