Contributor: European Union (FP6-IST-5-034434-IP, Companions IP, euFunds) / Creator: Hajič, Jan and Mareček, David

1. Prague Dependency Treebank of Spoken Czech 2.0 (PDTSC 2.0)

Creator:: Mikulová, Marie, Bémová, Alevtina, Hajič, Jan, Hajičová, Eva, Ircing, Pavel, Kolářová, Veronika, Lopatková, Markéta, Mareček, David, Mírovský, Jiří, Nedoluzhko, Anna, Pajas, Petr, Panevová, Jarmila, Peterek, Nino, Romportl, Jan, Sgall, Petr, Ševčíková, Magda, Štěpánek, Jan, Urešová, Zdeňka, and Žabokrtský, Zdeněk
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: spoken corpus, speech reconstruction, speech recognition, syntax, semantics, coreference, and audio
Language:: Czech
Description:: The Prague Dependency Treebank of Spoken Czech 2.0 (PDTSC 2.0) is a corpus of spoken language, consisting of 742,316 tokens and 73,835 sentences, representing 7,324 minutes (over 120 hours) of spontaneous dialogs. The dialogs have been recorded, transcribed and edited in several interlinked layers: audio recordings, automatic and manual transcripts, and manually reconstructed text. These layers were part of the first version of the corpus (PDTSC 1.0). Version 2.0 is extended by automatic dependency parsing at the analytical layer and by manual annotation of “deep” syntax at the tectogrammatical layer, which contains semantic roles and relations as well as annotation of coreference.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

2. Prague Dependency Treebank of Spoken Language (PDTSL) 0.5

Creator:: Hajič, Jan, Pajas, Petr, Mareček, David, Mikulová, Marie, Urešová, Zdeňka, and Podveský, Petr
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: audio and corpus
Subject:: corpus and spoken language
Language:: Czech and English
Description:: The first edition of a speech corpus with a speech reconstruction layer (edited transcript). The project of speech reconstruction of Czech and English was started at UFAL together with the PIRE project in 2005, and has gradually grown from ideas to a first annotation specification, annotation software and actual annotation. It is part of the Prague Dependency Treebank family of annotated corpus resources and tools, to which it adds the spoken language layer(s). Identifiers listed with the record: LC536; MSM0021620838; IST-034344; ME838
Rights:: PDTSL, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-pdtsl, and ACA
