Creator: Komrsková, Zuzana / Data provider: Academy of Sciences Library/Knihovna Akademie věd ČR / Harvested from: CDK


1. Korpus ORAL: sestavení, lemmatizace a morfologické značkování (The ORAL corpus: compilation, lemmatization and morphological tagging)

Creator:: Kopřivová, Marie, Komrsková, Zuzana, Lukeš, David, and Poukarová, Petra
Format:: unmediated and volume
Type:: model:article and TEXT
Subject:: spoken Czech, spoken language corpora, lemmatization, tagging, morphological analysis, mluvená čeština, korpusy mluveného jazyka, lemmatizace, tagování, and morfologická analýza
Language:: Czech
Description:: The goal of this paper is to provide an overview of the structure and contents of the soon-to-be available ORAL corpus, which combines previously published corpora (ORAL2006, ORAL2008 and ORAL2013) with newly transcribed material into a single conveniently accessible and more richly annotated resource, about 6 million running words in length. The recordings and corresponding transcripts span a decade between 2002 and 2011; most of them capture interactions of mutually well-acquainted speakers, in informal situations and natural settings. The corpus is complemented by a marginal portion of more formal data, mostly public talks. It is tagged and lemmatized, and an effort was made to adapt existing tools (targeted at written language) to yield better results on spoken data. We hope the availability of such a resource will spawn further discussions on the morphological and syntactic analysis of spoken language, perhaps resulting in more radical departures in the future from the part-of-speech classification inherited from the linguistic analysis of written language.
Rights:: http://creativecommons.org/publicdomain/mark/1.0/ and policy:public

2. Reprodukce řeči/myšlení v mluvených projevech jako předmět korpusového výzkumu (Reported speech/thought in spoken discourse as a subject of corpus research)

Creator:: Hoffmannová, Jana, Komrsková, Zuzana, and Poukarová, Petra
Format:: unmediated and volume
Type:: model:article and TEXT
Subject:: reported speech, reported thought, introductory construction, direct/indirect/free indirect speech/thought, reproduction of one’s own speech, reproduction of the speech of others, reprodukce řeči, reprodukce myšlení, rámcový segment, přímá/nepřímá/polopřímá řeč/myšlení, reprodukce řeči vlastní, and reprodukce řeči cizí
Language:: Czech
Description:: The article explores reported speech/thought in spoken Czech, especially reproductions introduced with various forms of říct/říkat (to say), with data provided by the Czech National Corpus. Most reproductions were introduced by the imperfective verb říkat (past and present tenses, first and third persons). By contrast, reproductions of thought were much less numerous and almost invariably involved the first person. We found twice as many examples of direct speech as of indirect speech, along with interesting transitional forms, some of which can be described as free indirect speech. Pauses separating introductory constructions from reproductions appear to be more typical of direct than of indirect speech, but are generally infrequent, suggesting a lower degree of segmentation of spoken language. Sometimes, reproductions of the speech of others were signalled with reduced introductory constructions, with verba dicendi substituted by signals other than verbs, whereas reproductions of one’s own speech were normally introduced with a verbum dicendi.
Rights:: http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
