Harvested from: LINDAT/CLARIAH-CZ repository / Language: Czech / Type: audio

Start Over Language Czech Type audio Harvested from LINDAT/CLARIAH-CZ repository

21. Spoken corpus of Karel Makoň (2020-11-16)

Creator:: Krůza, Jan Oldřich
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: audio and corpus
Subject:: single speaker, christianity, mystic, and speech corpus
Language:: Czech
Description:: Talks of Karel Makoň given to his friends in the course of late sixties through early nineties of the 20th century. The topic is mostly christian mysticism.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

22. STAZKA – Speech recordings from vehicles

Creator:: Šmídl, Luboš, Stanislav, Petr, and Radová, Vlasta
Publisher:: University of West Bohemia, Department of Cybernetics
Type:: audio and corpus
Subject:: speech corpus, noisy speech, voice activity detector, and speech recognition
Language:: Czech
Description:: The database actually contains two sets of recordings, both recorded in the moving or stationary vehicles (passenger cars or trucks). All data were recorded within the project “Intelligent Electronic Record of the Operation and Vehicle Performance” whose aim is to develop a voice-operated software for registering the vehicle operation data. The first part (full_noises.zip) consists of relatively long recordings from the vehicle cabin, containing spontaneous speech from the vehicle crew. The recordings are accompanied with detailed transcripts in the Transcriber XML-based format (.trs). Due to the recording settings, the audio contains many different noises, only sparsely interspersed with speech. As such, the set is suitable for robust estimation of the voice activity detector parameters. The second set (prompts.zip) consists of short prompts that were recorded in the controlled setting – the speakers either answered simple questions or they repeated commands and short phrases. The prompts were recorded by 26 different speakers. Each speaker recorded at least two sessions (with identical set of prompts) – first in stationary vehicle, with low level of noise (those recordings are marked by –A_ in the file name) and second while actually driving the car (marked by –B_ or, since several speakers recorded 3 sessions, by –C_). The recordings from this set are suitable mostly for training of the robust domain-specific speech recognizer and also ASR test purposes.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

23. Vystadial 2013 – Czech data

Creator:: Korvas, Matěj, Plátek, Ondřej, Dušek, Ondřej, Žilka, Lukáš, and Jurčíček, Filip
Publisher:: Charles University, Faculty of Mathematics and Physics
Type:: audio and corpus
Subject:: acoustic data, speech corpus, spoken corpus, orthographic transcriptions, telephone speech, voip, and dialogue system
Language:: Czech
Description:: Vystadial 2013 is a dataset of telephone conversations in English and Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems. It ships in three parts: Czech data, English data, and scripts. The data comprise over 41 hours of speech in English and over 15 hours in Czech, plus orthographic transcriptions. The scripts implement data pre-processing and building acoustic models using the HTK and Kaldi toolkits. This is the Czech data part of the dataset. and This research was funded by the Ministry of Education, Youth and Sports of the Czech Republic under the grant agreement LK11221.
Rights:: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0), http://creativecommons.org/licenses/by-sa/3.0/, and PUB

24. Vystadial 2016 – Czech data

Creator:: Plátek, Ondřej, Dušek, Ondřej, and Jurčíček, Filip
Publisher:: Charles University, Faculty of Mathematics and Physics
Type:: audio and corpus
Subject:: acoustic data, speech corpus, spoken corpus, telephone speech, voip, and dialogue system
Language:: Czech
Description:: This is the Czech data collected during the `VYSTADIAL` project. It is an extension of the 'Vystadial 2013' Czech part data release. The dataset comprises of telephone conversations in Czech, developed for training acoustic models for automatic speech recognition in spoken dialogue systems.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

21. Spoken corpus of Karel Makoň (2020-11-16)

22. STAZKA – Speech recordings from vehicles

23. Vystadial 2013 – Czech data

24. Vystadial 2016 – Czech data

Limit your search

Show values starting with

Show values starting with

Show values starting with

Show values starting with

Search

Search Constraints

Search Results

Limit your search

Contributor

Show values starting with

Creator

Show values starting with

Language

Publisher

Rights

Show values starting with

Subject

Show values starting with

Type

Date

Original context has metadata only

Harvested from