This small dataset contains three speech corpora collected using the Alex Translate telephone service (https://ufal.mff.cuni.cz/alex#alex-translate).
The "part1" and "part2" corpora contain English speech with transcriptions and Czech translations. These recordings were collected from users of the service. Part 1 contains earlier recordings, filtered to include only clean speech; Part 2 contains later recordings with no filtering applied.
The "cstest" corpus contains recordings of artificially created sentences, each containing one or more Czech names of places in the Czech Republic. These were recorded by a multinational group of students studying in Prague.
Talks given by Karel Makoň to his friends from the late 1960s through the early 1990s. The topic is mostly Christian mysticism.
This is the Czech data collected during the `VYSTADIAL` project. It is an extension of the Czech part of the 'Vystadial 2013' data release. The dataset comprises telephone conversations in Czech and was developed for training acoustic models for automatic speech recognition in spoken dialogue systems.