Corpus contains recordings of communication between air traffic controllers and pilots. The speech is manually transcribed and labeled with the information about the speaker (pilot/controller, not the full identity of the person). The corpus is currently small (20 hours) but we plan to search for additional data next year. The audio data format is: 8kHz, 16bit PCM, mono. and Technology Agency of the Czech Republic, project No. TA01030476.
The database actually contains two sets of recordings, both recorded in the moving or stationary vehicles (passenger cars or trucks). All data were recorded within the project “Intelligent Electronic Record of the Operation and Vehicle Performance” whose aim is to develop a voice-operated software for registering the vehicle operation data.
The first part (full_noises.zip) consists of relatively long recordings from the vehicle cabin, containing spontaneous speech from the vehicle crew. The recordings are accompanied with detailed transcripts in the Transcriber XML-based format (.trs). Due to the recording settings, the audio contains many different noises, only sparsely interspersed with speech. As such, the set is suitable for robust estimation of the voice activity detector parameters.
The second set (prompts.zip) consists of short prompts that were recorded in the controlled setting – the speakers either answered simple questions or they repeated commands and short phrases. The prompts were recorded by 26 different speakers. Each speaker recorded at least two sessions (with identical set of prompts) – first in stationary vehicle, with low level of noise (those recordings are marked by –A_ in the file name) and second while actually driving the car (marked by –B_ or, since several speakers recorded 3 sessions, by –C_). The recordings from this set are suitable mostly for training of the robust domain-specific speech recognizer and also ASR test purposes.