The segment of Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel), 1938, issue no. 38 offers an excerpt from the radio speech delivered by President Edvard Beneš on 10 September 1938, in which he addresses the German minority in Czechoslovakia.
An extremely fast digital audio channelizer implementation, usable as a building block for experimental ASR front-ends or signal denoising applications. Thanks to its high throughput, it is also applicable in software-defined radios. It comes in the form of a C/C++ library and an executable example program that reads an input stream, splits it into equidistant frequency channels, and emits their data to the output.
Features:
(1) Hand-tuned SIMD-aware assembly for x86 (SSE) and x86-64 (AVX) as well as for ARM (NEON) processors.
(2) Generic non-SIMD C++ implementation for other architectures.
(3) Capable of taking advantage of multicore CPUs.
(4) Fully configurable number of channels and the output decimation rate.
(5) User-supplied FIR of the channel separation filter, which makes it possible to specify the width of the channels and whether they should overlap or be separated.
(6) Input and output signal samples are treated as complex numbers.
(7) Throughput of over 750 complex MS/s achieved on a Core i7 4710HQ @ 2.5 GHz when channelizing into 72 output channels with a FIR length of 1152 samples, using 3 computing threads.
(8) Runs under Linux OS.
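To illustrate what splitting a complex input stream into equidistant frequency channels means, here is a minimal sketch of a naive reference channelizer in plain Python. It is not the library's API or its optimized algorithm: the function name `channelize`, its parameters, and the moving-average prototype filter are all assumptions made for this example. Each channel is simply mixed down to DC, filtered with the supplied FIR, and decimated.

```python
import cmath
import math


def channelize(samples, num_channels, fir, decimation):
    """Naive reference channelizer (hypothetical helper, not the library API).

    For each channel: mix it down to DC, apply the FIR, keep every
    `decimation`-th output. Returns one list of complex samples per channel.
    """
    outputs = []
    for k in range(num_channels):
        # Channel k is centred at k / num_channels cycles per sample;
        # multiplying by the conjugate phasor shifts that centre to DC.
        mixed = [
            x * cmath.exp(-2j * math.pi * k * n / num_channels)
            for n, x in enumerate(samples)
        ]
        channel = []
        # Evaluate the FIR only at the output instants kept after decimation.
        for n in range(len(fir) - 1, len(mixed), decimation):
            acc = 0j
            for i, tap in enumerate(fir):
                acc += tap * mixed[n - i]
            channel.append(acc)
        outputs.append(channel)
    return outputs


# Toy input: a complex tone sitting exactly at the centre of channel 1 of 4.
M = 4
tone = [cmath.exp(2j * math.pi * 1 * n / M) for n in range(64)]

# Assumed prototype filter: a length-M moving average, whose frequency
# response has nulls at the centres of all the other channels.
fir = [1.0 / M] * M

out = channelize(tone, num_channels=M, fir=fir, decimation=M)

# Mean power per channel; the tone's energy should land in channel 1 only.
power = [sum(abs(y) ** 2 for y in ch) / len(ch) for ch in out]
```

In this toy setup the mean power is concentrated in channel 1 and is numerically zero in the other three. A production channelizer would typically avoid the per-channel mixing and filtering shown here, for example by sharing the filter work across channels, but the output it computes is conceptually the same.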
Journalist Egon Erwin Kisch in footage edited from post-war newsreel segments for his obituary in Týden ve filmu (Week in Film), 1948, issue no. 15. Kisch appears with the Mayor of Prague, Václav Vacek, at an exhibition.
The segment of Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel), 1938, issue no. 52 reports on the elections to the Slovak Land Assembly, the highest legislative body of autonomous Slovakia within the framework of the Second Czechoslovak Republic, which were held on 18 December 1938. The footage shows preparations for the distribution of ballots at the Town Hall in Bratislava.
The ELITR Minuting Corpus consists of transcripts of meetings in Czech and English, their manually created summaries ("minutes"), and manual alignments between the two.
Czech meetings are in the computer science and public administration domains and English meetings are in the computer science domain.
Each transcript has one or multiple corresponding minutes files. Alignments are only provided for a portion of the data.
This corpus contains 59 Czech and 120 English meeting transcripts, consisting of 71,097 and 87,322 dialogue turns respectively. For Czech meetings, we provide 147 minutes files in total, 55 of them aligned. For English meetings, we provide 256 minutes files in total, 111 of them aligned.
Please find a more detailed description of the data in the included README and stats.tsv files.
If you use this corpus, please cite:
Nedoluzhko, A., Singh, M., Hledíková, M., Ghosal, T., and Bojar, O.
(2022). ELITR Minuting Corpus: A novel dataset for automatic minuting
from multi-party meetings in English and Czech. In Proceedings of the
13th International Conference on Language Resources and Evaluation
(LREC-2022), Marseille, France, June. European Language Resources
Association (ELRA). In print.
@inproceedings{elitr-minuting-corpus:2022,
author = {Anna Nedoluzhko and Muskaan Singh and Marie
Hled{\'{\i}}kov{\'{a}} and Tirthankar Ghosal and Ond{\v{r}}ej Bojar},
title = {{ELITR} {M}inuting {C}orpus: {A} Novel Dataset for
Automatic Minuting from Multi-Party Meetings in {E}nglish and {C}zech},
booktitle = {Proceedings of the 13th International Conference
on Language Resources and Evaluation (LREC-2022)},
year = 2022,
month = {June},
address = {Marseille, France},
publisher = {European Language Resources Association (ELRA)},
note = {In print.}
}