Test data for the WMT 2017 Automatic post-editing task (the same data used for the Sentence-level Quality Estimation task). The data consists of German-English pairs (source and target) from the pharmacological domain, already tokenized. The test set contains 2,000 pairs. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Test data for the WMT 2017 Automatic post-editing task (the same data used for the Sentence-level Quality Estimation task). The data consists of 2,000 English-German pairs (source and target) from the IT domain, already tokenized. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Test data for the WMT 2018 Automatic post-editing task. The data consists of English-German pairs (source and target) from the information technology domain, already tokenized. The test set contains 1,023 pairs. The target segments were generated by a neural machine translation system. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Test data for the WMT 2018 Automatic post-editing task. The data consists of English-German pairs (source and target) from the information technology domain, already tokenized. The test set contains 2,000 pairs. The target segments were generated by a phrase-based machine translation system. This test set is sampled from the same dataset used for the 2016 and 2017 APE shared task editions. All data is provided by the EU project QT21 (http://www.qt21.eu/).
The ACL RD-TEC 2.0 has been developed to provide a benchmark for evaluating methods for terminology extraction and classification, as well as entity recognition tasks, based on specialised text from the computational linguistics domain. This release of the corpus consists of 300 abstracts from articles in the ACL Anthology Reference Corpus, published between 1978 and 2006. In these abstracts, terms (i.e., single or multi-word lexical units with a specialised meaning) are manually annotated. In addition to their boundaries in running text, annotated terms are classified into one of seven categories: method, tool, language resource (LR), LR product, model, measures and measurements, and other. To assess the quality of the annotations and to determine the difficulty of this task, more than 171 of the abstracts are annotated twice, independently, by each of the two annotators. In total, 6,818 terms are identified and annotated, resulting in a specialised vocabulary of 3,318 lexical forms, mapped to 3,471 concepts.
Unused film material shot for Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) segment issue no. 21A from 1943 captures the mood of a training course organised by the Board of Trustees for the Education of Youth in the Prachov Camp at the Prachov Rocks in May 1943. In addition to sports events, the programme for teenagers included lectures.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 25B from 1943, shot on 30 May, shows Prime Minister Jaroslav Krejčí, Minister of Education and People's Enlightenment and Chairman of the Board Emanuel Moravec, General Secretary of the Board František Teuner and other guests of honour during their visit to a recreation camp for working youth aged 14-18, which was held in Semice nad Lužnicí near Týn nad Vltavou. Just as in nearby Protivín, young apprentices put on a collective sports performance for them.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 33B from 1943 captures life at a summer camp for working girls organised by the Board of Trustees for the Education of Youth in Poddoubí in the Semily District. Just as in the boys' camps, the main objective was to improve physical fitness. Sports activities were complemented by periods of relaxation.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 1A from 1945 was shot during the boys' swimming courses organised by the Board of Trustees for the Education of Youth and held in the Axa Palace swimming pool. The training included teaching first aid for rescuing a drowning person.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 28B from 1943 contains footage from the Youth Swimming Championship for Ages 14-18 organised by the Board of Trustees for the Education of Youth and held at a swimming pool in Prague-Barrandov on 3 and 4 July. The participants were 250 junior athletes who had qualified in district and provincial races. Trophies were presented to the winners by General Secretary of the Board František Teuner.