Training data for the WMT 2017 Automatic post-editing task (the same used for the Sentence-level Quality Estimation task). They consist in 11,000 English-German triplets (source, target and post-edit) belonging to the IT domain and already tokenized. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Training and development data for the WMT 2018 Automatic post-editing task. They consist in English-German triplets (source, target and post-edit) belonging to the information technology domain and already tokenized. Training and development respectively contain 13,442 and 1,000 triplets. A neural machine translation system has been used to generate the target segments. All data is provided by the EU project QT21 (http://www.qt21.eu/).