Training and development data for the WMT 2017 Automatic post-editing task (the same data used for the Sentence-level Quality Estimation task). They consist of German-English triplets (source, target and post-edit) belonging to the pharmacological domain and already tokenized. The training and development sets contain 25,000 and 1,000 triplets, respectively. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Training data for the WMT 2017 Automatic post-editing task (the same data used for the Sentence-level Quality Estimation task). They consist of 11,000 English-German triplets (source, target and post-edit) belonging to the IT domain and already tokenized. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Training and development data for the WMT17 QE task. Test data will be published as a separate item.
This shared task will build on its previous five editions to further examine automatic methods for estimating the quality of machine translation output at run-time, without relying on reference translations. We include word-level, phrase-level and sentence-level estimation. All tasks will make use of a large dataset produced from post-editions by professional translators. The data will be domain-specific (IT and Pharmaceutical domains) and substantially larger than in previous years. In addition to advancing the state of the art at all prediction levels, our goals include:
- To test the effectiveness of larger (domain-specific and professionally annotated) datasets. We will do so by increasing the size of one of last year's training sets.
- To study the effect of language direction and domain. We will do so by providing two datasets created in similar ways, but for different domains and language directions.
- To investigate the utility of detailed information logged during post-editing. We will do so by providing post-editing time, keystrokes, and actual edits.
This year's shared task provides new training and test datasets for all tasks, and allows participants to explore any additional data and resources deemed relevant. An in-house MT system was used to produce translations for all tasks. MT system-dependent information can be made available upon request. The data is publicly available, but since it has been provided by our industry partners, it is subject to specific terms and conditions. However, these have no practical implications for the use of this data for research purposes.
Test data for the WMT17 QE task. Training data can be downloaded from http://hdl.handle.net/11372/LRT-1974
This shared task will build on its previous five editions to further examine automatic methods for estimating the quality of machine translation output at run-time, without relying on reference translations. We include word-level, phrase-level and sentence-level estimation. All tasks will make use of a large dataset produced from post-editions by professional translators. The data will be domain-specific (IT and Pharmaceutical domains) and substantially larger than in previous years. In addition to advancing the state of the art at all prediction levels, our goals include:
- To test the effectiveness of larger (domain-specific and professionally annotated) datasets. We will do so by increasing the size of one of last year's training sets.
- To study the effect of language direction and domain. We will do so by providing two datasets created in similar ways, but for different domains and language directions.
- To investigate the utility of detailed information logged during post-editing. We will do so by providing post-editing time, keystrokes, and actual edits.
This year's shared task provides new training and test datasets for all tasks, and allows participants to explore any additional data and resources deemed relevant. An in-house MT system was used to produce translations for all tasks. MT system-dependent information can be made available upon request. The data is publicly available, but since it has been provided by our industry partners, it is subject to specific terms and conditions. However, these have no practical implications for the use of this data for research purposes.
Training and development data for the WMT 2018 Automatic post-editing task. They consist of English-German triplets (source, target and post-edit) belonging to the information technology domain and already tokenized. The training and development sets contain 13,442 and 1,000 triplets, respectively. A neural machine translation system was used to generate the target segments. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Test data for the WMT18 QE task. Training data can be downloaded from http://hdl.handle.net/11372/LRT-2619.
This shared task will build on its previous six editions to further examine automatic methods for estimating the quality of machine translation output at run-time, without relying on reference translations. We include word-level, phrase-level and sentence-level estimation. All tasks make use of datasets produced from post-editions by professional translators. The datasets are domain-specific (IT and life sciences/pharma domains) and extend those used in previous years with more instances and more languages. One important addition is that this year we also include datasets with neural MT outputs. In addition to advancing the state of the art at all prediction levels, our specific goals are:
- To study the performance of quality estimation approaches on the output of neural MT systems. We will do so by providing datasets for two language pairs where the same source segments are translated by both a statistical phrase-based and a neural MT system.
- To study the predictability of deleted words, i.e. words that are missing in the MT output. To do so, for the first time we provide data annotated for such errors at training time.
- To study the effectiveness of explicitly assigned labels for phrases. We will do so by providing a dataset where each phrase in the output of a phrase-based statistical MT system was annotated by human translators.
- To study the effect of different language pairs. We will do so by providing datasets created in similar ways for four language pairs.
- To investigate the utility of detailed information logged during post-editing. We will do so by providing post-editing time, keystrokes, and actual edits.
- To measure progress over the years at all prediction levels. We will do so by using last year's test set for comparative experiments.
In-house statistical and neural MT systems were built to produce translations for all tasks. MT system-dependent information can be made available upon request. The data is publicly available, but since it has been provided by our industry partners, it is subject to specific terms and conditions. However, these have no practical implications for the use of this data for research purposes. Participants are allowed to explore any additional data and resources deemed relevant.
Training and development data for the WMT18 QE task. Test data will be published as a separate item.
This shared task will build on its previous six editions to further examine automatic methods for estimating the quality of machine translation output at run-time, without relying on reference translations. We include word-level, phrase-level and sentence-level estimation. All tasks make use of datasets produced from post-editions by professional translators. The datasets are domain-specific (IT and life sciences/pharma domains) and extend those used in previous years with more instances and more languages. One important addition is that this year we also include datasets with neural MT outputs. In addition to advancing the state of the art at all prediction levels, our specific goals are:
- To study the performance of quality estimation approaches on the output of neural MT systems. We will do so by providing datasets for two language pairs where the same source segments are translated by both a statistical phrase-based and a neural MT system.
- To study the predictability of deleted words, i.e. words that are missing in the MT output. To do so, for the first time we provide data annotated for such errors at training time.
- To study the effectiveness of explicitly assigned labels for phrases. We will do so by providing a dataset where each phrase in the output of a phrase-based statistical MT system was annotated by human translators.
- To study the effect of different language pairs. We will do so by providing datasets created in similar ways for four language pairs.
- To investigate the utility of detailed information logged during post-editing. We will do so by providing post-editing time, keystrokes, and actual edits.
- To measure progress over the years at all prediction levels. We will do so by using last year's test set for comparative experiments.
In-house statistical and neural MT systems were built to produce translations for all tasks. MT system-dependent information can be made available upon request. The data is publicly available, but since it has been provided by our industry partners, it is subject to specific terms and conditions. However, these have no practical implications for the use of this data for research purposes. Participants are allowed to explore any additional data and resources deemed relevant.
Wnt/β-catenin signaling is involved in virtually every aspect of embryonic development and also controls homeostatic self-renewal in a number of adult tissues. Recently, emerging evidence from research on organ fibrosis suggests that sustained reactivation of the Wnt/β-catenin pathway is linked to the pathogenesis of fibrotic disorders. Here we focus on Wnt/β-catenin-related pathogenic effects in fibrosis of different organs, including the lung, liver, skin and kidney. Additionally, Wnt/β-catenin signaling works in a combinatorial manner with TGF-β signaling in the process of fibrosis, and TGF-β signaling can induce expression of Wnt/β-catenin superfamily members and vice versa. Moreover, network analysis based on pathway databases revealed that key factors in the Wnt pathway are targeted by some differentially expressed microRNAs detected in fibrotic diseases. These findings demonstrate crosstalk among the Wnt/β-catenin pathway, TGF-β signaling and microRNAs, highlighting the role of Wnts in organ fibrogenesis. Most importantly, a variety of Wnt pathway inhibitors are now available, which makes therapeutic intervention feasible; modulation of the Wnt pathway may therefore be a suitable and promising therapeutic strategy in the future. Y. Guo ... [et al.]. Includes a list of references.
Wolbachia is a maternally transmitted intracellular symbiont which causes reproductive distortions in the arthropods it infects. In recent years there has been increasing interest in using Wolbachia as a potential tool for biological control by genetic manipulation of insect pests. In the present paper we report Wolbachia infection in several Trissolcus wasps (Hymenoptera: Scelionidae) which are important egg parasitoids of the sunn pest, Eurygaster integriceps Puton (Heteroptera: Scutelleridae). We used DNA sequence data for a gene encoding a surface protein of Wolbachia (wsp) not only to confirm Wolbachia infection but also to discriminate between Wolbachia strains. Phylogenetic analyses indicated that Wolbachia strains in Trissolcus species were closely related to one another and belonged to supergroup B. Determining the infection status of various populations, the possible role of Wolbachia in causing incompatibility, and the reproductive compatibility of Trissolcus populations is important for the success of parasitoids in sunn pest management. Nurper Guz ... [et al.]. Includes a list of references.
Wolbachia pipientis (Hertig) (Rickettsiaceae) is an endocellular bacterium infecting numerous species of arthropods. The bacterium is harboured by males and females but is only transmitted maternally because spermatocytes shed their Wolbachia during maturation. The presence of this endosymbiont can lead to feminisation of the host, parthenogenesis, male-killing or a form of reproductive incompatibility called cytoplasmic incompatibility (CI). Although Wolbachia transmission is exclusively maternal, phylogenetic evidence indicates that very rare inter-species transmission events have taken place. Horizontal transmission is possible in the laboratory by transferring cytoplasm from infected to uninfected eggs. Using this technique, we have artificially infected lines of the fruit fly Drosophila simulans Sturtevant (Drosophilidae). Recipient lines came from two different D. simulans populations. One ("naive" host) is not infected in the wild. The other ("usual" host) is a population naturally carrying Wolbachia in the wild. In this second case, recipient flies used in the experiment came from a stock culture that had been cured of its infection beforehand by an antibiotic treatment. Infected D. simulans laboratory stocks were used as donors. We assessed the following three parameters: (i) trans-infection success rate (ratio of infected to total female zygotes that survived the injection), (ii) level of cytoplasmic incompatibility expressed by trans-infected males three generations post-trans-infection, and (iii) infection loss rate over time in trans-infected lines (percentage of lines having lost the infection after 20 to 40 generations). We observed that parameter (i) did not differ significantly whether the recipient line came from a "naive" or a "usual" host population. However, both (ii) and (iii) were significantly higher in the "naive" trans-infected stock, which is in agreement with earlier theoretical considerations.