The THEaiTRobot 2.0 tool allows the user to interactively generate scripts for individual theatre play scenes.
The previous version of the tool (http://hdl.handle.net/11234/1-3507) was based on the GPT-2 XL generative language model, used without any fine-tuning: we found that, given a prompt formatted as part of a theatre play script, the model usually generates a continuation that retains the format.
The current version also uses vanilla GPT-2 by default, but can instead use a GPT-2 medium model fine-tuned on theatre play scripts (as well as film and TV series scripts). Apart from the basic "flat" generation using a theatrical starting prompt and the script model, the tool also features a second, hierarchical variant. In its first stage, a play synopsis is generated from the play's title using a synopsis model (GPT-2 medium fine-tuned on synopses of theatre plays, as well as film, TV series and book synopses). The synopsis is then used as input for the second stage, which uses the script model.
The models to use are selected by setting the MODEL variable in start_server.sh and start_syn_server.sh.
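The difference between the flat and the hierarchical variant can be illustrated with a minimal sketch. It is not the tool's actual code: the function names, the `Generator` type, and the toy stand-in models are all hypothetical, and real usage would plug in calls to the script and synopsis models instead.

```python
from typing import Callable, Dict

# Hypothetical type: any function that maps a text prompt to generated text.
Generator = Callable[[str], str]


def generate_flat(prompt: str, script_model: Generator) -> str:
    """Flat variant: a theatrical starting prompt goes straight to the script model."""
    return script_model(prompt)


def generate_hierarchical(
    title: str, synopsis_model: Generator, script_model: Generator
) -> Dict[str, str]:
    """Hierarchical variant: title -> synopsis -> script."""
    # Stage 1: the synopsis model expands the play title into a synopsis.
    synopsis = synopsis_model(title)
    # Stage 2: the synopsis is used as the input for the script model.
    script = script_model(synopsis)
    return {"title": title, "synopsis": synopsis, "script": script}


# Toy stand-ins (assumptions) so the sketch runs without any language model.
def toy_synopsis_model(title: str) -> str:
    return f"Synopsis of '{title}': a robot learns to write plays."


def toy_script_model(prompt: str) -> str:
    return f"{prompt}\n\nROBOT: Where do I begin?"


if __name__ == "__main__":
    result = generate_hierarchical("Permeation", toy_synopsis_model, toy_script_model)
    print(result["script"])
```

Passing the models in as plain callables keeps the two stages independent, so either one could be swapped (for example vanilla GPT-2 versus the fine-tuned script model) without changing the pipeline.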
THEaiTRobot 2.0 was used to generate the second THEaiTRE play, "Permeation/Prostoupení".
Actor Theodor Pištěk with his colleagues Alfréd Schleisnger and Marie (Máňa) Ženíšková in Takový je život (Such Is Life, dir. Carl Junghans, 1929). Theodor Pištěk in Cikáni (Gypsies, dir. Karel Anton, 1921). Pištěk putting on make-up. Pištěk with his portrait carved in glass in a segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1942, issue no. 28. Pištěk with his daughter-in-law Věra Filipová Pištěková on Bohumil Veselý's balcony.
AMALACH project component TMODS:ENG-CZE; machine translation of queries from Czech to English. This archive contains models for the Moses decoder (binarized, pruned to allow for real-time translation) and configuration files for the MTMonkey toolkit. The aim of this package is to provide a full service for Czech->English translation which can be easily utilized as a component in a larger software solution. (The required tools are freely available and an installation guide is included in the package.)
The translation models were trained on the CzEng 1.0 corpus and Europarl. Monolingual data for LM estimation additionally contains WMT news crawls until 2013.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 23A from 1943 was shot on 30 May during the official opening of a training camp organised by the Board of Trustees for the Education of Youth at Protivín Chateau. The ceremony was held to mark the first anniversary of the Board. The importance of the event was highlighted by the presence of Prime Minister Jaroslav Krejčí. Minister of Agriculture and Forestry Adolf Hrubý and Minister of Education and People's Enlightenment and Chairman of the Board Emanuel Moravec spoke to the participants in the chateau courtyard. In the afternoon, the course participants put on a collective sports performance in the park adjoining the chateau.
Segment from Český zvukový týdeník Aktualita (Czech Aktualita Sound Newsreel) issue no. 39A from 1944 was shot during a training course for the regional leaders of the Board of Trustees for the Education of Youth, which was held at the Čeperka Guest House near Unhošť in connection with changes in the conditions of mandatory youth service resulting from the declaration of forced labour (Totaleinsatz) in August 1944. In addition to lectures, the programme included sports activities to improve the physical fitness of the participants.
En-De translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/).
Models are compatible with Tensor2tensor version 1.6.6.
For details about the model training (data, model hyper-parameters), please contact the archive maintainer.
Evaluation on newstest2020 (BLEU):
en->de: 25.9
de->en: 33.4
(Evaluated using multeval: https://github.com/jhclark/multeval)
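For readers unfamiliar with the metric, the following is a minimal sketch of corpus-level BLEU. It is not multeval's implementation: it assumes a single reference per sentence, whitespace tokenization, and no smoothing, so it will not reproduce the scores reported above. The function names are illustrative.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: List[str], references: List[str], max_n: int = 4) -> float:
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis."""
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on a mat"]))
```

Scores from different BLEU implementations are generally not comparable, because tokenization and smoothing choices differ between tools.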
En-Ru translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/).
Models are compatible with Tensor2tensor version 1.6.6.
For details about the model training (data, model hyper-parameters), please contact the archive maintainer.
Evaluation on newstest2020 (BLEU):
en->ru: 18.0
ru->en: 30.4
(Evaluated using multeval: https://github.com/jhclark/multeval)
Tree Editor
TrEd is a fully customizable and programmable graphical editor and viewer for tree-like structures. Among other projects, it was used as the main annotation tool for syntactic and tectogrammatical annotations in The Prague Dependency Treebank, as well as for decision-tree-based morphological annotation of The Prague Arabic Dependency Treebank.