Search Results
842. Gynaecologist Jerie at the Hospital in Podolí
- Creator:
- Veselý, Bohumil
- Publisher:
- Národní filmový archiv
- Type:
- video and clip
- Subject:
- gynekologie, sestry zdravotní, přístroj anesteziologický, lékař gynekolog, zdravotní sestry výuka, plyn rajský, lůžko gynekologické, anesteziologie, Galerie osobností, Places::Praha::Podolí::porodnice, People::Jerie Jan (1897-1964), and Zdravotní a sociální péče
- Language:
- No linguistic content
- Description:
- The segment shows gynaecologist Jan Jerie and nurses standing by a new anaesthesiologic device at the Prague Maternity Hospital in Podolí.
- Rights:
- Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), http://creativecommons.org/licenses/by-nc-nd/4.0/, and PUB
843. HaCzech: Dataset of Handwritten Czech
- Creator:
- Procházka, Štěpán and Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- image and corpus
- Subject:
- htr, ocr, manuscripts, chronicles, and handwriting
- Language:
- Czech
- Description:
- A dataset of handwritten Czech text lines sourced from two chronicles (municipal chronicles 1931-1944, school chronicles 1913-1933). It comprises 25k lines machine-extracted from scanned pages and provides manual annotation of the text contents for a subset of 2k lines.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
844. HamleDT 2.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, Stanford dependencies, Prague dependencies, harmonization, common annotation style, and Interset
- Language:
- Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, Ancient Greek (to 1453), Hindi, Hungarian, Italian, Japanese, Latin, Dutch, Portuguese, Romanian, Russian, Slovak, Slovenian, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT 2.0 is a collection of 30 existing treebanks harmonized into a common annotation style, the Prague Dependencies, and further transformed into Stanford Dependencies, a treebank annotation style that became popular recently. We use the newest basic Universal Stanford Dependencies, without added language-specific subtypes.
- Rights:
- HamleDT 2.0 Licence Agreement, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-2.0, and ACA
845. HamleDT 3.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University
- Type:
- text and corpus
- Subject:
- annotated corpus, morphology, syntax, dependency, treebank, harmonized annotation, and common annotation style
- Language:
- Arabic, Basque, Bengali, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Ancient Greek (to 1453), Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT (HArmonized Multi-LanguagE Dependency Treebank) is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. This version uses Universal Dependencies as the common annotation style. Update (November 2017): for a current collection of harmonized dependency treebanks, we recommend using the Universal Dependencies (UD). All of the corpora that are distributed in HamleDT in full are also part of the UD project; only some corpora from the Patch group (where HamleDT provides only the harmonizing scripts but not the full corpus data) are available in HamleDT but not in UD.
- Rights:
- HamleDT 3.0 License Terms, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-3.0, and PUB
846. Hana Cavallarová (opera singer)
- Creator:
- Veselý, Bohumil
- Publisher:
- Národní filmový archiv
- Type:
- video and clip
- Subject:
- zahrada vily, konev, lavička, Galerie osobností, Places::Řevnice::zahrada vily Johanny Weissové-Cavalarové, and People::Weissová-Cavalarová Johanna (1863-1946)
- Language:
- No linguistic content
- Description:
- Opera singer Hana Cavallarová with friends in the garden of a villa in Řevnice.
- Rights:
- http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
847. Hana Vítová (actress)
- Creator:
- Aktualita and Veselý, Bohumil
- Publisher:
- Národní filmový archiv
- Type:
- video and clip
- Subject:
- film Valentin Dobrotivý ukázka, Galerie osobností, People::Vítová Hana (1914-1987), People::Nový Oldřich (1899-1983), and Český zvukový týdeník Aktualita::1942/49
- Language:
- German and Czech
- Description:
- Actress Hana Vítová in an unidentified German film (sound). Vítová with actor Oldřich Nový in Valentin Dobrotivý (Valentin the Good, dir. Martin Frič, 1942). Vítová with her husband, critic Bedřich Rádl, in a segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1942, issue no. 49.
- Rights:
- http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
848. Hanuš Folkman (architect, sculptor)
- Creator:
- Veselý, Bohumil
- Publisher:
- Národní filmový archiv
- Type:
- video and clip
- Subject:
- Galerie osobností, Places::Praha::Nové Město::Školská::pavlač domu, and People::Folkman Hanuš (1876-1936)
- Language:
- No linguistic content
- Description:
- Architect and sculptor Hanuš Folkman in footage from a street and a municipal park.
- Rights:
- http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
849. Hanuš Thein (opera singer)
- Creator:
- Veselý, Bohumil
- Publisher:
- Národní filmový archiv
- Type:
- video and clip
- Subject:
- Galerie osobností, Places::Praha::Nové Město::Školská::pavlač domu, and People::Thein Hanuš (1904-1974)
- Language:
- No linguistic content
- Description:
- Opera singer Hanuš Thein on Bohumil Veselý's balcony.
- Rights:
- http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
850. Hausa Visual Genome 1.0
- Creator:
- Abdulmumin, Idris, Dash, Satya Ranjan, Dawud, Musa Abdullahi, Parida, Shantipriya, Muhammad, Shamsuddeen Hassan, Ahmad, Ibrahim Sa'id, Panda, Subhadarshi, Bojar, Ondřej, Galadanci, Bashir Shehu, and Bello, Bello Shehu
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- image and corpus
- Subject:
- multi-modal, machine translation, image captioning, image annotation, and neural machine translation
- Language:
- Hausa and English
- Description:
- Data
  ----
  Hausa Visual Genome 1.0, a multimodal dataset consisting of text and images suitable for English-to-Hausa multimodal machine translation tasks and multimodal research. We follow the same selection of short English segments (captions) and the associated images from Visual Genome as the dataset Hindi Visual Genome 1.1 has. We automatically translated the English captions to Hausa and manually post-edited, taking the associated images into account.

  The training set contains 29K segments. Further 1K and 1.6K segments are provided in development and test sets, respectively, which follow the same (random) sampling from the original Hindi Visual Genome.

  Additionally, a challenge test set of 1400 segments is available for the multi-modal task. This challenge test set was created in Hindi Visual Genome by searching for (particularly) ambiguous English words based on the embedding similarity and manually selecting those where the image helps to resolve the ambiguity.

  Dataset Formats
  ---------------
  The multimodal dataset contains both text and images. The text parts of the dataset (train and test sets) are in simple tab-delimited plain text files. All the text files have seven columns as follows:

  Column1 - image_id
  Column2 - X
  Column3 - Y
  Column4 - Width
  Column5 - Height
  Column6 - English Text
  Column7 - Hausa Text

  The image part contains the full images with the corresponding image_id as the file name. The X, Y, Width, and Height columns indicate the rectangular region in the image described by the caption.

  Data Statistics
  ---------------
  The statistics of the current release are given below.

  Parallel Corpus Statistics
  --------------------------
  Dataset         Segments  English Words  Hausa Words
  --------------  --------  -------------  -----------
  Train              28930         143106       140981
  Dev                  998           4922         4857
  Test                1595           7853         7736
  Challenge Test      1400           8186         8752
  --------------  --------  -------------  -----------
  Total              32923         164067       162326

  The word counts are approximate, prior to tokenization.

  Citation
  --------
  If you use this corpus, please cite the following paper:

  @InProceedings{abdulmumin-EtAl:2022:LREC,
    author    = {Abdulmumin, Idris and Dash, Satya Ranjan and Dawud, Musa Abdullahi and Parida, Shantipriya and Muhammad, Shamsuddeen and Ahmad, Ibrahim Sa'id and Panda, Subhadarshi and Bojar, Ond{\v{r}}ej and Galadanci, Bashir Shehu and Bello, Bello Shehu},
    title     = "{Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation}",
    booktitle = {Proceedings of the Language Resources and Evaluation Conference},
    month     = {June},
    year      = {2022},
    address   = {Marseille, France},
    publisher = {European Language Resources Association},
    pages     = {6471--6479},
    url       = {https://aclanthology.org/2022.lrec-1.694}
  }
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
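The seven-column, tab-delimited text format described in the Hausa Visual Genome 1.0 record can be read with a few lines of Python. This is only a minimal sketch under stated assumptions: the `Segment` class and `read_segments` function are hypothetical names introduced here, the files are assumed to be UTF-8 with no header row and integer region coordinates, and the sample row is invented for illustration rather than taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator, TextIO


@dataclass(frozen=True)
class Segment:
    """One caption row (hypothetical helper type, not part of the dataset)."""
    image_id: str
    x: int
    y: int
    width: int
    height: int
    english: str
    hausa: str


def read_segments(handle: TextIO) -> Iterator[Segment]:
    """Yield one Segment per non-empty tab-delimited line.

    Column order follows the record's description: image_id, X, Y,
    Width, Height, English Text, Hausa Text.
    """
    # QUOTE_NONE keeps quotation marks inside captions as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_no, row in enumerate(reader, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 7:
            raise ValueError(f"line {line_no}: expected 7 columns, got {len(row)}")
        image_id, x, y, width, height, english, hausa = row
        # Assumption: the region columns hold plain integers.
        yield Segment(image_id, int(x), int(y), int(width), int(height),
                      english, hausa)


# Invented sample row with placeholder captions (not real dataset content).
sample = io.StringIO("42\t10\t20\t30\t40\tenglish caption\thausa caption\n")
for segment in read_segments(sample):
    print(segment.image_id, segment.width, segment.height, segment.english)
```

For a real file, the same function could be called on `open(path, encoding="utf-8", newline="")`, where `path` points at one of the downloaded text files; the image for each row would then be looked up by its `image_id`.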