Search Results
42. Khresmoi Query Translation Test Data 2.0
- Creator:
- Pecina, Pavel, Dušek, Ondřej, Hajič, Jan, Libovický, Jindřich, and Urešová, Zdeňka
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- corpus, test data, medical, health, machine translation, Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
- Language:
- Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
- Description:
- This package contains data sets for development and testing of machine translation of medical queries between Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish. The queries come from the general public and medical experts. Version 2.0 extends the previous version by adding Hungarian, Polish, Spanish, and Swedish translations.
- Rights:
- Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB
43. Khresmoi Summary Translation Test Data 2.0
- Creator:
- Dušek, Ondřej, Hajič, Jan, Hlaváčová, Jaroslava, Libovický, Jindřich, Pecina, Pavel, Tamchyna, Aleš, and Urešová, Zdeňka
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- corpus, test data, medical, health, machine translation, Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
- Language:
- Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish
- Description:
- This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech, English, French, German, Hungarian, Polish, Spanish and Swedish. Version 2.0 extends the previous version by adding Hungarian, Polish, Spanish, and Swedish translations.
- Rights:
- Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB
44. Mají dějiny smysl?
- Creator:
- Jan Patočka
- Publisher:
- Pp. 89–131. Article. [The essay also includes the text To platí též..., see 1988/25H.]
- Type:
- Text
- Subject:
- 1975, 1979/25, 1981/6, 1981/7, 1988/25H, 1988/28, 1988/31, 1988/32, 1988/34, 1994/7, 1996/4, 1996/7, 1998/3, 1999/8, 2, 2001/9, 2002/21, 2002/7, 2006/1, 2007/1, 2008/3, bg, cs, de, en, es, fr, fulltext, hu, it, jp, lt, no, pl, ru, SS-3/PD-III, sv, uk, and v
- Language:
- Czech, English, Bulgarian, French, Italian, Lithuanian, Hungarian, German, Norwegian, Polish, Russian, Spanish, Swedish, and Ukrainian
- Rights:
- open access and Rights holder: Archiv Jana Patočky, z.s.
45. Morpho-syntactically annotated corpora provided for the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)
- Creator:
- Guillaume, Bruno, Ramisch, Carlos, Waszczuk, Jakub, Monti, Johanna, Di Buono, Maria Pia, Sangati, Federico, Speranza, Giulia, Carlino, Carola, Güngör, Tunga, Yirmibeşoğlu, Zeynep, Sak, Haşim, Saraçlar, Murat, Giouli, Voula, Foufi, Vassiliki, Ramisch, Renata, Rademaker, Alexandre, Vale, Oto, Wilkens, Rodrigo, Candito, Marie, Crabbé, Benoît, Segonne, Vincent, Liebeskind, Chaya, Stymne, Sara, Hajič, Jan, Ginter, Filip, Luotolahti, Juhani, Straka, Milan, Zeman, Daniel, Barbu Mititelu, Verginica, Cristescu, Mihaela, Vaidya, Ashwini, Bhatia, Archna, Lichte, Timm, Ehren, Rafael, Jiang, Menghan, Xu, Hongzhi, Walsh, Abigail, Irimia, Elena, and Dowling, Meghan
- Publisher:
- PARSEME
- Type:
- text and corpus
- Subject:
- morphosyntactic annotation, dependency trees, and morphological analysis
- Language:
- German, Modern Greek (1453-), Basque, French, Irish, Hebrew, Hindi, Italian, Polish, Portuguese, Romanian, Swedish, Turkish, and Chinese
- Description:
- This multilingual resource contains corpora for 14 languages, gathered on the occasion of edition 1.2 of the PARSEME Shared Task on Semi-Supervised Identification of Verbal MWEs (2020). These corpora were meant to serve as additional "raw" corpora to help discover unseen verbal MWEs. The corpora are provided in CoNLL-U format (https://universaldependencies.org/format.html). They contain morphosyntactic annotations (parts of speech, lemmas, morphological features, and syntactic dependencies). Depending on the language, the information comes from treebanks (mostly Universal Dependencies v2.x) or from automatic parsers trained on UD v2.x treebanks (e.g., UDPipe). VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). For the 1.2 shared task edition, the data covers 14 languages, for which VMWEs were annotated according to the universal guidelines. The corpora are provided in the cupt format, inspired by the CoNLL-U format. Morphological and syntactic information – not necessarily using UD tagsets – including parts of speech, lemmas, morphological features and/or syntactic dependencies is also provided. Depending on the language, the information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). This item contains training, development and test data, as well as the evaluation tools used in the PARSEME Shared Task 1.2 (2020). The annotation guidelines are available online: http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.2
- Rights:
- PARSEME Shared Task Raw Corpus Data (v. 1.2) Agreement, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.2-raw, and PUB
46. Multilingual corpus of literal occurrences of multiword expressions
- Creator:
- Savary, Agata, Cordeiro, Silvio Ricardo, Lichte, Timm, Ramisch, Carlos, Iñurrieta, Uxoa, and Giouli, Voula
- Publisher:
- PARSEME
- Type:
- text and corpus
- Subject:
- verbal multiword expressions, literal occurrence, and idiomaticity rate
- Language:
- Basque, German, Modern Greek (1453-), Polish, and Portuguese
- Description:
- The corpus contains sentences with idiomatic, literal and coincidental occurrences of verbal multiword expressions (VMWEs) in Basque, German, Greek, Polish and Portuguese. The source corpus is the PARSEME multilingual corpus of VMWEs v 1.1 (cf. http://hdl.handle.net/11372/LRT-2842). The sentences with VMWEs were extracted from the source corpus and potential co-occurrences of the same lexemes were automatically extracted from the same corpus. These candidates were then manually annotated by native experts into 6 classes, including literal and coincidental occurrences, as well as various annotation errors. The construction of the corpus is described by the following publication: Agata Savary, Silvio Ricardo Cordeiro, Timm Lichte, Carlos Ramisch, Uxoa Iñurrieta, Voula Giouli (forthcoming) "Literal occurrences of multiword expressions: Rare birds that cause a stir", to appear in Prague Bulletin of Mathematical Linguistics.
- Rights:
- License agreement for The Multilingual corpus of literal occurrences of multiword expressions, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-literal, and PUB
47. Na křižovatce umění :
- Type:
- text and festschrifts
- Subject:
- Literature of various forms and genres (about it), Závodský, Artur, literary studies, literature, film and films, theatre, media, Slavic studies, and Czech (Czechoslovak) collected volumes and collective monographs
- Language:
- Czech, French, German, Polish, Russian, and Slovak
- Rights:
- unknown
48. Nález kostí lidských v kostele sv. Petra a Pavla v Čáslavi, jež pokládány za pozůstatky Jana Žižky z Trocnova
- Publisher:
- Nákladem Archeologické komise při České akademii císaře Františka Josefa pro vědy, slovesnost a umění a Archeologického Sboru Musea království Českého
- Format:
- print, volume, and 64 pages, 6 unnumbered leaves of plates : illustrations.
- Type:
- model:monograph and TEXT
- Subject:
- Archeologie, Žižka z Trocnova, Jan, asi 1360-1424, Kostel sv. Petra a Pavla (Čáslav, Kutná Hora, Česko), archeologické výzkumy, antropologie, excavations (archaeology), anthropology, Čáslav (Kutná Hora, Česko), Čáslav (Kutná Hora, Czechia), 902.2, 39+572, (437.312), (0.036.6), 8, and 902
- Language:
- Czech, French, German, and Polish
- Description:
- Includes bibliographical references, partly Polish, German, and French text, and offprint from Památky archaeologické a místopisné
- Rights:
- http://creativecommons.org/publicdomain/mark/1.0/ and policy:public
49. Nicolai Hartmann šedesátiletý (20. 2. 1942)
- Creator:
- Jan Patočka, de: L. Hagedorn, and pl:Artur Mordka
- Publisher:
- Česká mysl 36 (1942), no. 1, pp. 43–47. Misc.
- Type:
- Text
- Subject:
- 1942, 36161, cs, de, fulltext, pl, and výročí
- Language:
- Czech, German, and Polish
- Rights:
- open access and Rights holder: Archiv Jana Patočky, z.s.
50. O regionálních dějinách :
- Type:
- text and conference proceedings
- Subject:
- History (general), conference proceedings, regional history, methodology, Czech (Czechoslovak) collected volumes and collective monographs, and regional and local history studies
- Language:
- Czech, German, and Polish
- Rights:
- unknown