Search Results
Creator:
Dušek, Ondřej , Hajič, Jan , Hlaváčová, Jaroslava , Libovický, Jindřich , Pecina, Pavel , Tamchyna, Aleš , and Urešová, Zdeňka
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
corpus , test data , medical , health , machine translation , Czech , English , French , German , Hungarian , Polish , Spanish , and Swedish
Language:
Czech , English , French , German , Hungarian , Polish , Spanish , and Swedish
Description:
This package contains data sets for development (Section dev) and testing (Section test) of machine translation of sentences from summaries of medical articles between Czech, English, French, German, Hungarian, Polish, Spanish, and Swedish. Version 2.0 extends the previous version by adding Hungarian, Polish, Spanish, and Swedish translations.
Rights:
Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) , http://creativecommons.org/licenses/by-nc/4.0/ , and PUB
Creator:
Zeman, Daniel
Publisher:
Charles University, Faculty of Mathematics and Physics
Type:
tool and toolService
Subject:
morphology , part of speech , conversion , and tagset
Language:
Arabic , Bulgarian , Bengali , Catalan , Czech , Danish , German , Modern Greek (1453-) , English , Spanish , Estonian , Basque , Persian , Finnish , Ancient Greek (to 1453) , Hebrew , Hindi , Croatian , Japanese , Multiple languages , and Portuguese
Description:
Lingua::Interset is a universal morphosyntactic feature set to which all tagsets of all corpora/languages can be mapped. Version 2.026 covers 37 different tagsets of 21 languages. Older drivers for other languages, which are not included in this package but can be downloaded elsewhere, have limited support; they will be fully ported to Interset 2 in the future.
Interset is implemented as Perl libraries. It is also available via CPAN.
Rights:
Artistic License (Perl) 1.0 , http://opensource.org/licenses/Artistic-Perl-1.0 , and PUB
Creator:
Jan Patočka
Publisher:
Pp. 89–131. Essay. [The essay also includes the text To platí též..., see 1988/25H.]
Type:
Text
Subject:
1975 , 1979/25 , 1981/6 , 1981/7 , 1988/25H , 1988/28 , 1988/31 , 1988/32 , 1988/34 , 1994/7 , 1996/4 , 1996/7 , 1998/3 , 1999/8 , 2 , 2001/9 , 2002/21 , 2002/7 , 2006/1 , 2007/1 , 2008/3 , bg , cs , de , en , es , fr , fulltext , hu , it , jp , lt , no , pl , ru , SS-3/PD-III , sv , uk , and v
Language:
Czech , English , Bulgarian , French , Italian , Lithuanian , Hungarian , German , Norwegian , Polish , Russian , Spanish , Swedish , and Ukrainian
Rights:
open access and Rights holder: Archiv Jana Patočky, z.s.
Creator:
Jan Patočka
Publisher:
Pp. 251–305. Monograph/Outline. 2nd printing in: Péče o duši I (SS-1/PD-I), Prague 1996, pp. 243–302 (see 1996/2).
Type:
Text
Subject:
1987 , 1990/6 , 1996/2 , 2007/7 , cs , es , fr , fulltext , and SS-1/PD-I
Language:
French , Spanish , and Czech
Rights:
open access and Rights holder: Archiv Jana Patočky, z.s.
Creator:
Straková, Jana and Straka, Milan
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text , mlmodel , and languageDescription
Subject:
named entity recognition
Language:
English , German , Dutch , Spanish , and Czech
Description:
NER models for NameTag 2, a named entity recognition tool, for English, German, Dutch, Spanish and Czech. Model documentation, including performance figures, is available at https://ufal.mff.cuni.cz/nametag/2/models . NameTag 2 itself is available at https://ufal.mff.cuni.cz/nametag/2 .
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) , http://creativecommons.org/licenses/by-nc-sa/4.0/ , and PUB
Creator:
Straková, Jana and Straka, Milan
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text , mlmodel , and languageDescription
Subject:
named entity recognition and NER
Language:
English , German , Dutch , Spanish , and Czech
Description:
NER models for NameTag 2, a named entity recognition tool, for English, German, Dutch, Spanish and Czech. Model documentation, including performance figures, is available at https://ufal.mff.cuni.cz/nametag/2/models . NameTag 2 itself is available at https://ufal.mff.cuni.cz/nametag/2 .
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) , http://creativecommons.org/licenses/by-nc-sa/4.0/ , and PUB
Creator:
Jan Patočka
Publisher:
Translated from the German manuscript by I. Chvatík and I. Santar, Prague (samizdat) 1979, 15 pp. Lecture. [Translated from the author's typescript (first version) of a lecture delivered at the 15th World Congress of Philosophy in Varna, 1973. The lecture was not printed in the congress proceedings.]
Type:
Text
Subject:
1979 , 1979/14 , 1979/30 , 1988/12 , 1989/16 , 1990/6 , 1991/3 , 2002/1 , 2004/10 , 2007/17 , cs , en , es , fr , fulltext , jp , SS-3/PD-3 , and SS-3/PD-III
Language:
Czech , English , French , and Spanish
Description:
1st version
Rights:
open access and Rights holder: Archiv Jana Patočky, z.s.
Creator:
Jan Patočka
Publisher:
Translated from the German manuscript by I. Chvatík and I. Santar, Prague (samizdat) 1979, 15 pp. Lecture. [Translated from the author's typescript (first version) of a lecture delivered at the 15th World Congress of Philosophy in Varna, 1973. The lecture was not printed in the congress proceedings.] 2nd printing in: Svědectví 16 (Paris 1980), no. 62, pp. 262–272. 3rd printing in: Péče o duši III (SS-3/PD-III), Prague 2002, pp. 147–160 (see 2002/1). Cf. the second, expanded version, Die Gefahren der Technisierung in der Wissenschaft bei Edmund Husserl und das Wesen der Technik als Gefahr bei Martin Heidegger, see 1991/3. See also 1979/14, 1979/30 and 1988/12.
Type:
Text
Subject:
1979 , 1979/14 , 1979/30 , 1988/12 , 1989/16 , 1990/6 , 1991/3 , 2002/1 , 2004/10 , 2007/17 , cs , en , es , fr , fulltext , jp , samizdat , and SS-3/PD-III
Language:
English , French , Spanish , and Czech
Description:
2nd version
Rights:
open access and Rights holder: Archiv Jana Patočky, z.s.
Creator:
Jan Patočka
Publisher:
Pp. 9–44. Essay. [Written 1951–1953, perhaps originally intended by the author as his contribution to the Festschrift for the 70th birthday of F. Novotný (1951); because the work grew and the author finished the text in time, it circulated as a separate typescript copy.] 2nd printing in: Proměny 24 (New York 1987), no. 1, pp. 108–135. 3rd printing in: Negativní platonismus, 1st book edition, Prague 1990, pp. 9–58 (see 1990/2). 4th printing (2nd book edition, corrected) in: Péče o duši I (SS-1/PD-I), Prague 1996, pp. 303–336 (see 1996/2). 5th printing: Negativní platonismus, 3rd book edition, corrected, Prague 2007, 71 pp. (see 2007/9). Partial printing of an excerpt from the beginning of Chapter V (see 4th printing, pp. 327–330) under the title IDEA a CHÓRISMOS, in: Idea, hypotéza a otázka, ed. P. Rezek, Prague (OIKOYMENH) 1991, pp. 51–53, Edice PomFil, vol. 1.
Type:
Text
Subject:
1987 , 1988/28 , 1989/16 , 1990/2 , 1990/6 , 1996/2 , 1996/7 , 1996/8 , 2001/9 , 2007/7 , 2007/9 , ca , cs , de , en , es , fr , fulltext , hu , ru , SS-1/PD-I , stať , and uk
Language:
English , French , Catalan , Hungarian , German , Spanish , Russian , Ukrainian , and Czech
Rights:
open access and Rights holder: Archiv Jana Patočky, z.s.
Creator:
Koehn, Philipp , Heafield, Kenneth , Forcada, Mikel L. , Esplà-Gomis, Miquel , Ortiz-Rojas, Sergio , Sánchez, Gema Ramírez , Cartagena, Víctor M. Sánchez , Haddow, Barry , Bañón, Marta , Střelec, Marek , Samiotou, Anna , and Kamran, Amir
Publisher:
ParaCrawl
Type:
text and corpus
Subject:
ParaCrawl , parallel corpus , CommonCrawl , machine translation , and text corpora
Language:
English , German , French , Spanish , Italian , Portuguese , Dutch , Polish , Czech , Romanian , Finnish , Latvian , Russian , and Estonian
Description:
The January 2018 release of ParaCrawl is the first version of the corpus. It contains parallel corpora for 11 languages paired with English, crawled from a large number of websites. The selection of websites is based on CommonCrawl, but ParaCrawl is extracted from a brand new crawl that has much higher coverage of these selected websites than CommonCrawl. Since the data is fairly raw, it is released with two quality metrics that can be used for corpus filtering. An official "clean" version of each corpus uses one of the metrics. For more details and raw data download, please visit: http://paracrawl.eu/releases.html
Rights:
Public Domain Dedication (CC Zero) , http://creativecommons.org/publicdomain/zero/1.0/ , and PUB
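Note on the ParaCrawl record above: its description says the raw data ships with quality metrics that can be used for corpus filtering. A minimal sketch of such threshold filtering is shown here. The file layout is an assumption and is not taken from the ParaCrawl release: the sketch supposes tab-separated lines of the form source, target, score. The function name `filter_pairs`, the `score_column` parameter, and the threshold value are likewise hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def filter_pairs(
    lines: Iterable[str],
    threshold: float,
    score_column: int = 2,  # assumed position of the quality score
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) pairs whose score is at least `threshold`.

    Assumes tab-separated lines: source, target, score (hypothetical layout).
    Lines with too few fields or a non-numeric score are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= score_column or len(fields) < 2:
            continue  # malformed line: not enough columns
        try:
            score = float(fields[score_column])
        except ValueError:
            continue  # score field is not a number
        if score >= threshold:
            yield fields[0], fields[1]


# Usage with a few made-up lines (not real corpus data).
sample = [
    "Hello\tHallo\t0.91\n",
    "Buy now\tJetzt\t0.12\n",
    "broken line\n",
    "A\tB\tnot-a-number\n",
]
kept = list(filter_pairs(sample, threshold=0.5))
```

A higher threshold trades corpus size for cleaner pairs; the right value depends on the metric and would need to be chosen empirically.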