
 
dc.contributor.author Hajič, Jan
dc.contributor.author Mareček, David
dc.contributor.author Fučíková, Eva
dc.contributor.author Cinková, Silvie
dc.contributor.author Štěpánek, Jan
dc.contributor.author Mikulová, Marie
dc.contributor.author Popel, Martin
dc.date.accessioned 2021-10-15T13:57:12Z
dc.date.available 2021-10-15T13:57:12Z
dc.date.issued 2021-09-20
dc.identifier.uri http://hdl.handle.net/11234/1-3775
dc.description This machine translation test set contains 2223 Czech sentences collected within the FAUST project (https://ufal.mff.cuni.cz/grants/faust, http://hdl.handle.net/11234/1-3308). Each original (noisy) sentence was normalized (clean1 and clean2) and translated to English independently by two translators.
dc.language.iso eng
dc.language.iso ces
dc.publisher Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation.isbasedon http://hdl.handle.net/11234/1-3308
dc.rights Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.subject noisy texts
dc.subject parallel corpus
dc.subject machine translation
dc.title FAUST cs-en 0.5
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LINDAT / CLARIAH-CZ
contact.person Martin Popel popel@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
sponsor Ministerstvo školství, mládeže a tělovýchovy České republiky LM2018101 LINDAT/CLARIAH-CZ: Digitální výzkumná infrastruktura pro jazykové technologie, umění a humanitní vědy nationalFunds
sponsor Grantová agentura České republiky GX20-16819X LUSyD – Language Understanding: from Syntax to Discourse nationalFunds
size.info 2223 sentences
files.size 917004
files.count 1


 Files in this item

Name faust-csen.zip
Size 895.51 KB
Format application/zip
Description Unknown
MD5 ddb9093027913f1883d25dfafc1ecb1a
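
The size (files.size, in bytes) and MD5 checksum listed in this record can be used to check that a downloaded copy of the archive is intact. The following is a minimal sketch in Python, not part of the distributed scripts; the function names and the local file path are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Values copied from this record (MD5 field and files.size).
EXPECTED_MD5 = "ddb9093027913f1883d25dfafc1ecb1a"
EXPECTED_SIZE = 917004  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file exists and matches the expected size and MD5."""
    path = Path(path)
    if not path.is_file():
        return False
    # Cheap size check first, so a truncated download fails without hashing.
    if path.stat().st_size != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()
```

Calling `verify_download("faust-csen.zip")` assumes the archive was saved under that name in the current directory; it returns False for a missing, truncated, or altered file.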
 File Preview  
  • scripts
    • faust-extract-tmx.pl (1 kB)
    • faust-merge-tsv.pl (1 kB)
  • original-tmx
    • faust-csen-rs.tmx (1 MB)
    • faust-csen-mu.tmx (1 MB)
    • README.txt (979 B)
    • faust-csen-noisy-cs.txt (160 kB)
    • faust-csen-noisy-en.txt (338 kB)
    • faust-csen-clean2-cs.txt (159 kB)
    • faust-csen-clean1-cs.txt (159 kB)
