dc.contributor.author | Hajič, Jan |
dc.contributor.author | Mareček, David |
dc.contributor.author | Fučíková, Eva |
dc.contributor.author | Cinková, Silvie |
dc.contributor.author | Štěpánek, Jan |
dc.contributor.author | Mikulová, Marie |
dc.date.accessioned | 2021-10-15T13:55:25Z |
dc.date.available | 2021-10-15T13:55:25Z |
dc.date.issued | 2011-02-01 |
dc.identifier.uri | http://hdl.handle.net/11234/1-3308 |
dc.description | Syntactic (including deep-syntactic, i.e. tectogrammatical) annotation of user-generated noisy sentences. The annotation was made on the Czech-English and English-Czech Faust Dev/Test sets. The English data includes manual annotations of English reference translations of Czech source texts. These texts were translated independently by two translators. After some necessary cleaning, 1000 segments were randomly selected for manual annotation. Both reference translations were annotated, giving 2000 annotated segments in total. The Czech data includes manual annotations of Czech reference translations of English source texts. These texts were translated independently by three translators. After some necessary cleaning, 1000 segments were randomly selected for manual annotation. All three reference translations were annotated, giving 3000 annotated segments in total. Faust is part of PDT-C 1.0 (http://hdl.handle.net/11234/1-3185). |
dc.language.iso | eng |
dc.language.iso | ces |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation | info:eu-repo/grantAgreement/EC/FP7/247762 |
dc.relation.ispartof | http://hdl.handle.net/11234/1-3185 |
dc.relation.isreferencedby | https://arxiv.org/abs/2006.03679 |
dc.rights | Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc/4.0/ |
dc.source.uri | https://ufal.mff.cuni.cz/grants/faust |
dc.subject | tectogrammatics |
dc.subject | treebank |
dc.subject | parallel corpus |
dc.subject | noisy texts |
dc.title | FAUST 0.5 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Jan Hajič hajic@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
sponsor | European Union FP7-ICT-2009-4-247762 Faust euFunds info:eu-repo/grantAgreement/EC/FP7/247762 |
sponsor | Ministerstvo školství, mládeže a tělovýchovy České republiky LM2015071 LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat nationalFunds |
sponsor | Ministerstvo školství, mládeže a tělovýchovy České republiky LM2010013 LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat nationalFunds |
size.info | 5000 sentences |
files.size | 13504547 |
files.count | 1 |
Files in this item
License category:
License: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
Publicly Available
- Name
- release_2011-02-01.zip
- Size
- 12.88 MB
- Format
- application/zip
- Description
- Unknown
- MD5
- d1e6279faa8987b10e983881134945fd
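The MD5 value above can be used to check that the archive was downloaded intact. The following is a minimal sketch, assuming the archive has been saved locally; the helper names `md5_of_file` and `verify_download` and the local path are illustrative and not part of the record.

```python
import hashlib

# MD5 checksum as published in this record.
EXPECTED_MD5 = "d1e6279faa8987b10e983881134945fd"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()


# Example call (hypothetical local path):
# verify_download("release_2011-02-01.zip")
```

MD5 is suitable here only as an integrity check against transfer errors, not as protection against deliberate tampering.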
- release_2011-02-01
- README.txt (3 kB)
- LICENSE.txt (140 B)
- eng_to_cze
- sources
- selected_id.txt (4 kB)
- jh_eng-cze_all_texts.txt (288 kB)
- valid_jh_all.txt (10 kB)
- es_eng-cze_all_texts.txt (270 kB)
- es_selected.zip (44 kB)
- faust.test-1.eng-cze.mu.final.tmx (894 kB)
- mu_eng-cze_all_texts.txt (285 kB)
- mu_selected.zip (43 kB)
- faust.test-1.eng-cze.es.final.tmx (846 kB)
- valid_es_all.txt (9 kB)
- valid_mu_all.txt (10 kB)
- faust.test-1.eng-cze.jh.final.tmx (865 kB)
- jh_selected.zip (44 kB)
- pml_schemas
- tdata_faust_schema.xml (466 B)
- tanot_schema.xml (1 kB)
- adata_tmt_schema.xml (12 kB)
- data
- dev
- mu
- faust_2010_07_mu_10.a.gz (32 kB)
- faust_2010_07_mu_01.t.gz (16 kB)
- faust_2010_07_mu_09.t.gz (20 kB)
- faust_2010_07_mu_08.a.gz (24 kB)
- faust_2010_07_mu_02.t.gz (22 kB)
- faust_2010_07_mu_01.a.gz (23 kB)
- faust_2010_07_mu_09.a.gz (31 kB)
- faust_2010_07_mu_03.t.gz (17 kB)
- faust_2010_07_mu_02.a.gz (31 kB)
- faust_2010_07_mu_04.t.gz (24 kB)
- faust_2010_07_mu_03.a.gz (24 kB)
- faust_2010_07_mu_05.t.gz (21 kB)
- faust_2010_07_mu_04.a.gz (37 kB)
- faust_2010_07_mu_06.t.gz (23 kB)
- faust_2010_07_mu_05.a.gz (32 kB)
- faust_2010_07_mu_07.t.gz (22 kB)
- faust_2010_07_mu_06.a.gz (35 kB)
- faust_2010_07_mu_10.t.gz (22 kB)
- faust_2010_07_mu_08.t.gz (17 kB)
- faust_2010_07_mu_07.a.gz (29 kB)
- jh
- faust_2010_07_jh_03.a.gz (24 kB)
- faust_2010_07_jh_05.t.gz (22 kB)
- faust_2010_07_jh_04.a.gz (36 kB)
- faust_2010_07_jh_06.t.gz (24 kB)
- faust_2010_07_jh_05.a.gz (32 kB)
- faust_2010_07_jh_07.t.gz (22 kB)
- faust_2010_07_jh_06.a.gz (35 kB)
- faust_2010_07_jh_10.t.gz (24 kB)
- faust_2010_07_jh_08.t.gz (18 kB)
- faust_2010_07_jh_07.a.gz (30 kB)
- faust_2010_07_jh_10.a.gz (34 kB)
- faust_2010_07_jh_01.t.gz (18 kB)
- faust_2010_07_jh_09.t.gz (21 kB)
- faust_2010_07_jh_08.a.gz (25 kB)
- faust_2010_07_jh_02.t.gz (23 kB)
- faust_2010_07_jh_01.a.gz (25 kB)
- faust_2010_07_jh_09.a.gz (30 kB)
- faust_2010_07_jh_03.t.gz (17 kB)
- faust_2010_07_jh_02.a.gz (32 kB)
- faust_2010_07_jh_04.t.gz (24 kB)
- es
- faust_2010_07_es_01.a.gz (25 kB)
- faust_2010_07_es_09.a.gz (30 kB)
- faust_2010_07_es_03.t.gz (16 kB)
- faust_2010_07_es_02.a.gz (32 kB)
- faust_2010_07_es_04.t.gz (26 kB)
- faust_2010_07_es_03.a.gz (23 kB)
- faust_2010_07_es_05.t.gz (21 kB)
- faust_2010_07_es_04.a.gz (39 kB)
- faust_2010_07_es_06.t.gz (24 kB)
- faust_2010_07_es_05.a.gz (30 kB)
- faust_2010_07_es_07.t.gz (22 kB)
- faust_2010_07_es_06.a.gz (34 kB)
- faust_2010_07_es_10.t.gz (24 kB)
- faust_2010_07_es_08.t.gz (18 kB)
- faust_2010_07_es_07.a.gz (29 kB)
- faust_2010_07_es_10.a.gz (33 kB)
- faust_2010_07_es_01.t.gz (19 kB)
- faust_2010_07_es_09.t.gz (21 kB)
- faust_2010_07_es_08.a.gz (25 kB)
- faust_2010_07_es_02.t.gz (25 kB)
- test
- mu
- faust_2010_07_mu_20.a.gz (43 kB)
- faust_2010_07_mu_11.t.gz (15 kB)
- faust_2010_07_mu_19.t.gz (24 kB)
- faust_2010_07_mu_18.a.gz (35 kB)
- faust_2010_07_mu_12.t.gz (19 kB)
- faust_2010_07_mu_11.a.gz (22 kB)
- faust_2010_07_mu_19.a.gz (31 kB)
- faust_2010_07_mu_13.t.gz (15 kB)
- faust_2010_07_mu_12.a.gz (28 kB)
- faust_2010_07_mu_14.t.gz (19 kB)
- faust_2010_07_mu_13.a.gz (21 kB)
- faust_2010_07_mu_15.t.gz (15 kB)
- faust_2010_07_mu_14.a.gz (27 kB)
- faust_2010_07_mu_16.t.gz (18 kB)
- faust_2010_07_mu_15.a.gz (19 kB)
- faust_2010_07_mu_17.t.gz (24 kB)
- faust_2010_07_mu_16.a.gz (23 kB)
- faust_2010_07_mu_20.t.gz (30 kB)
- faust_2010_07_mu_18.t.gz (23 kB)
- faust_2010_07_mu_17.a.gz (35 kB)
- jh
- faust_2010_07_jh_15.t.gz (15 kB)
- faust_2010_07_jh_14.a.gz (28 kB)
- faust_2010_07_jh_16.t.gz (17 kB)
- faust_2010_07_jh_15.a.gz (19 kB)
- faust_2010_07_jh_17.t.gz (25 kB)
- faust_2010_07_jh_16.a.gz (22 kB)
- faust_2010_07_jh_18.t.gz (22 kB)
- faust_2010_07_jh_17.a.gz (36 kB)
- faust_2010_07_jh_11.t.gz (15 kB)
- faust_2010_07_jh_19.t.gz (24 kB)
- faust_2010_07_jh_18.a.gz (33 kB)
- faust_2010_07_jh_12.t.gz (19 kB)
- faust_2010_07_jh_11.a.gz (20 kB)
- faust_2010_07_jh_19.a.gz (35 kB)
- faust_2010_07_jh_13.t.gz (15 kB)
- faust_2010_07_jh_20.t.gz (31 kB)
- faust_2010_07_jh_12.a.gz (28 kB)
- faust_2010_07_jh_14.t.gz (20 kB)
- faust_2010_07_jh_13.a.gz (21 kB)
- faust_2010_07_jh_20.a.gz (41 kB)
- es
- faust_2010_07_es_13.t.gz (15 kB)
- faust_2010_07_es_20.t.gz (31 kB)
- faust_2010_07_es_12.a.gz (28 kB)
- faust_2010_07_es_14.t.gz (20 kB)
- faust_2010_07_es_13.a.gz (20 kB)
- faust_2010_07_es_20.a.gz (42 kB)
- faust_2010_07_es_15.t.gz (15 kB)
- faust_2010_07_es_14.a.gz (27 kB)
- faust_2010_07_es_16.t.gz (17 kB)
- faust_2010_07_es_15.a.gz (19 kB)
- faust_2010_07_es_17.t.gz (24 kB)
- faust_2010_07_es_16.a.gz (21 kB)
- faust_2010_07_es_18.t.gz (24 kB)
- faust_2010_07_es_17.a.gz (34 kB)
- faust_2010_07_es_11.t.gz (16 kB)
- faust_2010_07_es_19.t.gz (27 kB)
- faust_2010_07_es_18.a.gz (36 kB)
- faust_2010_07_es_12.t.gz (19 kB)
- faust_2010_07_es_11.a.gz (21 kB)
- faust_2010_07_es_19.a.gz (35 kB)
- documentation.pdf (216 kB)
- cze_to_eng
- sources
- selected_id.txt (4 kB)
- data_rs_e.zip (1 MB)
- valid_rs_all.txt (9 kB)
- data_mu_e.zip (1 MB)
- mu_cze_verze01-final.tmx (1 MB)
- mu_selected.zip (48 kB)
- mu_cze-eng_all_texts.txt (364 kB)
- rs-faust-cze-eng2-FinalOutput.tmx (1 MB)
- rs_selected.zip (52 kB)
- valid_mu_all.txt (10 kB)
- rs_cze-eng_all_texts.txt (403 kB)
- data
- dev
- mu
- faust_2010_10_mu_e_06.t.gz (30 kB)
- faust_2010_10_mu_e_03.p.gz (25 kB)
- faust_2010_10_mu_e_05.a.gz (25 kB)
- faust_2010_10_mu_e_07.t.gz (25 kB)
- faust_2010_10_mu_e_04.p.gz (34 kB)
- faust_2010_10_mu_e_06.a.gz (37 kB)
- faust_2010_10_mu_e_10.t.gz (33 kB)
- faust_2010_10_mu_e_08.t.gz (30 kB)
- faust_2010_10_mu_e_05.p.gz (18 kB)
- faust_2010_10_mu_e_07.a.gz (32 kB)
- faust_2010_10_mu_e_10.a.gz (41 kB)
- faust_2010_10_mu_e_01.t.gz (33 kB)
- faust_2010_10_mu_e_09.t.gz (39 kB)
- faust_2010_10_mu_e_06.p.gz (26 kB)
- faust_2010_10_mu_e_08.a.gz (36 kB)
- faust_2010_10_mu_e_02.t.gz (22 kB)
- faust_2010_10_mu_e_07.p.gz (22 kB)
- faust_2010_10_mu_e_01.a.gz (46 kB)
- faust_2010_10_mu_e_09.a.gz (48 kB)
- faust_2010_10_mu_e_10.p.gz (30 kB)
- faust_2010_10_mu_e_03.t.gz (29 kB)
- faust_2010_10_mu_e_08.p.gz (27 kB)
- faust_2010_10_mu_e_02.a.gz (28 kB)
- faust_2010_10_mu_e_04.t.gz (34 kB)
- faust_2010_10_mu_e_01.p.gz (31 kB)
- faust_2010_10_mu_e_09.p.gz (37 kB)
- faust_2010_10_mu_e_03.a.gz (34 kB)
- faust_2010_10_mu_e_05.t.gz (21 kB)
- faust_2010_10_mu_e_02.p.gz (20 kB)
- faust_2010_10_mu_e_04.a.gz (44 kB)
- rs
- faust_2010_10_rs_e_07.p.gz (24 kB)
- faust_2010_10_rs_e_01.a.gz (49 kB)
- faust_2010_10_rs_e_09.a.gz (51 kB)
- faust_2010_10_rs_e_10.p.gz (33 kB)
- faust_2010_10_rs_e_03.t.gz (29 kB)
- faust_2010_10_rs_e_08.p.gz (29 kB)
- faust_2010_10_rs_e_02.a.gz (31 kB)
- faust_2010_10_rs_e_04.t.gz (36 kB)
- faust_2010_10_rs_e_01.p.gz (34 kB)
- faust_2010_10_rs_e_09.p.gz (39 kB)
- faust_2010_10_rs_e_03.a.gz (38 kB)
- faust_2010_10_rs_e_05.t.gz (21 kB)
- faust_2010_10_rs_e_02.p.gz (23 kB)
- faust_2010_10_rs_e_04.a.gz (47 kB)
- faust_2010_10_rs_e_06.t.gz (32 kB)
- faust_2010_10_rs_e_03.p.gz (28 kB)
- faust_2010_10_rs_e_05.a.gz (27 kB)
- faust_2010_10_rs_e_07.t.gz (25 kB)
- faust_2010_10_rs_e_04.p.gz (36 kB)
- faust_2010_10_rs_e_06.a.gz (40 kB)
- faust_2010_10_rs_e_10.t.gz (34 kB)
- faust_2010_10_rs_e_08.t.gz (30 kB)
- faust_2010_10_rs_e_05.p.gz (20 kB)
- faust_2010_10_rs_e_07.a.gz (33 kB)
- faust_2010_10_rs_e_10.a.gz (44 kB)
- faust_2010_10_rs_e_01.t.gz (37 kB)
- faust_2010_10_rs_e_09.t.gz (40 kB)
- faust_2010_10_rs_e_06.p.gz (29 kB)
- faust_2010_10_rs_e_08.a.gz (39 kB)
- faust_2010_10_rs_e_02.t.gz (25 kB)
- test
- mu
- faust_2010_10_mu_e_15.a.gz (30 kB)
- faust_2010_10_mu_e_17.t.gz (28 kB)
- faust_2010_10_mu_e_14.p.gz (25 kB)
- faust_2010_10_mu_e_16.a.gz (47 kB)
- faust_2010_10_mu_e_20.t.gz (24 kB)
- faust_2010_10_mu_e_18.t.gz (34 kB)
- faust_2010_10_mu_e_15.p.gz (22 kB)
- faust_2010_10_mu_e_17.a.gz (33 kB)
- faust_2010_10_mu_e_20.a.gz (29 kB)
- faust_2010_10_mu_e_11.t.gz (28 kB)
- faust_2010_10_mu_e_19.t.gz (40 kB)
- faust_2010_10_mu_e_16.p.gz (36 kB)
- faust_2010_10_mu_e_18.a.gz (43 kB)
- faust_2010_10_mu_e_12.t.gz (27 kB)
- faust_2010_10_mu_e_17.p.gz (25 kB)
- faust_2010_10_mu_e_11.a.gz (36 kB)
- faust_2010_10_mu_e_19.a.gz (44 kB)
- faust_2010_10_mu_e_20.p.gz (20 kB)
- faust_2010_10_mu_e_13.t.gz (28 kB)
- faust_2010_10_mu_e_18.p.gz (32 kB)
- faust_2010_10_mu_e_12.a.gz (35 kB)
- faust_2010_10_mu_e_14.t.gz (29 kB)
- faust_2010_10_mu_e_11.p.gz (26 kB)
- faust_2010_10_mu_e_19.p.gz (31 kB)
- faust_2010_10_mu_e_13.a.gz (34 kB)
- faust_2010_10_mu_e_15.t.gz (26 kB)
- faust_2010_10_mu_e_12.p.gz (26 kB)
- faust_2010_10_mu_e_14.a.gz (36 kB)
- faust_2010_10_mu_e_16.t.gz (37 kB)
- faust_2010_10_mu_e_13.p.gz (25 kB)
- rs
- faust_2010_10_rs_e_20.p.gz (21 kB)
- faust_2010_10_rs_e_13.t.gz (29 kB)
- faust_2010_10_rs_e_18.p.gz (36 kB)
- faust_2010_10_rs_e_12.a.gz (35 kB)
- faust_2010_10_rs_e_14.t.gz (27 kB)
- faust_2010_10_rs_e_11.p.gz (31 kB)
- faust_2010_10_rs_e_19.p.gz (35 kB)
- faust_2010_10_rs_e_13.a.gz (39 kB)
- faust_2010_10_rs_e_15.t.gz (26 kB)
- faust_2010_10_rs_e_12.p.gz (27 kB)
- faust_2010_10_rs_e_14.a.gz (35 kB)
- faust_2010_10_rs_e_16.t.gz (39 kB)
- faust_2010_10_rs_e_13.p.gz (29 kB)
- faust_2010_10_rs_e_15.a.gz (32 kB)
- faust_2010_10_rs_e_17.t.gz (28 kB)
- faust_2010_10_rs_e_14.p.gz (25 kB)
- faust_2010_10_rs_e_16.a.gz (51 kB)
- faust_2010_10_rs_e_20.t.gz (24 kB)
- faust_2010_10_rs_e_18.t.gz (36 kB)
- faust_2010_10_rs_e_15.p.gz (23 kB)
- faust_2010_10_rs_e_17.a.gz (36 kB)
- faust_2010_10_rs_e_20.a.gz (29 kB)
- faust_2010_10_rs_e_11.t.gz (31 kB)
- faust_2010_10_rs_e_19.t.gz (39 kB)
- faust_2010_10_rs_e_16.p.gz (38 kB)
- faust_2010_10_rs_e_18.a.gz (47 kB)
- faust_2010_10_rs_e_12.t.gz (28 kB)
- faust_2010_10_rs_e_17.p.gz (27 kB)
- faust_2010_10_rs_e_11.a.gz (42 kB)
- faust_2010_10_rs_e_19.a.gz (47 kB)
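The data file names in the listing above follow a regular pattern: a `faust_YYYY_MM` prefix, a two-letter code matching the translator subdirectory (`mu`, `jh`, `es`, `rs`), an `_e` marker on the files under `cze_to_eng`, a two-digit part number (01 to 10 under `dev`, 11 to 20 under `test`), and a one-letter layer suffix (`a`, `t`, or `p`). The sketch below groups file names by these fields. The pattern is inferred from the listing alone, the field names and the `parse_name` helper are hypothetical, and the reading of `_e` as an English-data marker is an assumption; README.txt and documentation.pdf in the archive remain the authoritative description.

```python
import re
from collections import Counter

# Pattern inferred from the file listing; not taken from the documentation.
FILE_PATTERN = re.compile(
    r"^faust_(?P<year>\d{4})_(?P<month>\d{2})_(?P<translator>[a-z]{2})"
    r"(?P<english>_e)?_(?P<part>\d{2})\.(?P<layer>[apt])\.gz$"
)


def parse_name(name):
    """Split a data file name into its fields, or return None if it does not match."""
    match = FILE_PATTERN.match(name)
    if match is None:
        return None
    part = int(match["part"])
    return {
        "translator": match["translator"],
        "english": match["english"] is not None,  # assumed meaning of "_e"
        "part": part,
        "split": "dev" if part <= 10 else "test",  # as observed in the listing
        "layer": match["layer"],
    }


names = [
    "faust_2010_07_mu_10.a.gz",
    "faust_2010_07_jh_15.t.gz",
    "faust_2010_10_rs_e_06.p.gz",
    "README.txt",
]
parsed = [info for info in map(parse_name, names) if info is not None]
per_layer = Counter(info["layer"] for info in parsed)
print(per_layer)
```

Names that do not fit the pattern, such as `README.txt`, are skipped rather than raising an error, so the same loop can be run over the whole archive listing.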