dc.contributor.author Ramisch, Carlos
dc.contributor.author Guillaume, Bruno
dc.contributor.author Savary, Agata
dc.contributor.author Waszczuk, Jakub
dc.contributor.author Candito, Marie
dc.contributor.author Vaidya, Ashwini
dc.contributor.author Barbu Mititelu, Verginica
dc.contributor.author Bhatia, Archna
dc.contributor.author Iñurrieta, Uxoa
dc.contributor.author Giouli, Voula
dc.contributor.author Güngör, Tunga
dc.contributor.author Jiang, Menghan
dc.contributor.author Lichte, Timm
dc.contributor.author Liebeskind, Chaya
dc.contributor.author Monti, Johanna
dc.contributor.author Ramisch, Renata
dc.contributor.author Walsh, Abigail
dc.contributor.author Xu, Hongzhi
dc.contributor.author Palka-Binkiewicz, Emilia
dc.contributor.author Ehren, Rafael
dc.contributor.author Stymne, Sara
dc.contributor.author Constant, Matthieu
dc.contributor.author Pasquer, Caroline
dc.contributor.author Parmentier, Yannick
dc.contributor.author Antoine, Jean-Yves
dc.contributor.author Carlino, Carola
dc.contributor.author Caruso, Valeria
dc.contributor.author Di Buono, Maria Pia
dc.contributor.author Pascucci, Antonio
dc.contributor.author Raffone, Annalisa
dc.contributor.author Riccio, Anna
dc.contributor.author Sangati, Federico
dc.contributor.author Speranza, Giulia
dc.contributor.author Cordeiro, Silvio Ricardo
dc.contributor.author de Medeiros Caseli, Helena
dc.contributor.author Miranda, Isaac
dc.contributor.author Rademaker, Alexandre
dc.contributor.author Vale, Oto
dc.contributor.author Villavicencio, Aline
dc.contributor.author Wick Pedro, Gabriela
dc.contributor.author Wilkens, Rodrigo
dc.contributor.author Zilio, Leonardo
dc.contributor.author Rizea, Monica-Mihaela
dc.contributor.author Ionescu, Mihaela
dc.contributor.author Onofrei, Mihaela
dc.contributor.author Chen, Jia
dc.contributor.author Ge, Xiaomin
dc.contributor.author Hu, Fangyuan
dc.contributor.author Hu, Sha
dc.contributor.author Li, Minli
dc.contributor.author Liu, Siyuan
dc.contributor.author Qin, Zhenzhen
dc.contributor.author Sun, Ruilong
dc.contributor.author Wang, Chenweng
dc.contributor.author Xiao, Huangyang
dc.contributor.author Yan, Peiyi
dc.contributor.author Yih, Tsy
dc.contributor.author Yu, Ke
dc.contributor.author Yu, Songping
dc.contributor.author Zeng, Si
dc.contributor.author Zhang, Yongchen
dc.contributor.author Zhao, Yun
dc.contributor.author Foufi, Vassiliki
dc.contributor.author Fotopoulou, Aggeliki
dc.contributor.author Markantonatou, Stella
dc.contributor.author Papadelli, Stella
dc.contributor.author Louizou, Sevasti
dc.contributor.author Aduriz, Itziar
dc.contributor.author Estarrona, Ainara
dc.contributor.author Gonzalez, Itziar
dc.contributor.author Gurrutxaga, Antton
dc.contributor.author Uria, Larraitz
dc.contributor.author Urizar, Ruben
dc.contributor.author Foster, Jennifer
dc.contributor.author Lynn, Teresa
dc.contributor.author Elyovitch, Hevi
dc.contributor.author Ha-Cohen Kerner, Yaakov
dc.contributor.author Malka, Ruth
dc.contributor.author Jain, Kanishka
dc.contributor.author Puri, Vandana
dc.contributor.author Ratori, Shraddha
dc.contributor.author Shukla, Vishakha
dc.contributor.author Srivastava, Shubham
dc.contributor.author Berk, Gozde
dc.contributor.author Erden, Berna
dc.contributor.author Yirmibeşoğlu, Zeynep
dc.date.accessioned 2020-10-08T11:08:16Z
dc.date.available 2020-10-08T11:08:16Z
dc.date.issued 2020-07-09
dc.identifier.uri http://hdl.handle.net/11234/1-3367
dc.description This multilingual resource contains corpora in which verbal multiword expressions (VMWEs) have been manually annotated, gathered on the occasion of edition 1.2 of the PARSEME Shared Task on Semi-Supervised Identification of Verbal MWEs (2020). VMWEs include idioms (let the cat out of the bag), light-verb constructions (make a decision), verb-particle constructions (give up), inherently reflexive verbs (help oneself), and multi-verb constructions (make do). For edition 1.2 of the shared task, the data covers 14 languages, for which VMWEs were annotated according to the universal guidelines. The corpora are provided in the cupt format, inspired by the CoNLL-U format. Morphological and syntactic information (not necessarily using UD tagsets), including parts of speech, lemmas, morphological features and/or syntactic dependencies, is also provided. Depending on the language, this information comes from treebanks (e.g., Universal Dependencies) or from automatic parsers trained on treebanks (e.g., UDPipe). This item contains training, development and test data, as well as the evaluation tools used in the PARSEME Shared Task 1.2 (2020). The annotation guidelines are available online: http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.2
dc.language.iso deu
dc.language.iso ell
dc.language.iso eus
dc.language.iso fra
dc.language.iso gle
dc.language.iso heb
dc.language.iso hin
dc.language.iso ita
dc.language.iso pol
dc.language.iso por
dc.language.iso ron
dc.language.iso swe
dc.language.iso tur
dc.language.iso zho
dc.publisher PARSEME
dc.relation.replaces http://hdl.handle.net/11372/LRT-2842
dc.relation.isreplacedby http://hdl.handle.net/11372/LRT-5124
dc.rights PARSEME Shared Task Data (v. 1.2) Agreement
dc.rights.uri https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-1.2
dc.source.uri http://multiword.sf.net/sharedtask2020
dc.subject multiword expressions
dc.subject verbal multiword expressions
dc.subject light verb construction
dc.subject verb-particle constructions
dc.subject inherently reflexive verbs
dc.subject verbal idioms
dc.subject multi-verb constructions
dc.title Annotated corpora and tools of the PARSEME Shared Task on Semi-Supervised Identification of Verbal Multiword Expressions (edition 1.2)
dc.type corpus
metashare.ResourceInfo#ContentInfo.mediaType text
dc.rights.label PUB
has.files yes
branding LRT + Open Submissions
contact.person Carlos Ramisch, carlos.ramisch@lis-lab.fr, Aix-Marseille University
contact.person Agata Savary, agata.savary@univ-tours.fr, Université de Tours
contact.person Marie Candito, marie.candito@gmail.com, Université de Paris
size.info 279785 sentences
size.info 5517910 tokens
size.info 68503 multiWordUnits
files.size 94501603
files.count 17
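
The description above says the corpora use the cupt format, inspired by CoNLL-U. The following is a minimal reading sketch, not the shared task's own `tsvlib.py`. It assumes that each token line is tab-separated, that the last column holds the VMWE annotation, that `*` marks a token outside any VMWE and `_` an unannotated (blind) token, and that a tag such as `1:VPC.full` opens expression 1 while a bare `1` continues it. The sample sentence and the names `read_cupt` and `extract_vmwes` are invented for illustration; check the format documentation before relying on these conventions.

```python
from collections import defaultdict


def read_cupt(lines):
    """Yield each sentence as a list of token rows (lists of column values)."""
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif not line.startswith("#"):
            # Comment lines (metadata) start with '#'; everything else is a token.
            sentence.append(line.split("\t"))
    if sentence:
        yield sentence


def extract_vmwes(sentence):
    """Return (category, [forms]) pairs read from the last column (assumed layout)."""
    forms = defaultdict(list)
    categories = {}
    for cols in sentence:
        tag = cols[-1]
        if tag in ("*", "_"):
            continue
        # A token may belong to several expressions, separated by ';'.
        for part in tag.split(";"):
            mwe_id, _, category = part.partition(":")
            forms[mwe_id].append(cols[1])  # column 2 is the word form
            if category:
                categories[mwe_id] = category
    return [(categories.get(mwe_id, "?"), forms[mwe_id]) for mwe_id in forms]


# Invented three-token sample, not taken from the corpora.
SAMPLE = "\n".join([
    "# global.columns = ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC PARSEME:MWE",
    "# text = She gave up",
    "\t".join(["1", "She", "she", "PRON", "_", "_", "2", "nsubj", "_", "_", "*"]),
    "\t".join(["2", "gave", "give", "VERB", "_", "_", "0", "root", "_", "_", "1:VPC.full"]),
    "\t".join(["3", "up", "up", "ADP", "_", "_", "2", "compound:prt", "_", "_", "1"]),
    "",
])

if __name__ == "__main__":
    for sent in read_cupt(SAMPLE.splitlines()):
        print(extract_vmwes(sent))
```

For a real file, pass an open text handle instead of `SAMPLE.splitlines()`, for example `read_cupt(open("DE/train.cupt", encoding="utf-8"))` after unpacking `DE.tgz`.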


Files in this item

License category: Publicly Available
License: PARSEME Shared Task Data (v. 1.2) Agreement
GNU General Public License, version 3.0; Distributed under Creative Commons

Name: README.md
Size: 6.7 KB
Format: Unknown
Description: General README file
MD5: a8b7e1ba4c2b8b09cf76c040fb5d41ab

Name: trial.tgz
Size: 113.47 KB
Format: application/x-gzip
Description: Trial files (English)
MD5: 17c8e72d5cd58194868598f0579ab524
File preview:
  • trial
    • EN-trial.test.cupt (8 kB)
    • README.md (1 kB)
    • EN-trial.raw.conllu (511 kB)
    • EN-trial.train.cupt (7 kB)
    • EN-trial.test.pred.cupt (8 kB)
    • EN-trial.test.blind.cupt (8 kB)

Name: bin.tgz
Size: 19.7 KB
Format: application/x-gzip
Description: Evaluation scripts
MD5: 456e2a812566cc791a0d6be38f507bdd
File preview:
  • bin
    • validate_cupt.py (4 kB)
    • bmc_munkres
      • LICENSE (561 B)
      • README.md (1 kB)
      • munkres.py (23 kB)
    • evaluate.py (23 kB)
    • average_of_evaluations.py (6 kB)
    • tsvlib.py (12 kB)
    • tsvlib_usage_example.py (1 kB)

Name: DE.tgz
Size: 2.73 MB
Format: application/x-gzip
Description: German files
MD5: 55d0f986e358739e56573c221c617282
File preview:
  • DE
    • test.blind.cupt (2 MB)
    • dev-stats.md (197 B)
    • train-stats.md (210 B)
    • README.md (4 kB)
    • dev.cupt (856 kB)
    • test-stats.md (367 B)
    • train.cupt (9 MB)
    • test.cupt (2 MB)

Name: EL.tgz
Size: 9.41 MB
Format: application/x-gzip
Description: Greek files
MD5: 4dfafa58e1f504f3600d54401beccf24
File preview:
  • EL
    • test.blind.cupt (6 MB)
    • dev-stats.md (177 B)
    • train-stats.md (190 B)
    • README.md (3 kB)
    • dev.cupt (2 MB)
    • test-stats.md (346 B)
    • train.cupt (42 MB)
    • test.cupt (6 MB)

Name: EU.tgz
Size: 2.95 MB
Format: application/x-gzip
Description: Basque files
MD5: a3cdb2376ab2e000820a98e1cf22ba87
File preview:
  • EU
    • test.blind.cupt (5 MB)
    • dev-stats.md (149 B)
    • train-stats.md (153 B)
    • README.md (4 kB)
    • dev.cupt (1 MB)
    • test-stats.md (318 B)
    • train.cupt (4 MB)
    • test.cupt (5 MB)

Name: FR.tgz
Size: 7.43 MB
Format: application/x-gzip
Description: French files
MD5: a3f1331707d34c31b0dc221799a5e8b6
File preview:
  • FR
    • test.blind.cupt (7 MB)
    • dev-stats.md (176 B)
    • train-stats.md (186 B)
    • README.md (5 kB)
    • dev.cupt (2 MB)
    • test-stats.md (346 B)
    • train.cupt (22 MB)
    • test.cupt (7 MB)

Name: GA.tgz
Size: 802.56 KB
Format: application/x-gzip
Description: Irish files
MD5: b554c8424df0d989c4580ad7ab43ce56
File preview:
  • GA
    • test.blind.cupt (1 MB)
    • dev-stats.md (195 B)
    • train-stats.md (197 B)
    • README.md (3 kB)
    • dev.cupt (474 kB)
    • test-stats.md (379 B)
    • train.cupt (417 kB)
    • test.cupt (1 MB)

Name: HE.tgz
Size: 6.45 MB
Format: application/x-gzip
Description: Hebrew files
MD5: 384e4e150b958ebc98d2d420feb1f4d8
File preview:
  • HE
    • test.blind.cupt (5 MB)
    • dev-stats.md (165 B)
    • train-stats.md (175 B)
    • README.md (3 kB)
    • dev.cupt (1 MB)
    • test-stats.md (333 B)
    • train.cupt (22 MB)
    • test.cupt (5 MB)

Name: HI.tgz
Size: 771.32 KB
Format: application/x-gzip
Description: Hindi files
MD5: 8ea2dd1a1f9082f53234c597e330eaad
File preview:
  • HI
    • test.blind.cupt (2 MB)
    • dev-stats.md (140 B)
    • train-stats.md (161 B)
    • README.md (2 kB)
    • dev.cupt (598 kB)
    • test-stats.md (327 B)
    • train.cupt (549 kB)
    • test.cupt (2 MB)

Name: IT.tgz
Size: 5.82 MB
Format: application/x-gzip
Description: Italian files
MD5: 4b6dc92ccf29768d80e7d136285efa09
File preview:
  • IT
    • test.blind.cupt (5 MB)
    • dev-stats.md (224 B)
    • train-stats.md (253 B)
    • README.md (8 kB)
    • dev.cupt (1 MB)
    • test-stats.md (398 B)
    • train.cupt (16 MB)
    • test.cupt (5 MB)

Name: PL.tgz
Size: 8.29 MB
Format: application/x-gzip
Description: Polish files
MD5: f1db1c5bd299d7d4ed1eaa4d76ab91a8
File preview:
  • PL
    • test.blind.cupt (7 MB)
    • dev-stats.md (163 B)
    • train-stats.md (172 B)
    • README.md (9 kB)
    • dev.cupt (2 MB)
    • test-stats.md (331 B)
    • train.cupt (30 MB)
    • test.cupt (7 MB)

Name: PT.tgz
Size: 10.15 MB
Format: application/x-gzip
Description: Portuguese files
MD5: 4827ddfb24df8f634543be85b5939609
File preview:
  • PT
    • test.blind.cupt (8 MB)
    • dev-stats.md (174 B)
    • train-stats.md (184 B)
    • README.md (7 kB)
    • dev.cupt (2 MB)
    • test-stats.md (345 B)
    • train.cupt (34 MB)
    • test.cupt (8 MB)

Name: SV.tgz
Size: 1.22 MB
Format: application/x-gzip
Description: Swedish files
MD5: 868c003f369f6af324a5699dd4be0726
File preview:
  • SV
    • test.blind.cupt (2 MB)
    • dev-stats.md (178 B)
    • train-stats.md (203 B)
    • README.md (3 kB)
    • dev.cupt (755 kB)
    • test-stats.md (368 B)
    • train.cupt (2 MB)
    • test.cupt (2 MB)

Name: TR.tgz
Size: 5.21 MB
Format: application/x-gzip
Description: Turkish files
MD5: 7ed6b16e8fcc30d646d04791ebb54b4a
File preview:
  • TR
    • test.blind.cupt (4 MB)
    • dev-stats.md (142 B)
    • train-stats.md (149 B)
    • README.md (4 kB)
    • dev.cupt (1 MB)
    • test-stats.md (310 B)
    • train.cupt (22 MB)
    • test.cupt (4 MB)

Name: ZH.tgz
Size: 8.21 MB
Format: application/x-gzip
Description: Chinese files
MD5: 052a11ca5136136be9d46935765b7a2a
File preview:
  • ZH
    • test.blind.cupt (2 MB)
    • dev-stats.md (181 B)
    • train-stats.md (192 B)
    • README.md (3 kB)
    • dev.cupt (884 kB)
    • test-stats.md (348 B)
    • train.cupt (27 MB)
    • test.cupt (2 MB)

Name: RO.tgz
Size: 20.59 MB
Format: application/x-gzip
Description: Romanian files
MD5: 2972d6276e83a15596ec1d527c8689b0
File preview:
  • RO
    • test.blind.cupt (50 MB)
    • dev-stats.md (164 B)
    • train-stats.md (169 B)
    • README.md (2 kB)
    • dev.cupt (9 MB)
    • test-stats.md (337 B)
    • train.cupt (14 MB)
    • test.cupt (50 MB)
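
Each file in the listing above comes with an MD5 checksum, which can be used to confirm that a download arrived intact. The sketch below shows one way to do that with Python's standard `hashlib`. The helper names `md5_of_file` and `verify` and the local file paths are hypothetical. Only three of the listed checksums are copied into the table; the rest can be added the same way.

```python
import hashlib

# Checksums copied from the file listing; extend with the remaining archives as needed.
EXPECTED_MD5 = {
    "README.md": "a8b7e1ba4c2b8b09cf76c040fb5d41ab",
    "bin.tgz": "456e2a812566cc791a0d6be38f507bdd",
    "DE.tgz": "55d0f986e358739e56573c221c617282",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Compare a local file against the checksum recorded under `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

A call such as `verify("downloads/DE.tgz", "DE.tgz")` returns `True` only when the local copy matches the listed checksum. MD5 is adequate for detecting corrupted transfers but is not a safeguard against deliberate tampering.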
