dc.contributor.author | Savary, Agata |
dc.contributor.author | Cordeiro, Silvio Ricardo |
dc.contributor.author | Lichte, Timm |
dc.contributor.author | Ramisch, Carlos |
dc.contributor.author | Iñurrieta, Uxoa |
dc.contributor.author | Giouli, Voula |
dc.date.accessioned | 2019-04-05T13:45:35Z |
dc.date.available | 2019-04-05T13:45:35Z |
dc.date.issued | 2019-04-01 |
dc.identifier.uri | http://hdl.handle.net/11372/LRT-2966 |
dc.description | The corpus contains sentences with idiomatic, literal and coincidental occurrences of verbal multiword expressions (VMWEs) in Basque, German, Greek, Polish and Portuguese. The source corpus is the PARSEME multilingual corpus of VMWEs v 1.1 (cf. http://hdl.handle.net/11372/LRT-2842). The sentences with VMWEs were extracted from the source corpus and potential co-occurrences of the same lexemes were automatically extracted from the same corpus. These candidates were then manually annotated by native experts into 6 classes, including literal and coincidental occurrences, as well as various annotation errors. The construction of the corpus is described by the following publication: Agata Savary, Silvio Ricardo Cordeiro, Timm Lichte, Carlos Ramisch, Uxoa Iñurrieta, Voula Giouli (forthcoming) "Literal occurrences of multiword expressions: Rare birds that cause a stir", to appear in Prague Bulletin of Mathematical Linguistics. |
dc.language.iso | eus |
dc.language.iso | deu |
dc.language.iso | ell |
dc.language.iso | pol |
dc.language.iso | por |
dc.publisher | PARSEME |
dc.rights | License agreement for The Multilingual corpus of literal occurrences of multiword expressions |
dc.rights.uri | https://lindat.mff.cuni.cz/repository/xmlui/page/licence-mwe-literal |
dc.subject | verbal multiword expressions |
dc.subject | literal occurrence |
dc.subject | idiomaticity rate |
dc.title | Multilingual corpus of literal occurrences of multiword expressions |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LRT + Open Submissions |
contact.person | Agata Savary agata.savary@univ-tours.fr University of Tours |
sponsor | ANR (France) ANR-14-CERA-0001 PARSEME-FR nationalFunds |
sponsor | Deutsche Forschungsgemeinschaft (DFG) CRC 991 The Structure of Representations in Language, Cognition, and Science nationalFunds |
size.info | 26754 sentences |
files.size | 2089385 |
files.count | 5 |
Files in this item
Download all files in this item (1.99 MB)
License category: Publicly Available
Licence: License agreement for The Multilingual corpus of literal occurrences of multiword expressions
Name | Size | Format | Description | MD5 | Contents |
DE.tgz | 368.23 KB | application/x-gzip | German data and README | 6512c4688561caa99bee516332873ad8 | DE/: README.md (2 kB), DE.tsv (1 MB) |
PT.tgz | 533.87 KB | application/x-gzip | Portuguese data and README | 7513dad43afc0343365e15471ef1a085 | PT/: PT.tsv (1 MB), README.md (2 kB) |
PL.tgz | 403.88 KB | application/x-gzip | Polish data and README | 1451c2c4d5be016ba3fdcad4851e7f67 | PL/: README.md (3 kB), PL.tsv (1 MB) |
EU.tgz | 383.8 KB | application/x-gzip | Basque data and README | 525686bdbc7b3da0f9e1331a8b3046eb | EU/: README.md (2 kB), EU.tsv (1 MB) |
EL.tgz | 350.63 KB | application/x-gzip | Greek data and README | 6bdb435486338ab3898daf0154e9c0bd | EL/: README.md (2 kB), EL.tsv (1 MB) |
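
The MD5 values listed for each archive can be used to check a download for corruption. The following is a minimal sketch, not part of the record itself: it assumes the five `.tgz` archives have been saved under their listed names in a single local directory, and the helper names `md5_of` and `check_archives` are hypothetical.

```python
import hashlib
from pathlib import Path

# MD5 values copied from the file listing in this record.
EXPECTED_MD5 = {
    "DE.tgz": "6512c4688561caa99bee516332873ad8",
    "PT.tgz": "7513dad43afc0343365e15471ef1a085",
    "PL.tgz": "1451c2c4d5be016ba3fdcad4851e7f67",
    "EU.tgz": "525686bdbc7b3da0f9e1331a8b3046eb",
    "EL.tgz": "6bdb435486338ab3898daf0154e9c0bd",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_archives(directory="."):
    """Map each archive name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name  # assumed local download location
        results[name] = path.is_file() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in check_archives().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

A file that is absent and a file whose checksum differs are both reported as failures, so a partial download is not mistaken for a verified one.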