Search Results
Creator:
Hajič, Jan , Bejček, Eduard , Bémová, Alevtina , Buráňová, Eva , Fučíková, Eva , Hajičová, Eva , Havelka, Jiří , Hlaváčová, Jaroslava , Homola, Petr , Ircing, Pavel , Kárník, Jiří , Kettnerová, Václava , Klyueva, Natalia , Kolářová, Veronika , Kučová, Lucie , Lopatková, Markéta , Mareček, David , Mikulová, Marie , Mírovský, Jiří , Nedoluzhko, Anna , Novák, Michal , Pajas, Petr , Panevová, Jarmila , Peterek, Nino , Poláková, Lucie , Popel, Martin , Popelka, Jan , Romportl, Jan , Rysová, Magdaléna , Semecký, Jiří , Sgall, Petr , Spoustová, Johanka , Straka, Milan , Straňák, Pavel , Synková, Pavlína , Ševčíková, Magda , Šindlerová, Jana , Štěpánek, Jan , Štěpánková, Barbora , Toman, Josef , Urešová, Zdeňka , Vidová Hladká, Barbora , Zeman, Daniel , Zikánová, Šárka , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
treebank , dependency , tectogrammatics , topic-focus articulation , multiword expressions , coreference , bridging relations , discourse , morphology , syntax , tokenization , lemmatization , semantic relations , lexical semantics , lexicon , valency , speech reconstruction , clauses , speech recognition , and spoken corpus
Language:
Czech
Description:
The Prague Dependency Treebank – Consolidated 1.0 (PDT-C 1.0, hereafter PDT-C) is a richly annotated, genre-diversified language resource: a consolidated release of the existing PDT corpora of Czech data, uniformly annotated using the standard PDT scheme. The PDT corpora included in PDT-C are: the Prague Dependency Treebank (the original PDT contents: written newspaper and journal texts from three genres); the Czech part of the Prague Czech-English Dependency Treebank (financial texts translated from English); the Prague Dependency Treebank of Spoken Czech (spoken data, including audio, transcripts, and multiple speech reconstruction annotation); and PDT-Faust (user-generated texts). PDT-C differs from the separately published original treebanks as follows: it is published as one package, to allow easier data handling for all the datasets; the data is enhanced with manual linguistic annotation at the morphological layer, and a new version of the morphological dictionary is enclosed; and a common valency lexicon for all four original parts is enclosed. The documentation provides two desktop tools for browsing and editing (TrEd and MEd), and the corpus is also available online for searching using PML-TQ.
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) , http://creativecommons.org/licenses/by-nc-sa/4.0/ , and PUB
Creator:
Bejček, Eduard , Hajičová, Eva , Hajič, Jan , Jínová, Pavlína , Kettnerová, Václava , Kolářová, Veronika , Mikulová, Marie , Mírovský, Jiří , Nedoluzhko, Anna , Panevová, Jarmila , Poláková, Lucie , Ševčíková, Magda , Štěpánek, Jan , and Zikánová, Šárka
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
treebank , dependency , tectogrammatics , topic-focus articulation , multiword expressions , coreference , bridging relations , discourse , and PDT
Language:
Czech
Description:
PDT 3.0 is a new version of the Prague Dependency Treebank. It contains a large amount of Czech text with complex and interlinked morphological (2 million words), syntactic (1.5 million words) and semantic (0.8 million words) annotation; in addition, certain properties of sentence information structure, multiword expressions, coreference, bridging relations and discourse relations are annotated at the semantic level.
Supported by the Grant Agency of the Czech Republic: grants P406/12/0658 "Coreference, discourse relations and information structure in a contrastive perspective", P406/2010/0875 "Computational Linguistics: Explicit description of language and annotated data focused on Czech", 405/09/0729 "From the structure of a sentence to textual relationships", and GPP406/12/P175 "Selected derivational relations for automatic processing of Czech";
the Ministry of Education, Youth and Sports of the Czech Republic: the KONTAKT project ME10018 "Towards a computational analysis of text structure" and the LINDAT-Clarin project LM2010013;
the Grant Agency of Charles University in Prague: GAUK 103609 "Textual (Inter-sentential) Relations and their Representation in a Language Corpus" and GAUK 4383/2009 "Methods of coreference resolution".
Rights:
Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) , http://creativecommons.org/licenses/by-nc-sa/3.0/ , and PUB
Creator:
Hajič, Jan , Bejček, Eduard , Bémová, Alevtina , Buráňová, Eva , Hajičová, Eva , Havelka, Jiří , Homola, Petr , Kárník, Jiří , Kettnerová, Václava , Klyueva, Natalia , Kolářová, Veronika , Kučová, Lucie , Lopatková, Markéta , Mikulová, Marie , Mírovský, Jiří , Nedoluzhko, Anna , Pajas, Petr , Panevová, Jarmila , Poláková, Lucie , Rysová, Magdaléna , Sgall, Petr , Spoustová, Johanka , Straňák, Pavel , Synková, Pavlína , Ševčíková, Magda , Štěpánek, Jan , Urešová, Zdeňka , Vidová Hladká, Barbora , Zeman, Daniel , Zikánová, Šárka , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
treebank , dependency , tectogrammatics , topic-focus articulation , multiword expressions , coreference , bridging relations , discourse , morphology , syntax , tokenization , lemmatization , clauses , semantics , semantic relations , lexical semantics , and lexicon
Language:
Czech
Description:
The Prague Dependency Treebank 3.5 is the 2018 edition of the core Prague Dependency Treebank (PDT). It contains all PDT annotation made at the Institute of Formal and Applied Linguistics under various projects between 1996 and 2018 on the original texts, i.e., all annotation from PDT 1.0, PDT 2.0, PDT 2.5, PDT 3.0, PDiT 1.0 and PDiT 2.0, plus corrections, new structure of basic documentation and new list of authors covering all previous editions. The Prague Dependency Treebank 3.5 (PDT 3.5) contains the same texts as the previous versions since 2.0; there are 49,431 annotated sentences (832,823 words) on all layers, from tectogrammatical annotation to syntax to morphology. There are additional annotated sentences for syntax and morphology; the totals for the lower layers of annotation are: 87,913 sentences with 1,502,976 words at the analytical layer (surface dependency syntax) and 115,844 sentences with 1,956,693 words at the morphological layer of annotation (these totals include the annotation with the higher layers annotated as well). Closely linked to the tectogrammatical layer is the annotation of sentence information structure, multiword expressions, coreference, bridging relations and discourse relations.
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) , http://creativecommons.org/licenses/by-nc-sa/4.0/ , and PUB
Creator:
Mikulová, Marie , Bémová, Alevtina , Hajič, Jan , Hajičová, Eva , Ircing, Pavel , Kolářová, Veronika , Lopatková, Markéta , Mareček, David , Mírovský, Jiří , Nedoluzhko, Anna , Pajas, Petr , Panevová, Jarmila , Peterek, Nino , Romportl, Jan , Sgall, Petr , Ševčíková, Magda , Štěpánek, Jan , Urešová, Zdeňka , and Žabokrtský, Zdeněk
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
spoken corpus , speech reconstruction , speech recognition , syntax , semantics , coreference , and audio
Language:
Czech
Description:
The Prague Dependency Treebank of Spoken Czech 2.0 (PDTSC 2.0) is a corpus of spoken language consisting of 742,316 tokens and 73,835 sentences, representing 7,324 minutes (over 120 hours) of spontaneous dialogs. The dialogs have been recorded, transcribed and edited in several interlinked layers: audio recordings, automatic and manual transcripts, and manually reconstructed text. These layers were part of the first version of the corpus (PDTSC 1.0). Version 2.0 adds automatic dependency parsing at the analytical layer and manual annotation of “deep” syntax at the tectogrammatical layer, which contains semantic roles and relations as well as annotation of coreference.
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) , http://creativecommons.org/licenses/by-nc-sa/4.0/ , and PUB
Creator:
Poláková, Lucie , Jínová, Pavlína , Zikánová, Šárka , Hajičová, Eva , Mírovský, Jiří , Nedoluzhko, Anna , Rysová, Magdaléna , Pavlíková, Veronika , Zdeňková, Jana , Pergler, Jiří , and Ocelák, Radek
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
discourse , treebank , and annotation
Language:
Czech
Description:
Annotation of discourse relations is a project related to the Prague Dependency Treebank 2.5. It represents a new manually annotated layer of language description, above the existing layers of the PDT, and it portrays linguistic phenomena from the perspective of discourse structure and coherence. Supported by GACR P406/12/0658, GACR P406/2010/0875, GACR 405/09/0729, Ministry of Education ME10018, and Ministry of Education LM2010013.
Rights:
Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) , http://creativecommons.org/licenses/by-nc-sa/3.0/ , and PUB
Creator:
Rysová, Magdaléna , Synková, Pavlína , Mírovský, Jiří , Hajičová, Eva , Nedoluzhko, Anna , Ocelák, Radek , Pergler, Jiří , Poláková, Lucie , Scheller, Veronika , Zdeňková, Jana , and Zikánová, Šárka
Publisher:
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:
text and corpus
Subject:
discourse , bridging relations , coreference , topic-focus articulation , treebank , dependency , tectogrammatics , and PDT
Language:
Czech
Description:
PDiT 2.0 is a new version of the Prague Discourse Treebank. It contains a complex annotation of discourse phenomena enriched by the annotation of secondary connectives.
Rights:
Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) , http://creativecommons.org/licenses/by-nc-sa/4.0/ , and PUB
Creator:
Nivre, Joakim , Agić, Željko , Ahrenberg, Lars , Antonsen, Lene , Aranzabe, Maria Jesus , Asahara, Masayuki , Ateyah, Luma , Attia, Mohammed , Atutxa, Aitziber , Badmaeva, Elena , Ballesteros, Miguel , Banerjee, Esha , Bank, Sebastian , Bauer, John , Bengoetxea, Kepa , Bhat, Riyaz Ahmad , Bick, Eckhard , Bosco, Cristina , Bouma, Gosse , Bowman, Sam , Burchardt, Aljoscha , Candito, Marie , Caron, Gauthier , Cebiroğlu Eryiğit, Gülşen , Celano, Giuseppe G. A. , Cetin, Savas , Chalub, Fabricio , Choi, Jinho , Cho, Yongseok , Cinková, Silvie , Çöltekin, Çağrı , Connor, Miriam , de Marneffe, Marie-Catherine , de Paiva, Valeria , Diaz de Ilarraza, Arantza , Dobrovoljc, Kaja , Dozat, Timothy , Droganova, Kira , Eli, Marhaba , Elkahky, Ali , Erjavec, Tomaž , Farkas, Richárd , Fernandez Alcalde, Hector , Foster, Jennifer , Freitas, Cláudia , Gajdošová, Katarína , Galbraith, Daniel , Garcia, Marcos , Ginter, Filip , Goenaga, Iakes , Gojenola, Koldo , Gökırmak, Memduh , Goldberg, Yoav , Gómez Guinovart, Xavier , Gonzáles Saavedra, Berta , Grioni, Matias , Grūzītis, Normunds , Guillaume, Bruno , Habash, Nizar , Hajič, Jan , Hajič jr., Jan , Hà Mỹ, Linh , Harris, Kim , Haug, Dag , Hladká, Barbora , Hlaváčová, Jaroslava , Hohle, Petter , Ion, Radu , Irimia, Elena , Johannsen, Anders , Jørgensen, Fredrik , Kaşıkara, Hüner , Kanayama, Hiroshi , Kanerva, Jenna , Kayadelen, Tolga , Kettnerová, Václava , Kirchner, Jesse , Kotsyba, Natalia , Krek, Simon , Kwak, Sookyoung , Laippala, Veronika , Lambertino, Lorenzo , Lando, Tatiana , Lê Hồng, Phương , Lenci, Alessandro , Lertpradit, Saran , Leung, Herman , Li, Cheuk Ying , Li, Josie , Ljubešić, Nikola , Loginova, Olga , Lyashevskaya, Olga , Lynn, Teresa , Macketanz, Vivien , Makazhanov, Aibek , Mandl, Michael , Manning, Christopher , Manurung, Ruli , Mărănduc, Cătălina , Mareček, David , Marheinecke, Katrin , Martínez Alonso, Héctor , Martins, André , Mašek, Jan , Matsumoto, Yuji , McDonald, Ryan , Mendonça, Gustavo , Missilä, Anna , 
Mititelu, Verginica , Miyao, Yusuke , Montemagni, Simonetta , More, Amir , Moreno Romero, Laura , Mori, Shunsuke , Moskalevskyi, Bohdan , Muischnek, Kadri , Mustafina, Nina , Müürisep, Kaili , Nainwani, Pinkey , Nedoluzhko, Anna , Nguyễn Thị, Lương , Nguyễn Thị Minh, Huyền , Nikolaev, Vitaly , Nitisaroj, Rattima , Nurmi, Hanna , Ojala, Stina , Osenova, Petya , Øvrelid, Lilja , Pascual, Elena , Passarotti, Marco , Perez, Cenel-Augusto , Perrier, Guy , Petrov, Slav , Piitulainen, Jussi , Pitler, Emily , Plank, Barbara , Popel, Martin , Pretkalniņa, Lauma , Prokopidis, Prokopis , Puolakainen, Tiina , Pyysalo, Sampo , Rademaker, Alexandre , Real, Livy , Reddy, Siva , Rehm, Georg , Rinaldi, Larissa , Rituma, Laura , Rosa, Rudolf , Rovati, Davide , Saleh, Shadi , Sanguinetti, Manuela , Saulīte, Baiba , Sawanakunanon, Yanin , Schuster, Sebastian , Seddah, Djamé , Seeker, Wolfgang , Seraji, Mojgan , Shakurova, Lena , Shen, Mo , Shimada, Atsuko , Shohibussirri, Muh , Silveira, Natalia , Simi, Maria , Simionescu, Radu , Simkó, Katalin , Šimková, Mária , Simov, Kiril , Smith, Aaron , Stella, Antonio , Strnadová, Jana , Suhr, Alane , Sulubacak, Umut , Szántó, Zsolt , Taji, Dima , Tanaka, Takaaki , Trosterud, Trond , Trukhina, Anna , Tsarfaty, Reut , Tyers, Francis , Uematsu, Sumire , Urešová, Zdeňka , Uria, Larraitz , Uszkoreit, Hans , van Noord, Gertjan , Varga, Viktor , Vincze, Veronika , Washington, Jonathan North , Yu, Zhuoran , Žabokrtský, Zdeněk , Zeman, Daniel , and Zhu, Hanzhi
Publisher:
Universal Dependencies Consortium
Type:
text and corpus
Subject:
treebank , dependency , syntax , morphology , harmonized annotation , interset , universal tagset , and stanford dependencies
Language:
Ancient Greek (to 1453) , Arabic , Basque , Bulgarian , Croatian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Gothic , Modern Greek (1453-) , Hebrew , Hindi , Hungarian , Indonesian , Irish , Italian , Japanese , Latin , Norwegian , Church Slavic , Persian , Polish , Portuguese , Romanian , Slovenian , Spanish , Swedish , Tamil , Catalan , Chinese , Galician , Kazakh , Latvian , Russian , Turkish , Coptic , Sanskrit , Slovak , Ukrainian , Uighur , Vietnamese , Belarusian , Korean , Lithuanian , Urdu , Northern Sami , Upper Sorbian , Russia Buriat , and Northern Kurdish
Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).
This release contains the test data used in the CoNLL 2017 shared task on parsing Universal Dependencies. Because of the shared task, the test data was kept hidden and not released together with the training and development data of UD 2.0. This release therefore complements the UD 2.0 release (http://hdl.handle.net/11234/1-1983), making it a full release of UD treebanks. In addition, the present release contains 18 new parallel test sets and 4 test sets in surprise languages. It also includes the development data already released with UD 2.0. Unlike regular UD releases, this one uses the folder-file structure that was visible to the systems participating in the shared task.
Rights:
Licence Universal Dependencies v2.0 , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.0 , and PUB
Creator:
Nivre, Joakim , Agić, Željko , Ahrenberg, Lars , Antonsen, Lene , Aranzabe, Maria Jesus , Asahara, Masayuki , Ateyah, Luma , Attia, Mohammed , Atutxa, Aitziber , Augustinus, Liesbeth , Badmaeva, Elena , Ballesteros, Miguel , Banerjee, Esha , Bank, Sebastian , Barbu Mititelu, Verginica , Bauer, John , Bengoetxea, Kepa , Bhat, Riyaz Ahmad , Bick, Eckhard , Bobicev, Victoria , Börstell, Carl , Bosco, Cristina , Bouma, Gosse , Bowman, Sam , Burchardt, Aljoscha , Candito, Marie , Caron, Gauthier , Cebiroğlu Eryiğit, Gülşen , Celano, Giuseppe G. A. , Cetin, Savas , Chalub, Fabricio , Choi, Jinho , Cinková, Silvie , Çöltekin, Çağrı , Connor, Miriam , Davidson, Elizabeth , de Marneffe, Marie-Catherine , de Paiva, Valeria , Diaz de Ilarraza, Arantza , Dirix, Peter , Dobrovoljc, Kaja , Dozat, Timothy , Droganova, Kira , Dwivedi, Puneet , Eli, Marhaba , Elkahky, Ali , Erjavec, Tomaž , Farkas, Richárd , Fernandez Alcalde, Hector , Foster, Jennifer , Freitas, Cláudia , Gajdošová, Katarína , Galbraith, Daniel , Garcia, Marcos , Gärdenfors, Moa , Gerdes, Kim , Ginter, Filip , Goenaga, Iakes , Gojenola, Koldo , Gökırmak, Memduh , Goldberg, Yoav , Gómez Guinovart, Xavier , Gonzáles Saavedra, Berta , Grioni, Matias , Grūzītis, Normunds , Guillaume, Bruno , Habash, Nizar , Hajič, Jan , Hajič jr., Jan , Hà Mỹ, Linh , Harris, Kim , Haug, Dag , Hladká, Barbora , Hlaváčová, Jaroslava , Hociung, Florinel , Hohle, Petter , Ion, Radu , Irimia, Elena , Jelínek, Tomáš , Johannsen, Anders , Jørgensen, Fredrik , Kaşıkara, Hüner , Kanayama, Hiroshi , Kanerva, Jenna , Kayadelen, Tolga , Kettnerová, Václava , Kirchner, Jesse , Kotsyba, Natalia , Krek, Simon , Laippala, Veronika , Lambertino, Lorenzo , Lando, Tatiana , Lee, John , Lê Hồng, Phương , Lenci, Alessandro , Lertpradit, Saran , Leung, Herman , Li, Cheuk Ying , Li, Josie , Li, Keying , Ljubešić, Nikola , Loginova, Olga , Lyashevskaya, Olga , Lynn, Teresa , Macketanz, Vivien , Makazhanov, Aibek , Mandl, Michael , Manning, Christopher , 
Mărănduc, Cătălina , Mareček, David , Marheinecke, Katrin , Martínez Alonso, Héctor , Martins, André , Mašek, Jan , Matsumoto, Yuji , McDonald, Ryan , Mendonça, Gustavo , Miekka, Niko , Missilä, Anna , Mititelu, Cătălin , Miyao, Yusuke , Montemagni, Simonetta , More, Amir , Moreno Romero, Laura , Mori, Shinsuke , Moskalevskyi, Bohdan , Muischnek, Kadri , Müürisep, Kaili , Nainwani, Pinkey , Nedoluzhko, Anna , Nešpore-Bērzkalne, Gunta , Nguyễn Thị, Lương , Nguyễn Thị Minh, Huyền , Nikolaev, Vitaly , Nurmi, Hanna , Ojala, Stina , Osenova, Petya , Östling, Robert , Øvrelid, Lilja , Pascual, Elena , Passarotti, Marco , Perez, Cenel-Augusto , Perrier, Guy , Petrov, Slav , Piitulainen, Jussi , Pitler, Emily , Plank, Barbara , Popel, Martin , Pretkalniņa, Lauma , Prokopidis, Prokopis , Puolakainen, Tiina , Pyysalo, Sampo , Rademaker, Alexandre , Ramasamy, Loganathan , Rama, Taraka , Ravishankar, Vinit , Real, Livy , Reddy, Siva , Rehm, Georg , Rinaldi, Larissa , Rituma, Laura , Romanenko, Mykhailo , Rosa, Rudolf , Rovati, Davide , Sagot, Benoît , Saleh, Shadi , Samardžić, Tanja , Sanguinetti, Manuela , Saulīte, Baiba , Schuster, Sebastian , Seddah, Djamé , Seeker, Wolfgang , Seraji, Mojgan , Shen, Mo , Shimada, Atsuko , Sichinava, Dmitry , Silveira, Natalia , Simi, Maria , Simionescu, Radu , Simkó, Katalin , Šimková, Mária , Simov, Kiril , Smith, Aaron , Stella, Antonio , Straka, Milan , Strnadová, Jana , Suhr, Alane , Sulubacak, Umut , Szántó, Zsolt , Taji, Dima , Tanaka, Takaaki , Trosterud, Trond , Trukhina, Anna , Tsarfaty, Reut , Tyers, Francis , Uematsu, Sumire , Urešová, Zdeňka , Uria, Larraitz , Uszkoreit, Hans , Vajjala, Sowmya , van Niekerk, Daniel , van Noord, Gertjan , Varga, Viktor , Villemonte de la Clergerie, Eric , Vincze, Veronika , Wallin, Lars , Washington, Jonathan North , Wirén, Mats , Wong, Tak-sum , Yu, Zhuoran , Žabokrtský, Zdeněk , Zeldes, Amir , Zeman, Daniel , and Zhu, Hanzhi
Publisher:
Universal Dependencies Consortium
Type:
text and corpus
Subject:
treebank , dependency , syntax , morphology , harmonized annotation , interset , universal tagset , and stanford dependencies
Language:
Ancient Greek (to 1453) , Arabic , Basque , Bulgarian , Croatian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Gothic , Modern Greek (1453-) , Hebrew , Hindi , Hungarian , Indonesian , Irish , Italian , Japanese , Latin , Norwegian , Church Slavic , Persian , Polish , Portuguese , Romanian , Slovenian , Spanish , Swedish , Tamil , Catalan , Chinese , Galician , Kazakh , Latvian , Russian , Turkish , Coptic , Sanskrit , Slovak , Ukrainian , Uighur , Vietnamese , Belarusian , Korean , Lithuanian , Urdu , Russia Buriat , Northern Kurdish , Northern Sami , Upper Sorbian , Afrikaans , Yue Chinese , Marathi , Serbian , Swedish Sign Language , and Telugu
Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).
Rights:
Licence Universal Dependencies v2.1 , https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.1 , and PUB
Creator:
Zeman, Daniel , Nivre, Joakim , Abrams, Mitchell , Ackermann, Elia , Aepli, Noëmi , Aghaei, Hamid , Agić, Željko , Ahmadi, Amir , Ahrenberg, Lars , Ajede, Chika Kennedy , Aleksandravičiūtė, Gabrielė , Alfina, Ika , Algom, Avner , Andersen, Erik , Antonsen, Lene , Aplonova, Katya , Aquino, Angelina , Aragon, Carolina , Aranes, Glyd , Aranzabe, Maria Jesus , Arıcan, Bilge Nas , Arnardóttir, Þórunn , Arutie, Gashaw , Arwidarasti, Jessica Naraiswari , Asahara, Masayuki , Aslan, Deniz Baran , Asmazoğlu, Cengiz , Ateyah, Luma , Atmaca, Furkan , Attia, Mohammed , Atutxa, Aitziber , Augustinus, Liesbeth , Badmaeva, Elena , Balasubramani, Keerthana , Ballesteros, Miguel , Banerjee, Esha , Bank, Sebastian , Barbu Mititelu, Verginica , Barkarson, Starkaður , Basile, Rodolfo , Basmov, Victoria , Batchelor, Colin , Bauer, John , Bedir, Seyyit Talha , Bengoetxea, Kepa , Ben Moshe, Yifat , Berk, Gözde , Berzak, Yevgeni , Bhat, Irshad Ahmad , Bhat, Riyaz Ahmad , Biagetti, Erica , Bick, Eckhard , Bielinskienė, Agnė , Bjarnadóttir, Kristín , Blokland, Rogier , Bobicev, Victoria , Boizou, Loïc , Borges Völker, Emanuel , Börstell, Carl , Bosco, Cristina , Bouma, Gosse , Bowman, Sam , Boyd, Adriane , Braggaar, Anouck , Brokaitė, Kristina , Burchardt, Aljoscha , Candito, Marie , Caron, Bernard , Caron, Gauthier , Cassidy, Lauren , Cavalcanti, Tatiana , Cebiroğlu Eryiğit, Gülşen , Cecchini, Flavio Massimiliano , Celano, Giuseppe G. A. , Čéplö, Slavomír , Cesur, Neslihan , Cetin, Savas , Çetinoğlu, Özlem , Chalub, Fabricio , Chauhan, Shweta , Chi, Ethan , Chika, Taishi , Cho, Yongseok , Choi, Jinho , Chun, Jayeol , Chung, Juyeon , Cignarella, Alessandra T. 
, Cinková, Silvie , Collomb, Aurélie , Çöltekin, Çağrı , Connor, Miriam , Corbetta, Daniela , Courtin, Marine , Cristescu, Mihaela , Daniel, Philemon , Davidson, Elizabeth , Dehouck, Mathieu , de Laurentiis, Martina , de Marneffe, Marie-Catherine , de Paiva, Valeria , Derin, Mehmet Oguz , de Souza, Elvis , Diaz de Ilarraza, Arantza , Dickerson, Carly , Dinakaramani, Arawinda , Di Nuovo, Elisa , Dione, Bamba , Dirix, Peter , Dobrovoljc, Kaja , Dozat, Timothy , Droganova, Kira , Dwivedi, Puneet , Eckhoff, Hanne , Eiche, Sandra , Eli, Marhaba , Elkahky, Ali , Ephrem, Binyam , Erina, Olga , Erjavec, Tomaž , Etienne, Aline , Evelyn, Wograine , Facundes, Sidney , Farkas, Richárd , Favero, Federica , Ferdaousi, Jannatul , Fernanda, Marília , Fernandez Alcalde, Hector , Foster, Jennifer , Freitas, Cláudia , Fujita, Kazunori , Gajdošová, Katarína , Galbraith, Daniel , Gamba, Federica , Garcia, Marcos , Gärdenfors, Moa , Garza, Sebastian , Gerardi, Fabrício Ferraz , Gerdes, Kim , Ginter, Filip , Godoy, Gustavo , Goenaga, Iakes , Gojenola, Koldo , Gökırmak, Memduh , Goldberg, Yoav , Gómez Guinovart, Xavier , González Saavedra, Berta , Griciūtė, Bernadeta , Grioni, Matias , Grobol, Loïc , Grūzītis, Normunds , Guillaume, Bruno , Guillot-Barbance, Céline , Güngör, Tunga , Habash, Nizar , Hafsteinsson, Hinrik , Hajič, Jan , Hajič jr., Jan , Hämäläinen, Mika , Hà Mỹ, Linh , Han, Na-Rae , Hanifmuti, Muhammad Yudistira , Harada, Takahiro , Hardwick, Sam , Harris, Kim , Haug, Dag , Heinecke, Johannes , Hellwig, Oliver , Hennig, Felix , Hladká, Barbora , Hlaváčová, Jaroslava , Hociung, Florinel , Hohle, Petter , Hwang, Jena , Ikeda, Takumi , Ingason, Anton Karl , Ion, Radu , Irimia, Elena , Ishola, Ọlájídé , Ito, Kaoru , Jannat, Siratun , Jelínek, Tomáš , Jha, Apoorva , Johannsen, Anders , Jónsdóttir, Hildur , Jørgensen, Fredrik , Juutinen, Markus , K, Sarveswaran , Kaşıkara, Hüner , Kaasen, Andre , Kabaeva, Nadezhda , Kahane, Sylvain , Kanayama, Hiroshi , Kanerva, Jenna , Kara, 
Neslihan , Karahóǧa, Ritván , Katz, Boris , Kayadelen, Tolga , Kenney, Jessica , Kettnerová, Václava , Kirchner, Jesse , Klementieva, Elena , Klyachko, Elena , Köhn, Arne , Köksal, Abdullatif , Kopacewicz, Kamil , Korkiakangas, Timo , Köse, Mehmet , Kotsyba, Natalia , Kovalevskaitė, Jolanta , Krek, Simon , Krishnamurthy, Parameswari , Kübler, Sandra , Kuyrukçu, Oğuzhan , Kuzgun, Aslı , Kwak, Sookyoung , Laippala, Veronika , Lam, Lucia , Lambertino, Lorenzo , Lando, Tatiana , Larasati, Septina Dian , Lavrentiev, Alexei , Lee, John , Lê Hồng, Phương , Lenci, Alessandro , Lertpradit, Saran , Leung, Herman , Levina, Maria , Li, Cheuk Ying , Li, Josie , Li, Keying , Li, Yuan , Lim, KyungTae , Lima Padovani, Bruna , Lindén, Krister , Ljubešić, Nikola , Loginova, Olga , Lusito, Stefano , Luthfi, Andry , Luukko, Mikko , Lyashevskaya, Olga , Lynn, Teresa , Macketanz, Vivien , Mahamdi, Menel , Maillard, Jean , Makazhanov, Aibek , Mandl, Michael , Manning, Christopher , Manurung, Ruli , Marşan, Büşra , Mărănduc, Cătălina , Mareček, David , Marheinecke, Katrin , Markantonatou, Stella , Martínez Alonso, Héctor , Martín Rodríguez, Lorena , Martins, André , Mašek, Jan , Matsuda, Hiroshi , Matsumoto, Yuji , Mazzei, Alessandro , McDonald, Ryan , McGuinness, Sarah , Mendonça, Gustavo , Merzhevich, Tatiana , Miekka, Niko , Mischenkova, Karina , Misirpashayeva, Margarita , Missilä, Anna , Mititelu, Cătălin , Mitrofan, Maria , Miyao, Yusuke , Mojiri Foroushani, AmirHossein , Molnár, Judit , Moloodi, Amirsaeid , Montemagni, Simonetta , More, Amir , Moreno Romero, Laura , Moretti, Giovanni , Mori, Keiko Sophie , Mori, Shinsuke , Morioka, Tomohiko , Moro, Shigeki , Mortensen, Bjartur , Moskalevskyi, Bohdan , Muischnek, Kadri , Munro, Robert , Murawaki, Yugo , Müürisep, Kaili , Nainwani, Pinkey , Nakhlé, Mariam , Navarro Horñiacek, Juan Ignacio , Nedoluzhko, Anna , Nešpore-Bērzkalne, Gunta , Nevaci, Manuela , Nguyễn Thị, Lương , Nguyễn Thị Minh, Huyền , Nikaido, Yoshihiro , Nikolaev, 
Vitaly , Nitisaroj, Rattima , Nourian, Alireza , Nurmi, Hanna , Ojala, Stina , Ojha, Atul Kr. , Olúòkun, Adédayọ̀ , Omura, Mai , Onwuegbuzia, Emeka , Ordan, Noam , Osenova, Petya , Östling, Robert , Øvrelid, Lilja , Özateş, Şaziye Betül , Özçelik, Merve , Özgür, Arzucan , Öztürk Başaran, Balkız , Paccosi, Teresa , Palmero Aprosio, Alessio , Park, Hyunji Hayley , Partanen, Niko , Pascual, Elena , Passarotti, Marco , Patejuk, Agnieszka , Paulino-Passos, Guilherme , Pedonese, Giulia , Peljak-Łapińska, Angelika , Peng, Siyao , Perez, Cenel-Augusto , Perkova, Natalia , Perrier, Guy , Petrov, Slav , Petrova, Daria , Peverelli, Andrea , Phelan, Jason , Piitulainen, Jussi , Pirinen, Tommi A , Pitler, Emily , Plank, Barbara , Poibeau, Thierry , Ponomareva, Larisa , Popel, Martin , Pretkalniņa, Lauma , Prévost, Sophie , Prokopidis, Prokopis , Przepiórkowski, Adam , Puolakainen, Tiina , Pyysalo, Sampo , Qi, Peng , Rääbis, Andriela , Rademaker, Alexandre , Rahoman, Mizanur , Rama, Taraka , Ramasamy, Loganathan , Ramisch, Carlos , Rashel, Fam , Rasooli, Mohammad Sadegh , Ravishankar, Vinit , Real, Livy , Rebeja, Petru , Reddy, Siva , Regnault, Mathilde , Rehm, Georg , Riabov, Ivan , Rießler, Michael , Rimkutė, Erika , Rinaldi, Larissa , Rituma, Laura , Rizqiyah, Putri , Rocha, Luisa , Rögnvaldsson, Eiríkur , Romanenko, Mykhailo , Rosa, Rudolf , Roșca, Valentin , Rovati, Davide , Rozonoyer, Ben , Rudina, Olga , Rueter, Jack , Rúnarsson, Kristján , Sadde, Shoval , Safari, Pegah , Sagot, Benoît , Sahala, Aleksi , Saleh, Shadi , Salomoni, Alessio , Samardžić, Tanja , Samson, Stephanie , Sanguinetti, Manuela , Sanıyar, Ezgi , Särg, Dage , Saulīte, Baiba , Sawanakunanon, Yanin , Saxena, Shefali , Scannell, Kevin , Scarlata, Salvatore , Schneider, Nathan , Schuster, Sebastian , Schwartz, Lane , Seddah, Djamé , Seeker, Wolfgang , Seraji, Mojgan , Shahzadi, Syeda , Shen, Mo , Shimada, Atsuko , Shirasu, Hiroyuki , Shishkina, Yana , Shohibussirri, Muh , Sichinava, Dmitry , Siewert, Janine 
, Sigurðsson, Einar Freyr , Silveira, Aline , Silveira, Natalia , Simi, Maria , Simionescu, Radu , Simkó, Katalin , Šimková, Mária , Simov, Kiril , Skachedubova, Maria , Smith, Aaron , Soares-Bastos, Isabela , Sourov, Shafi , Spadine, Carolyn , Sprugnoli, Rachele , Stamou, Vivian , Steingrímsson, Steinþór , Stella, Antonio , Straka, Milan , Strickland, Emmett , Strnadová, Jana , Suhr, Alane , Sulestio, Yogi Lesmana , Sulubacak, Umut , Suzuki, Shingo , Swanson, Daniel , Szántó, Zsolt , Taguchi, Chihiro , Taji, Dima , Takahashi, Yuta , Tamburini, Fabio , Tan, Mary Ann C. , Tanaka, Takaaki , Tanaya, Dipta , Tavoni, Mirko , Tella, Samson , Tellier, Isabelle , Testori, Marinella , Thomas, Guillaume , Tonelli, Sara , Torga, Liisi , Toska, Marsida , Trosterud, Trond , Trukhina, Anna , Tsarfaty, Reut , Türk, Utku , Tyers, Francis , Uematsu, Sumire , Untilov, Roman , Urešová, Zdeňka , Uria, Larraitz , Uszkoreit, Hans , Utka, Andrius , Vagnoni, Elena , Vajjala, Sowmya , van der Goot, Rob , Vanhove, Martine , van Niekerk, Daniel , van Noord, Gertjan , Varga, Viktor , Vedenina, Uliana , Villemonte de la Clergerie, Eric , Vincze, Veronika , Vlasova, Natalia , Wakasa, Aya , Wallenberg, Joel C. , Wallin, Lars , Walsh, Abigail , Wang, Jing Xian , Washington, Jonathan North , Wendt, Maximilan , Widmer, Paul , Wigderson, Shira , Wijono, Sri Hartati , Williams, Seyi , Wirén, Mats , Wittern, Christian , Woldemariam, Tsegay , Wong, Tak-sum , Wróblewska, Alina , Yako, Mary , Yamashita, Kayo , Yamazaki, Naoki , Yan, Chunxiao , Yasuoka, Koichi , Yavrumyan, Marat M. , Yenice, Arife Betül , Yıldız, Olcay Taner , Yu, Zhuoran , Yuliawati, Arlisa , Žabokrtský, Zdeněk , Zahra, Shorouq , Zeldes, Amir , Zhou, He , Zhu, Hanzhi , Zhuravleva, Anna , and Ziane, Rayan
Publisher:
Universal Dependencies Consortium
Type:
text and corpus
Subject:
treebank , dependency , syntax , morphology , harmonized annotation , interset , universal tagset , and stanford dependencies
Language:
Ancient Greek (to 1453) , Arabic , Basque , Bulgarian , Croatian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Gothic , Modern Greek (1453-) , Hebrew , Hindi , Hungarian , Indonesian , Irish , Italian , Japanese , Latin , Norwegian , Church Slavic , Persian , Polish , Portuguese , Romanian , Slovenian , Spanish , Swedish , Tamil , Catalan , Chinese , Galician , Kazakh , Latvian , Russian , Turkish , Coptic , Sanskrit , Slovak , Ukrainian , Uighur , Vietnamese , Belarusian , Korean , Lithuanian , Urdu , Russia Buriat , Northern Kurdish , Northern Sami , Upper Sorbian , Afrikaans , Yue Chinese , Marathi , Serbian , Swedish Sign Language , Telugu , Amharic , Armenian , Breton , Faroese , Komi-Zyrian , Nigerian Pidgin , Old French (842-ca. 1400) , Tagalog , Thai , Warlpiri , Yoruba , Akkadian , Bambara , Erzya , Maltese , Welsh , Wolof , Assyrian Neo-Aramaic , Literary Chinese , Old Russian , Karelian , Mbyá Guaraní , Bhojpuri , Komi-Permyak , Livvi , Moksha , Scottish Gaelic , Skolt Sami , Swiss German , Albanian , Icelandic , Akuntsu , Apurinã , Chukot , Khunsari , Manx , Mundurukú , Nayini , Old Turkish , Soi , South Levantine Arabic , Tupinambá , Beja , Western Frisian , Guajajára , Urubú-Kaapor , Kangri , K'iche' , Low German , Makuráp , Central Siberian Yupik , Western Armenian , Bengali , Javanese , Karo (Brazil) , Ligurian , Neapolitan , Tatar , Xibe , Yakut , Ancient Hebrew , Cebuano , Guarani , Hittite , Madi , Emerillon , and Umbrian
Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).
Rights:
Licence Universal Dependencies v2.10 , https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.10 , and PUB
Creator:
Zeman, Daniel , Nivre, Joakim , Abrams, Mitchell , Ackermann, Elia , Aepli, Noëmi , Aghaei, Hamid , Agić, Željko , Ahmadi, Amir , Ahrenberg, Lars , Ajede, Chika Kennedy , Akkurt, Salih Furkan , Aleksandravičiūtė, Gabrielė , Alfina, Ika , Algom, Avner , Alzetta, Chiara , Andersen, Erik , Antonsen, Lene , Aplonova, Katya , Aquino, Angelina , Aragon, Carolina , Aranes, Glyd , Aranzabe, Maria Jesus , Arıcan, Bilge Nas , Arnardóttir, Þórunn , Arutie, Gashaw , Arwidarasti, Jessica Naraiswari , Asahara, Masayuki , Ásgeirsdóttir, Katla , Aslan, Deniz Baran , Asmazoğlu, Cengiz , Ateyah, Luma , Atmaca, Furkan , Attia, Mohammed , Atutxa, Aitziber , Augustinus, Liesbeth , Badmaeva, Elena , Balasubramani, Keerthana , Ballesteros, Miguel , Banerjee, Esha , Bank, Sebastian , Barbu Mititelu, Verginica , Barkarson, Starkaður , Basile, Rodolfo , Basmov, Victoria , Batchelor, Colin , Bauer, John , Bedir, Seyyit Talha , Belieni, Juan , Bengoetxea, Kepa , Ben Moshe, Yifat , Berk, Gözde , Berzak, Yevgeni , Bhat, Irshad Ahmad , Bhat, Riyaz Ahmad , Biagetti, Erica , Bick, Eckhard , Bielinskienė, Agnė , Bjarnadóttir, Kristín , Blokland, Rogier , Bobicev, Victoria , Boizou, Loïc , Borges Völker, Emanuel , Börstell, Carl , Bosco, Cristina , Bouma, Gosse , Bowman, Sam , Boyd, Adriane , Braggaar, Anouck , Brokaitė, Kristina , Burchardt, Aljoscha , Candito, Marie , Caron, Bernard , Caron, Gauthier , Cassidy, Lauren , Castro, Maria Clara , Cavalcanti, Tatiana , Cebiroğlu Eryiğit, Gülşen , Cecchini, Flavio Massimiliano , Celano, Giuseppe G. A. , Čéplö, Slavomír , Cesur, Neslihan , Cetin, Savas , Çetinoğlu, Özlem , Chalub, Fabricio , Chamila, Liyanage , Chauhan, Shweta , Chi, Ethan , Chika, Taishi , Cho, Yongseok , Choi, Jinho , Chun, Jayeol , Chung, Juyeon , Cignarella, Alessandra T. , Cinková, Silvie , Collomb, Aurélie , Çöltekin, Çağrı , Connor, Miriam , Corbetta, Daniela , Courtin, Marine , Cristescu, Mihaela , Daniel, Philemon , Davidson, Elizabeth , de Alencar, Leonel Figueiredo , Dehouck, Mathieu , de Laurentiis, Martina , de Marneffe, Marie-Catherine , de Paiva, Valeria , Derin, Mehmet Oguz , de Souza, Elvis , Diaz de Ilarraza, Arantza , Dickerson, Carly , Dinakaramani, Arawinda , Di Nuovo, Elisa , Dione, Bamba , Dirix, Peter , Dobrovoljc, Kaja , Dozat, Timothy , Droganova, Kira , Dwivedi, Puneet , Ebert, Christian , Eckhoff, Hanne , Eiche, Sandra , Eli, Marhaba , Elkahky, Ali , Ephrem, Binyam , Erina, Olga , Erjavec, Tomaž , Etienne, Aline , Evelyn, Wograine , Facundes, Sidney , Farkas, Richárd , Favero, Federica , Ferdaousi, Jannatul , Fernanda, Marília , Fernandez Alcalde, Hector , Foster, Jennifer , Freitas, Cláudia , Fujita, Kazunori , Gajdošová, Katarína , Galbraith, Daniel , Gamba, Federica , Garcia, Marcos , Gärdenfors, Moa , Garza, Sebastian , Gerardi, Fabrício Ferraz , Gerdes, Kim , Ginter, Filip , Godoy, Gustavo , Goenaga, Iakes , Gojenola, Koldo , Gökırmak, Memduh , Goldberg, Yoav , Gómez Guinovart, Xavier , González Saavedra, Berta , Griciūtė, Bernadeta , Grioni, Matias , Grobol, Loïc , Grūzītis, Normunds , Guillaume, Bruno , Guillot-Barbance, Céline , Güngör, Tunga , Habash, Nizar , Hafsteinsson, Hinrik , Hajič, Jan , Hajič jr., Jan , Hämäläinen, Mika , Hà Mỹ, Linh , Han, Na-Rae , Hanifmuti, Muhammad Yudistira , Harada, Takahiro , Hardwick, Sam , Harris, Kim , Haug, Dag , Heinecke, Johannes , Hellwig, Oliver , Hennig, Felix , Hladká, Barbora , Hlaváčová, Jaroslava , Hociung, Florinel , Hohle, Petter , Huerta Mendez, Marivel , Hwang, Jena , Ikeda, Takumi , Ingason, Anton Karl , Ion, Radu , Irimia, Elena , Ishola, Ọlájídé , Islamaj, Artan , Ito, Kaoru , Jannat, Siratun , Jelínek, Tomáš , Jha, Apoorva , Jiang, Katharine , Johannsen, Anders , Jónsdóttir, Hildur , Jørgensen, Fredrik , Juutinen, Markus , Kaşıkara, Hüner , Kaasen, Andre , Kabaeva, Nadezhda , Kahane, Sylvain , Kanayama, Hiroshi , Kanerva, Jenna , Kara, Neslihan , Karahóǧa, Ritván , Katz, Boris , Kayadelen, Tolga , Kengatharaiyer, Sarveswaran , Kenney, Jessica , Kettnerová, Václava , Kirchner, Jesse , Klementieva, Elena , Klyachko, Elena , Köhn, Arne , Köksal, Abdullatif , Kopacewicz, Kamil , Korkiakangas, Timo , Köse, Mehmet , Koshevoy, Alexey , Kotsyba, Natalia , Kovalevskaitė, Jolanta , Krek, Simon , Krishnamurthy, Parameswari , Kübler, Sandra , Kuqi, Adrian , Kuyrukçu, Oğuzhan , Kuzgun, Aslı , Kwak, Sookyoung , Laippala, Veronika , Lam, Lucia , Lambertino, Lorenzo , Lando, Tatiana , Larasati, Septina Dian , Lavrentiev, Alexei , Lee, John , Lê Hồng, Phương , Lenci, Alessandro , Lertpradit, Saran , Leung, Herman , Levina, Maria , Li, Cheuk Ying , Li, Josie , Li, Keying , Li, Yixuan , Li, Yuan , Lim, KyungTae , Lima Padovani, Bruna , Lindén, Krister , Ljubešić, Nikola , Loginova, Olga , Lusito, Stefano , Luthfi, Andry , Luukko, Mikko , Lyashevskaya, Olga , Lynn, Teresa , Macketanz, Vivien , Mahamdi, Menel , Maillard, Jean , Makarchuk, Ilya , Makazhanov, Aibek , Mandl, Michael , Manning, Christopher , Manurung, Ruli , Marşan, Büşra , Mărănduc, Cătălina , Mareček, David , Marheinecke, Katrin , Markantonatou, Stella , Martínez Alonso, Héctor , Martín Rodríguez, Lorena , Martins, André , Mašek, Jan , Matsuda, Hiroshi , Matsumoto, Yuji , Mazzei, Alessandro , McDonald, Ryan , McGuinness, Sarah , Mendonça, Gustavo , Merzhevich, Tatiana , Miekka, Niko , Mischenkova, Karina , Misirpashayeva, Margarita , Missilä, Anna , Mititelu, Cătălin , Mitrofan, Maria , Miyao, Yusuke , Mojiri Foroushani, AmirHossein , Molnár, Judit , Moloodi, Amirsaeid , Montemagni, Simonetta , More, Amir , Moreno Romero, Laura , Moretti, Giovanni , Mori, Keiko Sophie , Mori, Shinsuke , Morioka, Tomohiko , Moro, Shigeki , Mortensen, Bjartur , Moskalevskyi, Bohdan , Muischnek, Kadri , Munro, Robert , Murawaki, Yugo , Müürisep, Kaili , Nainwani, Pinkey , Nakhlé, Mariam , Navarro Horñiacek, Juan Ignacio , Nedoluzhko, Anna , Nešpore-Bērzkalne, Gunta , Nevaci, Manuela , Nguyễn Thị, Lương , Nguyễn Thị Minh, Huyền , Nikaido, Yoshihiro , Nikolaev, Vitaly , Nitisaroj, Rattima , Nourian, Alireza , Nurmi, Hanna , Ojala, Stina , Ojha, Atul Kr. , Óladóttir, Hulda , Olúòkun, Adédayọ̀ , Omura, Mai , Onwuegbuzia, Emeka , Ordan, Noam , Osenova, Petya , Östling, Robert , Øvrelid, Lilja , Özateş, Şaziye Betül , Özçelik, Merve , Özgür, Arzucan , Öztürk Başaran, Balkız , Paccosi, Teresa , Palmero Aprosio, Alessio , Panova, Anastasia , Park, Hyunji Hayley , Partanen, Niko , Pascual, Elena , Passarotti, Marco , Patejuk, Agnieszka , Paulino-Passos, Guilherme , Pedonese, Giulia , Peljak-Łapińska, Angelika , Peng, Siyao , Perez, Cenel-Augusto , Perkova, Natalia , Perrier, Guy , Petrov, Slav , Petrova, Daria , Peverelli, Andrea , Phelan, Jason , Piitulainen, Jussi , Pintucci, Rodrigo , Pirinen, Tommi A , Pitler, Emily , Plamada, Magdalena , Plank, Barbara , Poibeau, Thierry , Ponomareva, Larisa , Popel, Martin , Pretkalniņa, Lauma , Prévost, Sophie , Prokopidis, Prokopis , Przepiórkowski, Adam , Pugh, Robert , Puolakainen, Tiina , Pyysalo, Sampo , Qi, Peng , Rääbis, Andriela , Rademaker, Alexandre , Rahoman, Mizanur , Rama, Taraka , Ramasamy, Loganathan , Ramisch, Carlos , Rashel, Fam , Rasooli, Mohammad Sadegh , Ravishankar, Vinit , Real, Livy , Rebeja, Petru , Reddy, Siva , Regnault, Mathilde , Rehm, Georg , Riabov, Ivan , Rießler, Michael , Rimkutė, Erika , Rinaldi, Larissa , Rituma, Laura , Rizqiyah, Putri , Rocha, Luisa , Rögnvaldsson, Eiríkur , Roksandic, Ivan , Romanenko, Mykhailo , Rosa, Rudolf , Roșca, Valentin , Rovati, Davide , Rozonoyer, Ben , Rudina, Olga , Rueter, Jack , Rúnarsson, Kristján , Sadde, Shoval , Safari, Pegah , Sagot, Benoît , Sahala, Aleksi , Saleh, Shadi , Salomoni, Alessio , Samardžić, Tanja , Samson, Stephanie , Sanguinetti, Manuela , Sanıyar, Ezgi , Särg, Dage , Sartor, Marta , Sasaki, Mitsuya , Saulīte, Baiba , Sawanakunanon, Yanin , Saxena, Shefali , Scannell, Kevin , Scarlata, Salvatore , Schneider, Nathan , Schuster, Sebastian , Schwartz, Lane , Seddah, Djamé , Seeker, Wolfgang , Seraji, Mojgan , Shahzadi, Syeda , Shen, Mo , Shimada, Atsuko , Shirasu, Hiroyuki , Shishkina, Yana , Shohibussirri, Muh , Shvedova, Maria , Siewert, Janine , Sigurðsson, Einar Freyr , Silva, João Ricardo , Silveira, Aline , Silveira, Natalia , Simi, Maria , Simionescu, Radu , Simkó, Katalin , Šimková, Mária , Símonarson, Haukur Barri , Simov, Kiril , Sitchinava, Dmitri , Skachedubova, Maria , Smith, Aaron , Soares-Bastos, Isabela , Sonnenhauser, Barbara , Sourov, Shafi , Spadine, Carolyn , Sprugnoli, Rachele , Stamou, Vivian , Steingrímsson, Steinþór , Stella, Antonio , Stephen, Abishek , Straka, Milan , Strickland, Emmett , Strnadová, Jana , Suhr, Alane , Sulestio, Yogi Lesmana , Sulubacak, Umut , Suzuki, Shingo , Swanson, Daniel , Szántó, Zsolt , Taguchi, Chihiro , Taji, Dima , Takahashi, Yuta , Tamburini, Fabio , Tan, Mary Ann C. , Tanaka, Takaaki , Tanaya, Dipta , Tavoni, Mirko , Tella, Samson , Tellier, Isabelle , Testori, Marinella , Thomas, Guillaume , Tonelli, Sara , Torga, Liisi , Toska, Marsida , Trosterud, Trond , Trukhina, Anna , Tsarfaty, Reut , Türk, Utku , Tyers, Francis , Þórðarson, Sveinbjörn , Þorsteinsson, Vilhjálmur , Uematsu, Sumire , Untilov, Roman , Urešová, Zdeňka , Uria, Larraitz , Uszkoreit, Hans , Utka, Andrius , Vagnoni, Elena , Vajjala, Sowmya , van der Goot, Rob , Vanhove, Martine , van Niekerk, Daniel , van Noord, Gertjan , Varga, Viktor , Vedenina, Uliana , Venturi, Giulia , Villemonte de la Clergerie, Eric , Vincze, Veronika , Vlasova, Natalia , Wakasa, Aya , Wallenberg, Joel C. , Wallin, Lars , Walsh, Abigail , Wang, Jing Xian , Washington, Jonathan North , Wendt, Maximilan , Widmer, Paul , Wigderson, Shira , Wijono, Sri Hartati , Wille, Vanessa Berwanger , Williams, Seyi , Wirén, Mats , Wittern, Christian , Woldemariam, Tsegay , Wong, Tak-sum , Wróblewska, Alina , Yako, Mary , Yamashita, Kayo , Yamazaki, Naoki , Yan, Chunxiao , Yasuoka, Koichi , Yavrumyan, Marat M. , Yenice, Arife Betül , Yıldız, Olcay Taner , Yu, Zhuoran , Yuliawati, Arlisa , Žabokrtský, Zdeněk , Zahra, Shorouq , Zeldes, Amir , Zhou, He , Zhu, Hanzhi , Zhuravleva, Anna , and Ziane, Rayan
Publisher:
Universal Dependencies Consortium
Type:
text and corpus
Subject:
treebank , dependency , syntax , morphology , harmonized annotation , interset , universal tagset , and stanford dependencies
Language:
Ancient Greek (to 1453) , Arabic , Basque , Bulgarian , Croatian , Czech , Danish , Dutch , English , Estonian , Finnish , French , German , Gothic , Modern Greek (1453-) , Hebrew , Hindi , Hungarian , Indonesian , Irish , Italian , Japanese , Latin , Norwegian , Church Slavic , Persian , Polish , Portuguese , Romanian , Slovenian , Spanish , Swedish , Tamil , Catalan , Chinese , Galician , Kazakh , Latvian , Russian , Turkish , Coptic , Sanskrit , Slovak , Ukrainian , Uighur , Vietnamese , Belarusian , Korean , Lithuanian , Urdu , Russia Buriat , Northern Kurdish , Northern Sami , Upper Sorbian , Afrikaans , Yue Chinese , Marathi , Serbian , Swedish Sign Language , Telugu , Amharic , Armenian , Breton , Faroese , Komi-Zyrian , Nigerian Pidgin , Old French (842-ca. 1400) , Tagalog , Thai , Warlpiri , Yoruba , Akkadian , Bambara , Erzya , Maltese , Welsh , Wolof , Assyrian Neo-Aramaic , Literary Chinese , Old Russian , Karelian , Mbyá Guaraní , Bhojpuri , Komi-Permyak , Livvi , Moksha , Scottish Gaelic , Skolt Sami , Swiss German , Albanian , Icelandic , Akuntsu , Apurinã , Chukot , Khunsari , Manx , Mundurukú , Nayini , Old Turkish , Soi , South Levantine Arabic , Tupinambá , Beja , Western Frisian , Guajajára , Urubú-Kaapor , Kangri , K'iche' , Low German , Makuráp , Central Siberian Yupik , Western Armenian , Bengali , Javanese , Karo (Brazil) , Ligurian , Neapolitan , Tatar , Xibe , Yakut , Ancient Hebrew , Cebuano , Guarani , Hittite , Madi , Emerillon , Umbrian , Abaza , Gheg Albanian , Malayalam , Nhengatu , Sinhala , Zacatlán-Ahuacatlán-Tepetzintla Nahuatl , Xavánte , and Saya
Description:
Universal Dependencies is a project that seeks to develop cross-linguistically consistent treebank annotation for many languages, with the goal of facilitating multilingual parser development, cross-lingual learning, and parsing research from a language typology perspective. The annotation scheme is based on (universal) Stanford dependencies (de Marneffe et al., 2006, 2008, 2014), Google universal part-of-speech tags (Petrov et al., 2012), and the Interset interlingua for morphosyntactic tagsets (Zeman, 2008).
Rights:
Licence Universal Dependencies v2.11 , https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.11 , and PUB