Harvested from: LINDAT/CLARIAH-CZ repository / Original context has metadata only: false / Type: text

31. Annotated Corpus of Czech Case Law for Segmentation Tasks

Creator:: Harašta, Jakub, Šavelka, Jaromír, Kasl, František, and Míšek, Jakub
Publisher:: Masaryk University, Brno
Type:: text and corpus
Subject:: document segmentation and legal texts
Language:: Czech
Description:: Annotated corpus of 350 decisions of Czech top-tier courts (Supreme Court, Supreme Administrative Court, Constitutional Court). 280 decisions were annotated by one trained annotator and then manually adjudicated by one trained curator. 70 decisions were annotated by two trained annotators and then manually adjudicated by one trained curator. Adjudication was conducted destructively, so the dataset contains only the correct annotations and does not contain all original annotations. The corpus was developed as training and testing material for text segmentation tasks. The dataset contains decisions segmented into Header, Procedural History, Submission/Rejoinder, Court Argumentation, Footer, Footnotes, and Dissenting Opinion. Segmentation makes it possible to treat different parts of a text differently even if they contain similar linguistic or other features.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

32. Annotation of Dramatic Situations in Theater Play Scripts

Creator:: Mareček, David, Nováková, Marie, Vosecká, Klára, Doležal, Josef, and Rosa, Rudolf
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) and The Academy of Performing Arts in Prague, Theatre Faculty (DAMU)
Type:: text and corpus
Subject:: theatre, play script, and dramatic situation
Language:: Czech
Description:: We defined 58 dramatic situations and annotated them in 19 play scripts. Then we selected only 5 well-recognized dramatic situations and annotated a further 33 play scripts. In this version of the data, we release only the 9 play scripts that can be freely distributed. One play is annotated independently by three annotators.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

33. Annotation of Dramatic Situations in Theater Play Scripts (2023)

Creator:: Mareček, David, Nováková, Marie, Vosecká, Klára, Doležal, Josef, and Rosa, Rudolf
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) and The Academy of Performing Arts in Prague, Theatre Faculty (DAMU)
Type:: text and corpus
Subject:: theatre, play script, and dramatic situation
Language:: Czech
Description:: We defined 58 dramatic situations and annotated them in 19 play scripts. Then we selected only 5 well-recognized dramatic situations and annotated a further 33 play scripts. In the previous (first) version, we released 9 play scripts that could be freely distributed. In this (second) version of the data, we are adding another 10 plays for which we have obtained licenses from the authors. In total, 19 play scripts are available, and one of them is annotated three times, independently by three annotators.
Rights:: THEAITRE AI research only license, https://lindat.mff.cuni.cz/repository/xmlui/page/theaitre-license, and ACA

34. APE Shared Task WMT17: Human Post-edits Test Data DE-EN

Creator:: Turchi, Marco, Chatterjee, Rajen, and Negri, Matteo
Publisher:: Fondazione Bruno Kessler, Trento, Italy
Type:: text and corpus
Subject:: Human post-edits, machine translation, shared task, automatic post-editing, and post-editing
Language:: English
Description:: Human post-edited test sentences for the WMT 2017 Automatic post-editing task. This consists of 2,000 English sentences belonging to the IT domain and already tokenized. Source and target segments can be downloaded from: https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-2132. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Rights:: AGREEMENT ON THE USE OF DATA IN QT21 APE Task, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-TAUS_QT21, and PUB

35. APE Shared Task WMT17: Human Post-edits Test Data EN-DE

Creator:: Turchi, Marco, Chatterjee, Rajen, and Negri, Matteo
Publisher:: Fondazione Bruno Kessler, Trento, Italy
Type:: text and corpus
Subject:: machine translation, human post-edits, shared task, automatic post-editing, and post-editing
Language:: German
Description:: Human post-edited test sentences for the WMT 2017 Automatic post-editing task. This consists of 2,000 German sentences belonging to the IT domain and already tokenized. Source and target segments can be downloaded from: https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-2133. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Rights:: AGREEMENT ON THE USE OF DATA IN QT21 APE Task, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-TAUS_QT21, and PUB

36. APE Shared Task WMT18: Human Post-edits and References Test Data EN-DE PBSMT

Creator:: Turchi, Marco, Negri, Matteo, and Chatterjee, Rajen
Publisher:: Fondazione Bruno Kessler, Trento, Italy
Type:: text and corpus
Subject:: automatic post-editing, post-editing, phrase-based MT, and reference translation
Language:: German
Description:: Human post-edited and reference test sentences for the En-De PBSMT WMT 2018 Automatic post-editing task. This consists of 2,000 German sentences for each file belonging to the IT domain and already tokenized. All data is provided by the EU project QT21 (http://www.qt21.eu/).
Rights:: AGREEMENT ON THE USE OF DATA IN QT21 APE Task, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-TAUS_QT21, and PUB

37. Arabic characters lexicon

Creator:: Namly, Driss
Publisher:: Ibtikarat team
Type:: text, lexicon, and lexicalConceptualResource
Subject:: alphabets
Language:: Arabic
Description:: An XML-based file containing all Arabic characters (letters, vowels, and punctuation marks). Each character is described with a description, its different display forms (isolated, and at the beginning, middle, and end of a word), a codification (Unicode; others could be added later), and two transliterations (Buckwalter and wiki).
Rights:: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB

38. Arabic Enclitics Lexicon

Creator:: Loukili, Taoufik
Publisher:: Ibtikarat team
Type:: text, lexicon, and lexicalConceptualResource
Subject:: Enclitics
Language:: Arabic
Description:: An XML-based file containing all Arabic enclitics
Rights:: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB

39. Arabic Morphological evaluation corpus

Creator:: Jaafar, Younes
Publisher:: Ibtikarat team
Type:: text, wordList, and lexicalConceptualResource
Subject:: morphological analysis and benchmarking corpus
Language:: Arabic
Description:: An annotated corpus dedicated to the benchmarking and evaluation of Arabic morphological analyzers. It consists of 100 words with all their possible analyses. The corpus contains several kinds of morphological information, such as stem, pattern, root, lemma, etc.
Rights:: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB

40. Arabic Particles Lexicon

Creator:: Namly, Driss
Publisher:: Ibtikarat team
Type:: text, lexicon, and lexicalConceptualResource
Subject:: particles
Language:: Arabic
Description:: An XML-based file containing Arabic particles
Rights:: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB
