Harvested from: LINDAT/CLARIAH-CZ repository - LINDAT/CLARIAH-CZ Catalog Search Results

631. Estonian Text-to-Speech Synthesiser for the Blind

Publisher:: Laboratory of Phonetics and Speech Technology, Tallinn University of Technology
Type:: toolService
Rights:: Not specified

632. Estonian-English parallel corpus

Type:: corpus
Language:: English and Estonian
Description:: Written EU legislation; 5 million words Estonian, 7.8 million words English; sentence-aligned
Rights:: Not specified

633. Estonian-Latvian dictionary

Publisher:: Tilde
Format:: application/octet-stream
Type:: lexicalConceptualResource
Language:: Estonian and Latvian
Description:: The Estonian-Latvian dictionary is based on the dictionary of K. Aben and supplemented with new lexical entries from the modern lexicon; ca. 26,000 lexical entries
Rights:: Not specified

634. Etalon 1.0

Creator:: Skoumalová, Hana
Publisher:: Charles University, Faculty of Arts, Institute of Theoretical and Computational Linguistics
Type:: text and corpus
Subject:: annotated corpus and morphological annotation
Language:: Czech
Description:: Etalon is a manually annotated corpus of contemporary Czech. The corpus contains 1,885,589 words (2,265,722 tokens) and is annotated in the same way as SYN2020 of the Czech National Corpus. The corpus includes fiction (ca. 24%), professional and scientific literature (ca. 40%) and newspapers (ca. 36%). The corpus is provided in a vertical format, where sentence boundaries are marked with a blank line. Every word form is written on a separate line, followed by five tab-separated attributes: syntactic word, lemma, sublemma, tag and verbtag. The texts are shuffled in random chunks of at most 100 words (respecting sentence boundaries).
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
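
The vertical format described above can be read with a few lines of Python. This is a minimal sketch, not the corpus's official tooling: it assumes each token line holds exactly six tab-separated columns (the word form followed by the five attributes in the order listed), and the names `Token` and `read_sentences` and the sample values are hypothetical placeholders, not actual corpus data.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Token(NamedTuple):
    """One line of the vertical file: word form plus five attributes (assumed order)."""
    form: str
    syntactic_word: str
    lemma: str
    sublemma: str
    tag: str
    verbtag: str


def read_sentences(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield sentences as lists of tokens; a blank line ends a sentence."""
    sentence: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Blank line: sentence boundary. Skip repeated blank lines.
            if sentence:
                yield sentence
                sentence = []
            continue
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(f"expected 6 tab-separated columns, got {len(fields)}: {line!r}")
        sentence.append(Token(*fields))
    if sentence:
        # The last sentence may not be followed by a blank line.
        yield sentence


# Hypothetical placeholder input: two sentences with two and one token(s).
SAMPLE = (
    "w1\ts1\tl1\tsl1\tT1\tV1\n"
    "w2\ts2\tl2\tsl2\tT2\tV2\n"
    "\n"
    "w3\ts3\tl3\tsl3\tT3\tV3\n"
)
sentences = list(read_sentences(SAMPLE.splitlines()))
```

For a real file, the same function can be applied to an open file handle, for example `read_sentences(open(path, encoding="utf-8"))`, where `path` points at the downloaded corpus file; the encoding is an assumption.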

635. Etymological Reference Database

Publisher:: The Research Institute for the Languages of Finland
Type:: toolService
Language:: Finnish
Rights:: Not specified

636. Europarl QTLeap WSD/NED corpus

Creator:: Agirre, Eneko; Branco, António; Popel, Martin; and Simov, Kiril
Publisher:: University of the Basque Country, UPV/EHU; Faculty of Science, University of Lisbon, FCUL; Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL); and Bulgarian Academy of Sciences, IICT-BAS
Type:: text and corpus
Subject:: annotated corpus and multilingual
Language:: Basque, Bulgarian, Czech, English, Portuguese, and Spanish
Description:: This corpus is part of Deliverable 5.5 of the European Commission project QTLeap FP7-ICT-2013.4.1-610516 (http://qtleap.eu). The texts are sentences from the Europarl parallel corpus (Koehn, 2005). We selected the monolingual sentences from the parallel corpora for the following pairs: Bulgarian-English, Czech-English, Portuguese-English and Spanish-English. The English corpus consists of the English side of the Spanish-English corpus. Because Basque is not in Europarl, the corpus additionally contains the Basque and English sides of the GNOME corpus. The texts have been automatically annotated with NLP tools, including Word Sense Disambiguation, Named Entity Disambiguation and Coreference resolution. Please check deliverable D5.6 at http://qtleap.eu/deliverables for more information.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

637. Europarl: European Parliament Proceedings Parallel Corpus 1996-2003

Type:: corpus
Language:: Portuguese
Description:: Parallel corpus
Rights:: Not specified

638. Eurotermbank

Publisher:: Tilde and Eurotermbank consortium
Format:: application/octet-stream
Type:: lexicalConceptualResource
Language:: English, Estonian, French, German, Hungarian, Latvian, and Lithuanian
Description:: EuroTermBank is a single access point to European multilingual terminology resources. It contains more than 1.9 million terms in over 25 languages.
Rights:: Not specified

639. EUSTACE : Edinburgh University speech timing archive and corpus of English

Publisher:: Centre for Speech Technology Research, University of Edinburgh
Type:: corpus
Language:: English
Description:: Speech corpus comprising 4608 spoken sentences recorded for speech timing research. The complete archive, available for downloading, includes a structured list of the sentences, the speech recordings and the label files, plus full documentation.
Rights:: Not specified

640. EvaLatin 2020 models for UDPipe 2 (2020-08-31)

Creator:: Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: POS tagger, lemmatization, and tagger
Language:: Latin
Description:: POS tagger and lemmatizer models for EvaLatin2020 data (https://github.com/CIRCSE/LT4HALA). The model documentation, including performance, can be found at https://ufal.mff.cuni.cz/udpipe/2/models#evalatin20_models . To use these models, you need UDPipe version 2.0 or later, which you can download from https://ufal.mff.cuni.cz/udpipe/2 .
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
