Harvested from: LINDAT/CLARIAH-CZ repository - LINDAT/CLARIAH-CZ Catalog Search Results

361. Estonian Frequency Dictionary

Publisher:: University of Tartu
Format:: text/plain
Type:: lexicalConceptualResource
Language:: Estonian
Description:: 10000 most frequent lemmas, 1000 most frequent word forms, based on 1 million words of journals and fiction
Rights:: Not specified

362. Estonian Reference Corpus

Publisher:: University of Tartu
Format:: application/tei+xml
Type:: corpus
Language:: Estonian
Description:: Collection of Estonian texts (divided into subcorpora); ca 175 million words; TEI
Rights:: Not specified

363. Estonian Text-to-Speech Synthesiser for the Blind

Publisher:: Laboratory of Phonetics and Speech Technology, Tallinn University of Technology
Type:: toolService
Rights:: Not specified

364. Estonian-English parallel corpus

Type:: corpus
Language:: English and Estonian
Description:: Written EU legislation; 5 million words Estonian, 7.8 million words English; sentence-aligned
Rights:: Not specified

365. Etalon 1.0

Creator:: Skoumalová, Hana
Publisher:: Charles University, Faculty of Arts, Institute of Theoretical and Computational Linguistics
Type:: text and corpus
Subject:: annotated corpus and morphological annotation
Language:: Czech
Description:: Etalon is a manually annotated corpus of contemporary Czech. The corpus contains 1,885,589 words (2,265,722 tokens) and is annotated in the same way as SYN2020 of the Czech National Corpus. The corpus includes fiction (ca 24%), professional and scientific literature (ca 40%) and newspapers (ca 36%). The corpus is provided in a vertical format, where sentence boundaries are marked with a blank line. Every word form is written on a separate line, followed by five tab-separated attributes: syntactic word, lemma, sublemma, tag and verbtag. The texts are shuffled in random chunks of at most 100 words (respecting sentence boundaries).
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
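
The vertical format described for Etalon 1.0 (one word form per line, five tab-separated attributes, blank line between sentences) could be read with a sketch like the following. It assumes exactly six tab-separated columns per token line in the order given in the description. The names `Token`, `read_sentences` and `SAMPLE` are hypothetical, and the sample values are placeholders, not real corpus data or real tags.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Token(NamedTuple):
    """One token line: the word form plus the five attributes from the description."""
    form: str
    syntactic_word: str
    lemma: str
    sublemma: str
    tag: str
    verbtag: str


def read_sentences(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield sentences (lists of tokens) from vertical-format lines.

    Assumption: every non-blank line has exactly six tab-separated fields.
    """
    sentence: List[Token] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line marks a sentence boundary.
            if sentence:
                yield sentence
                sentence = []
            continue
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(f"expected 6 tab-separated fields, got {len(fields)}: {line!r}")
        sentence.append(Token(*fields))
    if sentence:
        # Last sentence when the input does not end with a blank line.
        yield sentence


# Placeholder data only; real files contain Czech word forms and corpus tags.
SAMPLE = (
    "form1\tsword1\tlemma1\tsublemma1\ttag1\tverbtag1\n"
    "form2\tsword2\tlemma2\tsublemma2\ttag2\tverbtag2\n"
    "\n"
    "form3\tsword3\tlemma3\tsublemma3\ttag3\tverbtag3\n"
)

sentences = list(read_sentences(SAMPLE.splitlines()))
```

For a real file, the same function can be fed an open file handle, for example `read_sentences(open(path, encoding="utf-8"))`, where `path` points to the downloaded corpus file.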

366. Etymological Reference Database

Publisher:: The Research Institute for the Languages of Finland
Type:: toolService
Language:: Finnish
Rights:: Not specified

367. Europarl: European Parliament Proceedings Parallel Corpus 1996-2003

Type:: corpus
Language:: Portuguese
Description:: Parallel corpus
Rights:: Not specified

368. EUSTACE : Edinburgh University speech timing archive and corpus of English

Publisher:: Centre for Speech Technology Research, University of Edinburgh
Type:: corpus
Language:: English
Description:: Speech corpus comprising 4608 spoken sentences recorded for speech timing research. The complete archive, available for downloading, includes a structured list of the sentences, the speech recordings and the label files, plus full documentation.
Rights:: Not specified

369. EvaLatin 2020 models for UDPipe 2 (2020-08-31)

Creator:: Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: POS tagger, lemmatization, and tagger
Language:: Latin
Description:: POS Tagger and Lemmatizer models for EvaLatin2020 data (https://github.com/CIRCSE/LT4HALA). The model documentation including performance can be found at https://ufal.mff.cuni.cz/udpipe/2/models#evalatin20_models . To use these models, you need UDPipe version at least 2.0, which you can download from https://ufal.mff.cuni.cz/udpipe/2 .
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

370. EVALD 1.0

Creator:: Rysová, Kateřina, Mírovský, Jiří, Novák, Michal, and Rysová, Magdaléna
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: text coherence, discourse, automatic evaluation, and native speakers
Language:: Czech
Description:: EVALD 1.0 serves for automatic evaluation of surface coherence (cohesion) in Czech texts written by native speakers of Czech.
Rights:: BSD 2-Clause "Simplified" or "FreeBSD" license, http://opensource.org/licenses/BSD-2-Clause, and PUB
