Original context has metadata only: true / Type: corpus - LINDAT/CLARIAH-CZ Catalog Search Results

271. SENIE

Publisher:: Department of Baltic Languages, University of Latvia and Institute of Mathematics and Computer Science, University of Latvia
Format:: application/octet-stream
Type:: corpus
Subject:: diachronic corpus
Language:: Latvian
Description:: Diachronic Corpus of Early Written Latvian Texts (16th-18th c.). More than 1 million running words (work is ongoing). The main data are ecclesiastical texts, secular texts (laws, fiction), and some of the first bilingual (Latvian-German) dictionaries. A KWIC-based concordancer, an inverse vocabulary, frequency lists, and word lists are provided. Some source facsimiles are available.
Rights:: Not specified

272. SenTube

Publisher:: Machine Learning and NLP group at Trento
Type:: corpus
Subject:: sentiment analysis
Language:: English and Italian
Description:: Sentiment analysis of YouTube videos with joint models of text and speech
Rights:: Not specified

273. Shallow syntactically disambiguated corpus

Type:: corpus
Language:: Estonian
Description:: written general; 300 000 words; local tagset (POS, syntactic functions)
Rights:: Not specified

274. Slovene Dependency Treebank

Type:: corpus
Language:: Slovenian
Description:: 3,000 sentences, analytical structure (PDT)
Rights:: Not specified

275. Sophie Parallel Treebank

Type:: corpus
Language:: Estonian
Description:: 200 sentences, TIGER-XML
Rights:: Not specified

276. Speech, Thought and Writing Presentation Corpus

Publisher:: Lancaster University
Format:: text/plain
Type:: corpus
Language:: English
Description:: A corpus of approximately 260,000 words of modern British narrative texts representing three text types (fiction, newspapers, biography), with detailed annotation for all forms of speech, thought, and writing presentation that occur in the corpus. Available via OTA.
Rights:: Not specified

277. SpeechDat-Car databases

Type:: corpus
Language:: Danish, Dutch, English, Finnish, French, German, Modern Greek (1453-), Italian, and Spanish
Description:: 9 speech databases for training and testing multilingual speech recognition applications in the car environment. Contains parallel 4-channel in-car recordings and a GSM channel. Contains phonetically rich material. All recordings are orthographically transcribed. Speaker information is included for gender, age, and accent. A pronunciation lexicon is included.
Rights:: Not specified

278. SpeechDat-East databases

Type:: corpus
Subject:: These databases serve as an important resource for the performance of voice driven teleservice systems in practical implementations
Language:: Czech, Hungarian, Polish, Russian, and Slovak
Description:: 5 telephone databases recorded over the PSTN. Contains phonetically rich material. All recordings are orthographically transcribed. Speaker information is included for gender, age, and accent. A pronunciation lexicon is included.
Rights:: Not specified

279. Speecon databases

Type:: corpus
Language:: Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Polish, Portuguese, Russian, Spanish, Swedish, Turkish, Chinese, Hebrew, Japanese, Korean, and Thai
Description:: 28 speech databases containing broadband recordings from 550 adults and 50 children per language. Contains phonetically rich material. All recordings are orthographically transcribed. Speaker information is included for gender, age, and accent. A pronunciation lexicon is included.
Rights:: Not specified

280. Språkbanken (Swedish Language Bank)

Type:: corpus
Language:: Faroese, Icelandic, Spanish, and Swedish
Description:: Mainly written Swedish corpora (all time periods except Runic Swedish; various genres, including learner corpora) and lexicons; some non-Swedish corpora (Faroese, Old Icelandic, Latin, Spanish); Swedish corpora (approx. 200 MW); Swedish lexicons (approx. 220,000 entries total); non-Swedish corpora (approx. 15 MW)
Rights:: Not specified


