Rights: Not specified - LINDAT/CLARIAH-CZ Catalog Search Results

441. The Corpus for Bokmål Lexicography LBK

Publisher:: Department of Linguistics and Nordic Studies, University of Oslo
Type:: corpus
Description:: 100 million words from newspapers, novels, magazines, etc. LBK is a representative, weighted corpus made for lexicographic purposes.
Rights:: Not specified

442. The Dialect Archive

Type:: corpus
Language:: Lithuanian
Description:: The audio collection and the written texts. It currently contains approximately 2000 hours of digitised and more than 2000 not yet digitised audio recordings; 400,000 cards with information on dialectal words, morphology, syntax, etc.; transcripts and notes.
Rights:: Not specified

443. The Diorisis Ancient Greek Corpus

Creator:: Vatri, Alessandro and McGillivray, Barbara
Publisher:: Figshare
Type:: text and corpus
Subject:: annotated corpus, ancient world, lemmatization, and part of speech
Language:: Ancient Greek (to 1453)
Description:: An annotated corpus of literary Ancient Greek sourced from the Perseus Canonical Greek Lit repository (https://github.com/PerseusDL/canonical-greekLit), “The Little Sailing” digital library (http://www.mikrosapoplous.gr/en/texts1en.html), and the Bibliotheca Augustana digital library (http://www.hs-augsburg.de/~harsch/augustana.html#gr). The corpus consists of 820 texts spanning between the beginnings of the AG literary tradition (Homer) and the fifth century AD, and it counts 10,206,421 words. In addition to referring to this resource, please use the following citation when citing the corpus: Vatri, A., & McGillivray, B. (2018). The Diorisis Ancient Greek Corpus, Research Data Journal for the Humanities and Social Sciences, 3(1), 55-65. doi: https://doi.org/10.1163/24523666-01000013
Rights:: Not specified

444. The Dutch Song Database

Publisher:: Meertens Institute KNAW The Netherlands
Type:: corpus
Language:: Dutch
Description:: The Dutch Song Database (Nederlandse Liederenbank in Dutch) contains more than 125,000 songs in the Dutch and Flemish language, from the Middle Ages through the twentieth century.
Rights:: Not specified

445. The Internet Language Reference Book

Creator:: Šmerk, Pavel, Pravdová, Markéta, Beneš, Martin, Černá, Anna, Hlaváčková, Dana, Chromý, Jan, Konečná, Hana, Kopecký, Jakub, Mžourková, Hana, Pala, Karel, Prokšová, Hana, Prošek, Martin, Smejkalová, Kamila, Svobodová, Ivana, and Uhlířová, Ludmila
Publisher:: Institute of Czech Language, Czech Academy of Sciences and Masaryk University, NLP Centre
Type:: toolService and service
Subject:: literature
Language:: Czech and English
Description:: The ILRB was created by two cooperating teams: the team of the Institute of Czech Language, Czech Academy of Sciences, and the team of the NLP Centre at the Faculty of Informatics, Masaryk University (2004-2008). The tool consists of two sections: a wordlist section and a reference (explanatory) section. Comments and remarks are welcome and should be sent to poradna@ujc.cas.cz. 1. Wordlist section: It contains more than 60 000 dictionary entries and is based on the glossary of the School Rules of Czech Orthography, the Dictionary of the Literary Czech, and selected entries from the New Dictionary of Words of Foreign Origin and the Dictionary of Neologisms. The entries typically include the information that users ask about most frequently. Inflectional forms of individual words are also offered in tables, generated by the morphological analyzer ajka created at the Faculty of Informatics, MU. The dictionary part is linked to the explanatory part through hypertext links. 2. Reference section: It comprises explanations of linguistic phenomena described in the Rules of Czech Orthography and contemporary Czech grammars, focusing on those that users of the Linguistic Advisory Line at the Institute of Czech Language ask about frequently and repeatedly. The explanations deal with typical spelling problems and include appropriate recommendations. The ILRB is regularly updated and completed; new expressions are added and entries are made more precise. and Academy of Sciences of the Czech Republic in project 1ET200610406 and Ministry of Education, Youth and Sports in projects LM2010013, LC536 and 2C06009.
Rights:: Not specified

446. The JRC-Acquis Multilingual Parallel Corpus

Type:: corpus
Language:: Portuguese
Description:: Law
Rights:: Not specified

447. The Karjalainen Corpus

Publisher:: University of Joensuu
Type:: corpus
Language:: Finnish
Description:: Computer corpus of Finnish newspaper texts from the 1990s (newspaper Karjalainen, Joensuu).
Rights:: Not specified

448. The National Certificates corpus

Publisher:: Centre for Applied Language Studies, University of Jyväskylä
Type:: corpus
Language:: English, Finnish, French, German, Italian, Russian, Spanish, and Swedish
Description:: The NC test results, background information, and speaking and writing performances in 9 foreign / second languages. A web-based database (HTML files).
Rights:: Not specified

449. The Norwegian Newspaper Corpus

Publisher:: Unifob AS
Type:: corpus
Language:: Norwegian
Description:: Dynamic, web-based newspaper corpus; 700 000 000 ws and growing; multitagged
Rights:: Not specified

450. The Swedish Parole corpus

Type:: corpus
Language:: Swedish
Description:: Mixed-genre (press, fiction, popular science, public information); approx. 19 million words; POS tags (in CWB format)
Rights:: Not specified
