Rights: Not specified - LINDAT/CLARIAH-CZ Catalog Search Results

Filters: Rights: Not specified; Date: 2005 to 2006

1. Anglo-Saxon charters

Publisher:: King's College London
Format:: application/tei+xml
Type:: corpus
Language:: English
Description:: Charters written in Anglo-Saxon England before A.D. 900, marked up in TEI XML. Browsable online.
Rights:: Not specified

2. Berliner Wendekorpus

Publisher:: Berlin-Brandenburg Academy of Sciences and Humanities
Format:: application/tei+xml
Type:: corpus
Language:: German
Description:: Transcribed narrative interviews with people from East and West Berlin about the events of November 9, 1989. 282,000 tokens. TEI XML, annotated with lemma and POS. Normalized version also available.
Rights:: Not specified

3. British academic spoken English (BASE) corpus

Publisher:: Coventry University, University of Reading, University of Warwick
Format:: application/tei+xml
Type:: corpus
Language:: English
Description:: Transcribed recordings of 160 lectures and 39 seminars held in university departments. Four broad disciplinary groups, 1,644,942 tokens in total.
Rights:: Not specified

4. CAST corpus (Computer-Aided Summarisation Tool)

Publisher:: Research Group in Computational Linguistics, University of Wolverhampton
Type:: corpus
Language:: English
Description:: Sentences annotated for important units of text for summarisation. 145,473 words / 6,584 sentences.
Rights:: Not specified

5. Collection of Latvian proverbs

Publisher:: Archives of Latvian Folklore, Institute of Literature, Folklore and Art, University of Latvia and Institute of Mathematics and Computer Science, University of Latvia
Type:: corpus
Language:: Latvian
Description:: Latvian proverbs collected by the Archives of Latvian Folklore (~20,000 items).
Rights:: Not specified

6. English-Luganda Parallel Corpus

Publisher:: Center for Dutch Language and Speech, University of Antwerp
Type:: corpus
Language:: English
Description:: Bible. Word-aligned corpus.
Rights:: Not specified

7. EuroTermBank

Publisher:: Tilde and Eurotermbank consortium
Format:: application/octet-stream
Type:: lexicalConceptualResource
Language:: English, Estonian, French, German, Hungarian, Latvian, and Lithuanian
Description:: EuroTermBank is a single access point to European multilingual terminology resources. It contains more than 1.9 million terms over 25 languages.
Rights:: Not specified

8. Hungarian Historical Corpus

Publisher:: Academy of Sciences
Format:: application/xml
Type:: corpus
Language:: Hungarian
Description:: Containing 27 million running words, the Hungarian Historical Corpus provides a valuable basis for research on the history of Hungarian words between the second half of the 18th century and 2000.
Rights:: Not specified

9. Hungarian National Corpus

Publisher:: Academy of Sciences
Format:: application/xml
Type:: corpus
Subject:: synchronic corpus
Language:: Hungarian
Description:: Written general synchronic reference corpus; 190 million tokens; POS-annotated XML.
Rights:: Not specified

10. Latvian-Lithuanian Web dictionary

Publisher:: Tilde
Format:: application/octet-stream
Type:: lexicalConceptualResource
Language:: Latvian and Lithuanian
Description:: The dictionary is based on the Latvian-Lithuanian dictionary by A. Butkus (~43,000 entries).
Rights:: Not specified