Search Results
Type:
corpus
Language:
Arabic, Danish, Dutch, English, German, Modern Greek (1453-), Italian, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish
Description:
Large set of subtitles available for download in multiple languages. Can be used as a parallel corpus.
Rights:
Not specified
Publisher:
Center for Sprogteknologi, University of Copenhagen
Type:
toolService
Language:
Danish, Dutch, English, German, Modern Greek (1453-), Icelandic, Norwegian, Russian, Slovenian, and Swedish
Description:
1) Fully automatic rule-based lemmatization of inflected languages; 2) fully automatic training of lemmatization rules based on a full form-lemma list
Rights:
Not specified
Type:
lexicalConceptualResource
Language:
Danish
Rights:
Not specified
Type:
lexicalConceptualResource
Language:
Danish
Description:
80,000 entries; flat, tab-separated file
Rights:
Not specified
Publisher:
Joint Research Centre of the EU
Type:
corpus
Language:
Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Hungarian, Italian, Latvian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, and Swedish
Description:
The largest parallel corpus; contains EU law (the Acquis Communautaire) in 22 languages.
Rights:
Not specified
Type:
corpus
Language:
Danish
Description:
written, general language
Rights:
Not specified
Type:
corpus
Language:
Danish
Description:
written, general language; 22 million tokens
Rights:
Not specified
Type:
corpus
Language:
Danish
Description:
written, general language; POS-tagged, manually checked; 250,000 tokens
Rights:
Not specified
Type:
corpus
Language:
Danish, Dutch, English, Finnish, French, German, Italian, Latin, Portuguese, Russian, Spanish, Swedish, and Telugu
Description:
Free electronic books available for download or online browsing; German-language literature makes up only part of the available e-books.
Rights:
Not specified
Type:
corpus
Language:
Danish, Dutch, English, Finnish, French, German, Modern Greek (1453-), Italian, and Spanish
Description:
Nine speech databases for training and testing multilingual speech recognition applications in the car environment. Contains parallel 4-channel in-car recordings and a GSM channel, with phonetically rich material. All recordings are orthographically transcribed. Speaker information covers gender, age, and accent. A pronunciation lexicon is included.
Rights:
Not specified