Search Results
Type:
corpus
Language:
Arabic, Danish, Dutch, English, German, Modern Greek (1453-), Italian, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish
Description:
Large set of subtitles available for download in multiple languages. Can be used as a parallel corpus.
Rights:
Not specified
Publisher:
Center for Sprogteknologi, University of Copenhagen
Type:
toolService
Language:
Danish, Dutch, English, German, Modern Greek (1453-), Icelandic, Norwegian, Russian, Slovenian, and Swedish
Description:
1) Fully automatic rule-based lemmatization of inflected languages; 2) fully automatic training of lemmatization rules from a list of full forms and their lemmas
Rights:
Not specified
Type:
corpus
Language:
Modern Greek (1453-)
Description:
70K words. Non-validated sentence segmentation; non-validated POS tagging; manual annotation of syntactic dependencies and dependency labels; manual annotation of semantic roles; manual annotation of events based on a shallow domain-specific ontology (only for a 31K-word subset of GDT)
Rights:
Not specified
Publisher:
Institute for Language and Speech Processing
Format:
application/octet-stream
Type:
corpus
Language:
Modern Greek (1453-)
Description:
General language corpus of standard Modern Greek; 47 million words
Rights:
Not specified
Type:
lexicalConceptualResource
Language:
Bulgarian, English, Modern Greek (1453-), Serbian, and Slovenian
Description:
17357 terms, XML
Rights:
Not specified
Type:
corpus
Language:
English, French, and Modern Greek (1453-)
Description:
Multilingual (EN, EL, FR); multimodal (Video, Text); parallel (EN, EL, FR subtitles); comparable (transcripts, subtitles); 120 hours
Rights:
Not specified
Publisher:
Center for Reading Research, Ghent University
Type:
lexicalConceptualResource
Language:
Chinese, Dutch, English, German, Modern Greek (1453-), and Spanish
Rights:
Not specified
Publisher:
University of Stuttgart
Type:
toolService
Subject:
POS tagger
Language:
Bulgarian, Dutch, English, French, German, Modern Greek (1453-), Italian, Portuguese, Russian, Spanish, and Swahili (macrolanguage)
Description:
A part-of-speech tagger and lemmatizer for several languages.
Rights:
Not specified