bi-directional parallel corpus based on an open-ended collection of Portuguese-English and English-Portuguese source-texts and translations. Searchable via the IMS Corpus Query Processor and the DISPARA interface
Parallel treebanks with annotation of syntax, discourse, coreference, morphology, and semantics. Version 3 also includes the Danish Dependency Treebank (version 1) and the Danish-English Parallel Dependency Treebank (version 2).
Oral corpus containing 166 narratives in English elicited by means of Labovian techniques. Participants from the UK (England, Wales, Scotland), Ireland, USA, Australia and South Africa.
domain specific corpus (Law, Economy, Computing, Medicine and Environment as well as a contrastive corpus from the press); EN 3.3 M tokens, SP 33 M tokens, CAT 19 M tokens; EAGLEs pos tagset