The Corpus NGT is a collection of data from deaf signers using Sign Language of the Netherlands (NGT). The data consist of recordings with multiple synchronised video cameras, accompanied by gloss and translation annotations.
The Prague family of annotated corpora has a new member, the Czech Academic Corpus version 2.0 (CAC 2.0). CAC 2.0 consists of 650,000 words from various 1970s and 1980s newspapers, magazines and radio and television broadcast transcripts manually annotated for morphology and syntax.
Database of three inter-related early Irish glossaries. The texts, compiled from the eighth century, comprise several thousand headwords followed by entries that can range from single word explanations to whole narratives running to several pages.
Seven French L2 corpora. Digital sound files and related transcripts formatted using CHILDES software. The database currently contains over 4000 files (sound files, transcripts and morphosyntactically tagged transcripts). .