Written German from 1920-39. 500,000 tokens, 392 texts. POS and lemma, TEI XML. Part of Das digitale Wörterbuch der deutschen Sprache des 20. Jahrhunderts.
Parallel corpus, 3,297,283 words.
The idea was to create a small parallel corpus that would make it possible to work with entire texts in translation analysis rather than short extracts. At the same time, it aimed at acquiring experience that could be used in creating a larger parallel corpus of English and Czech in the future.
Although the main part of the work has been completed -- and the aims of the KACENKA grant met -- we keep improving and enlarging KACENKA gradually. It currently contains 3,297,283 words, of which 1,689,513 were acquired by scanning.
Most of the English texts for KACENKA have been retrieved from Internet resources. The rest -- and nearly all the Czech texts -- had to be scanned using an OCR programme.
KACENKA is stored on a single CD-ROM; its use is limited by copyright restrictions.
3,300 texts written by pupils for the final examination in Norwegian in 1998, 1999, 2000 and 2001. The database also includes the associated grades and other background material.
Diachronic corpus with a focus on the annotation and lemmatization of verbal categories.
Philosophical texts of the 18th century: Full text of the authoritative "Akademie-Ausgabe" (excluding most footnotes and editorial notes) and reference texts like A.G. Baumgarten's "Metaphysica".
KinOath Kinship Archiver is a kinship application whose primary goal is to connect kinship data with archived data, such as audio, video or written resources, while also being closely integrated with archive software such as Arbil. Beyond this primary goal, it is designed to be flexible and culturally nonspecific, so that culturally different social structures can be represented equally. Kin type strings are used throughout the application for constructing and searching data sets. The representation of kin terms is also integrated into the application, allowing comparative diagrams of kin terms. Graphical representation of the data is an important part of the application, and the diagrams produced are intended to be very flexible and of publishable quality.