Among other things, a collection of old herbals, old cookery books, and texts on the history of the language of the German press.
The ACL RD-TEC 2.0 has been developed with the aim of providing a benchmark for the evaluation of methods for terminology extraction and classification as well as entity recognition tasks based on specialised text from the computational linguistics domain. This release of the corpus consists of 300 abstracts from articles in the ACL Anthology Reference Corpus, published between 1978 and 2006. In these abstracts, terms (i.e., single or multi-word lexical units with a specialised meaning) are manually annotated. In addition to their boundaries in running text, annotated terms are classified into one of seven categories: method, tool, language resource (LR), LR product, model, measures and measurements, and other. To assess the quality of the annotations and to determine the difficulty of this task, more than 171 of the abstracts are annotated twice, independently, by each of the two annotators. In total, 6,818 terms are identified and annotated, resulting in a specialised vocabulary made of 3,318 lexical forms, mapped to 3,471 concepts.
The resource comprises the audio collection and the written texts. It currently contains approximately 2,000 hours of digitised and more than 2,000 non-digitised audio recordings; 400,000 cards with information on dialectal words, morphology, syntax, etc.; and transcripts and notes.
An annotated corpus of literary Ancient Greek sourced from the Perseus Canonical Greek Lit repository (https://github.com/PerseusDL/canonical-greekLit), “The Little Sailing” digital library (http://www.mikrosapoplous.gr/en/texts1en.html), and the Bibliotheca Augustana digital library (http://www.hs-augsburg.de/~harsch/augustana.html#gr).
The corpus consists of 820 texts spanning the period from the beginnings of the Ancient Greek literary tradition (Homer) to the fifth century AD, and contains 10,206,421 words.
In addition to referring to this resource, please use the following citation when citing the corpus:
Vatri, A., & McGillivray, B. (2018). The Diorisis Ancient Greek Corpus, Research Data Journal for the Humanities and Social Sciences, 3(1), 55-65. doi: https://doi.org/10.1163/24523666-01000013
The Dutch Song Database (Nederlandse Liederenbank in Dutch) contains more than 125,000 songs in Dutch and Flemish, from the Middle Ages through the twentieth century.
The database currently contains about 1 million items of dialectal linguistic evidence from the project "The Franconian Dictionary" (German: Das Fränkische Wörterbuch), each of which is lemmatized, annotated, and linked to the original questionnaire. The database is a work in progress, so more data will be made available regularly.
The Franconian Dictionary was initiated by the Munich office of the Bavarian Dictionary project, which sent out questionnaires for a dialect survey in Franconia. In the wake of this survey, an office was established in Erlangen in 1933 (see link below for more information).
Over the course of 90 years, thousands of volunteers helped to compile a considerable collection of vernacular examples of usage, drawn from the Bavarian districts of Upper, Middle and Lower Franconia. For the most part they represent the East Franconian dialect and, to a lesser extent, Rhine-Franconian, Swabian and North Bavarian vernaculars. Between 2007 and 2008 a small selection of the research results was published in three editions of one printed volume by Eberhard Wagner and Alfred Klepsch: “Handwörterbuch von Bayerisch-Franken” (see link below for more information).
Since 2012 the Franconian Dictionary, a project of the Bavarian Academy of Sciences and Humanities, has been entrusted to the Friedrich-Alexander-University in Erlangen and Nuremberg (FAU). The project is supervised by Prof. Dr. Mechthild Habermann, Chair of the Faculty of German Linguistics at the FAU.
For detailed information, please see http://www.wbf.badw.de/en/the-project.html and http://www.wbf.badw.de/en/wbf-digital.html