Dictionary of sayings, idioms, idiomatic expressions, and fixed word combinations; search results are displayed along four dimensions: idiom, explanation, examples, and additional notes
The C4 corpus is a joint effort of the project Digitales Wörterbuch der deutschen Sprache (DWDS), the Austrian Academy Corpus (AAC), the Korpus Südtirol, and the Schweizer Textkorpus (CHTK). It is composed of corpora contributed by all four partner institutions.
1) Finds repeated sequences of words in documents (repetitiveness checker). 2) Finds sequences of words common to several documents (version comparison). A sequence consists of at least two words. There is no upper limit on the number of words in a sequence, but sequences do not cross sentence boundaries. Several weight functions are available, each defining "good" sequences in a different way, based on word frequency, sequence length, and number of repetitions.
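The two modes described above can be illustrated with a minimal Python sketch. It is not the tool's actual implementation: the function names (`sequence_counts`, `repeated_sequences`, `common_sequences`, `length_times_repeats`), the naive sentence splitter, and the example weight function are all assumptions made for illustration. The sketch only demonstrates the stated constraints: sequences have at least two words and never cross a sentence boundary.

```python
import re
from collections import Counter


def sequence_counts(text, min_len=2):
    """Count every word sequence of at least min_len words, sentence by sentence."""
    counts = Counter()
    # Naive sentence splitting on ., ! and ? (an assumption; the real tool's
    # sentence delimiters are not specified in the description).
    for sentence in re.split(r"[.!?]+", text):
        words = re.findall(r"\w+", sentence.lower())
        # Sequences are built within one sentence only, so none crosses a delimiter.
        for n in range(min_len, len(words) + 1):
            for i in range(len(words) - n + 1):
                counts[tuple(words[i:i + n])] += 1
    return counts


def repeated_sequences(text, min_len=2):
    """Mode 1 (repetitiveness checker): sequences occurring more than once."""
    return {seq: c for seq, c in sequence_counts(text, min_len).items() if c > 1}


def common_sequences(text_a, text_b, min_len=2):
    """Mode 2 (version comparison): sequences occurring in both documents."""
    return set(sequence_counts(text_a, min_len)) & set(sequence_counts(text_b, min_len))


def length_times_repeats(seq, count):
    """Hypothetical weight function: favors long sequences that repeat often."""
    return len(seq) * (count - 1)


if __name__ == "__main__":
    doc = "The cat sat on the mat. The cat ran away. A dog sat on the mat."
    repeated = repeated_sequences(doc)
    # Rank repeated sequences by the hypothetical weight, best first.
    for seq, count in sorted(repeated.items(),
                             key=lambda item: length_times_repeats(*item),
                             reverse=True):
        print(" ".join(seq), count)
```

This brute-force enumeration is quadratic in sentence length, which is acceptable for short texts; a production tool would more likely use a suffix array or a similar index. A weight based on word frequency would additionally need corpus frequency data, which is omitted here.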
The Ridges Herbology corpus can be downloaded as a whole or as individual documents.