A digital collection of Frisian books, scientific magazines and newspaper articles that can be used to investigate various aspects of Frisian culture, including language and literature. The corpus contains more than 25 million words.
A corpus of dialect speech from Tyneside in North-East England. It includes digitized audio, standard orthographic transcription, phonetic transcription, and part-of-speech tagging.
Ted Pedersen's Ngram Statistics Package, used to identify word Ngrams that appear in large corpora using standard tests of association such as Fisher's exact test, the log likelihood ratio, Pearson's chi-squared test, and the Dice coefficient.
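To illustrate what such association tests measure, here is a minimal Python sketch that scores adjacent word pairs (bigrams) with the Dice coefficient and the log likelihood ratio. It is not the Ngram Statistics Package's own interface; the function name `bigram_scores` and the whitespace tokenization are assumptions made only for this example.

```python
import math
from collections import Counter


def bigram_scores(tokens):
    """Return {(w1, w2): (dice, log_likelihood)} for adjacent word pairs.

    Hypothetical helper for illustration; not part of any package.
    """
    bigrams = list(zip(tokens, tokens[1:]))
    n = len(bigrams)                                  # total bigram count
    pair_counts = Counter(bigrams)
    left_counts = Counter(w1 for w1, _ in bigrams)    # w1 in first position
    right_counts = Counter(w2 for _, w2 in bigrams)   # w2 in second position

    scores = {}
    for (w1, w2), n11 in pair_counts.items():
        n1p = left_counts[w1]
        np1 = right_counts[w2]

        # Dice coefficient: 2 * joint count / sum of marginal counts.
        dice = 2 * n11 / (n1p + np1)

        # 2x2 contingency table: observed cells and their expected values
        # under the assumption that w1 and w2 occur independently.
        observed = [n11, n1p - n11, np1 - n11, n - n1p - np1 + n11]
        expected = [
            n1p * np1 / n,
            n1p * (n - np1) / n,
            (n - n1p) * np1 / n,
            (n - n1p) * (n - np1) / n,
        ]

        # Log likelihood ratio (G^2); cells with zero observations add 0.
        g2 = 2 * sum(
            o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
        )
        scores[(w1, w2)] = (dice, g2)
    return scores
```

Calling `bigram_scores("new york is big and new york is old".split())` gives the pair `("new", "york")` a Dice coefficient of 1.0, because each word occurs only next to the other. Higher scores on either measure suggest a stronger association, and a real corpus would need proper tokenization and far more data for the statistics to be meaningful.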