A parsed corpus of spoken English. Ca 400,000 words from ICE-GB (early 1990s) and 400,000 words from the London-Lund Corpus (late 1960s-early 1980s). The orthographic transcriptions have been normalised and annotated.
Historical dictionary of the Scottish language as written and spoken by lowland Scots in Scotland and Ulster from the 12th century onward. Over eighty thousand full-word entries.
Phonological neighborhood density is known to influence lexical access, speech production as well as perception processes. Lexical competition is thought to be the central concept from which the neighborhood effect emanates: highly competitive neighborhoods are characterized by large degrees of phonemic co-activation, which can delay speech recognition and facilitate speech production. The present study investigates phonetic learning in English as a foreign language in relation to phonological neighborhood density and onset density to see whether dense or sparse neighborhoods are more conducive to the incorporation of novel phonetic detail. In addition, the effect of voice-contrasted minimal pairs (bat-pat) is explored. Results indicate that sparser neighborhoods with weaker lexical competition provide the most optimal phonological environment for phonetic learning. Moreover, novel phonetic details are incorporated faster in neighborhoods without minimal pairs. Results indicate that lexical competition plays a role in the dissemination of phonetic updates in the lexicon of foreign language learners.
The aim of the course is to introduce digital humanities and to describe various aspects of digital content processing.
The course consists of 10 lessons with video material and a PowerPoint presentation with the same content.
Every lesson contains a practical session – either a Jupyter Notebook to work in Python or a text file with a short description of the task. Most of the practical tasks consist of running the programme and analyse the results.
Although the course does not focus on programming, the code can be reused easily in individual projects.
Some experience in running Python code is desirable but not required.
The data set includes training, development and test data from the shared tasks on pronoun-focused machine translation and cross-lingual pronoun prediction from the EMNLP 2015 workshop on Discourse in Machine Translation (DiscoMT2015). The release also contains the submissions to the pronoun-focused machine translation along with the manual annotations used for the official evaluation as well as gold-standard annotations of pronoun coreference for the shared task test set.