The code-switching corpus consists of 5x30-minute conversations between four speakers (i.e. a total of 20 speakers). The speakers are bilingual speakers of Papiamento (a creole langauge spoken in the Dutch Antilles) and Dutch. In the course of their free conversations, they engage in code-switching, that is, they use both languages within the same utterance in systematic ways. The corpus is fully transcribed and glossed, coded for language and word class, in ELAN.
The Netherlands Veterans Institute (VI) hosts about 250 interviews (audio) in which Dutch former military personel speak about their experiences during World War II (interviews about the years 1935-1945) and decolonisation in the Dutch East Indies (1945-1950) and Dutch New Guinea (1960-1962). In the project Living Oral History Workbench these interviews have been indexed by automatic speech recognition techniques. The list of interviews and their metadata are available at the CLARIN Center; researchers may apply to VI for access to the data.