WikiSpeech is a content management system for the web-based creation of speech databases for the development of spoken language technology and for basic research. Its main features are full support for the typical recording, annotation and project administration workflow, easy editing of the speech content, and a fully localizable user interface. To create a new speech database, it is only necessary to open a new project within WikiSpeech, provide a link to any static project information pages and upload the prompt material to be presented to the speakers. Recording and annotation are performed over the web in a platform-independent manner on any Java-compatible computer. WikiSpeech has so far been localized into four languages: German, English, Romanian and Russian.
Wmatrix is a corpus comparison and annotation tool. It is web-based and incorporates the CLAWS POS tagger and the USAS semantic tagger for English. It also generates frequency lists, concordances, key words and key semantic domains by comparative frequency profiling.
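Comparative frequency profiling can be illustrated with a small sketch. It assumes the log-likelihood statistic, a common choice for ranking key words; the entry above does not name the statistic, and the function names `log_likelihood` and `key_words` are hypothetical, not part of Wmatrix.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score for one word.

    a, b: frequency of the word in the target and reference corpus.
    c, d: total number of tokens in the target and reference corpus.
    """
    # Expected frequencies if the word were equally likely in both corpora.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    ll = 0.0
    # A zero observed frequency contributes nothing to the sum.
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def key_words(target_tokens, reference_tokens, top=10):
    """Rank words by how strongly their frequency differs between corpora."""
    t = Counter(target_tokens)
    r = Counter(reference_tokens)
    c, d = sum(t.values()), sum(r.values())
    # Counter returns 0 for words missing from one of the corpora.
    scored = [(w, log_likelihood(t[w], r[w], c, d)) for w in set(t) | set(r)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]
```

Calling `key_words(target, reference)` on two token lists returns the words with the largest frequency difference first. The score alone does not say whether a word is overused or underused in the target corpus; comparing the relative frequencies `a / c` and `b / d` settles that.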
Provides, for each entry, the word itself, its count, frequency class, description, subject area, morphology, relations to other words (e.g. synonymy), links to other words, Dornseiff semantic groups, examples (drawn from spiegel.de and sueddeutsche.de, among others), significant co-occurrences, and significant left and right neighbours.
A tool for designing and running Word Sense Disambiguation (WSD) experiments. The current version, a prototype, facilitates the construction and evaluation of WSD methods in the supervised machine learning paradigm.
Xaira is the current name for a new version of SARA, the text searching software originally developed at OUCS for use with the British National Corpus. This new version has been entirely rewritten as a general-purpose XML search engine that will operate on any corpus of well-formed XML documents. It is, however, best used with TEI-conformant documents.
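The idea of searching a corpus of well-formed XML can be sketched with Python's standard library. This is only an illustration of the general approach, not Xaira's own interface or query language; the function name `find_word` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def find_word(xml_text, word):
    """Return (tag, text) pairs for elements whose own text contains the word.

    Parsing raises xml.etree.ElementTree.ParseError if the input is not
    well-formed. Only each element's leading text is checked; text that
    follows a child element (the "tail") is ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        text = elem.text or ""
        # Case-insensitive match on whitespace-separated tokens.
        if word.lower() in text.lower().split():
            hits.append((elem.tag, text.strip()))
    return hits


sample = "<text><p>The cat sat.</p><p>A dog barked.</p></text>"
print(find_word(sample, "dog"))
```

A real search engine would index the corpus rather than scan it on every query, and would treat punctuation and element structure more carefully than this whitespace split does.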