Glossa is a web-based system for corpus search and results management. It comes with built-in support for CLARIN federated content search as well as for corpora encoded with the IMS Corpus Workbench. It also has a plugin architecture that enables other search engines to be used once a wrapper has been created. Glossa can be freely downloaded and installed on the user's server. It currently supports only monolingual written corpora; support for multilingual corpora and for spoken corpora with audio, video and maps is under development.
70K words. Non-validated sentence segmentation, non-validated POS tagging, manual annotation of syntactic dependencies and dependency labels, manual annotation of semantic roles, and manual annotation of events based on a shallow domain-specific ontology (only for a 31K-word subset of GDT).
Collection of orthographically transcribed, audio-recorded speech, mainly from East Anglia and the South-West, with a smaller collection from Lancashire. The recordings were made in the 1970s and the 1980s by Finnish postgraduates.
The Helsinki Finite-State Transducer software is intended for the implementation of morphological analysers and other tools which are based on weighted and unweighted finite-state transducer technology. The feasibility of the HFST toolkit has been demonstrated by full-fledged open-source implementations of Finnish, Swedish, English, French and Northern Sámi lexicons.