The electronic version of the book “Corpus PAAU 1992: Descriptive Studies, Texts and Vocabulary” includes the texts that have been object of analysis in this project as well as the vocabulary lists that make up the Corpus 92.
domain specific corpus (Law, Economy, Computing, Medicine and Environment as well as a contrastive corpus from the press); EN 3.3 M tokens, SP 33 M tokens, CAT 19 M tokens; EAGLEs pos tagset
Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging are completely automated. The simple interface supports the use in university teaching and leads users/students to fast and substantial results. The CorpusExplorer is open for many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK).
Source code available at https://github.com/notesjor/corpusexplorer2.0