1 - 9 of 9
Number of results to display per page
Search Results
2. Corpus CLUVI
- Publisher:
- TALG Research Group (University of Vigo)
- Type:
- corpus
- Language:
- Basque, Catalan, English, French, Galician, German, Portuguese, and Spanish
- Description:
- Parallel corpus, 22 million words
- Rights:
- Not specified
3. CorpusExplorer
- Creator:
- Rüdiger, Jan Oliver
- Publisher:
- Jan Oliver Rüdiger
- Type:
- tool and toolService
- Subject:
- Corpus Linguisitics, NLP, conll, tei, XML, nlp, Natural Language Processing, linguistics, Linguistics, Computational Linguistics, corpus processing, tagger, POS tagger, lemmatization, text cleaning, CommonCrawl, epub, JSON, Twitter, Pandoc, Wikipedia, digital data, DTA, DSpin, MySQL, ElasticSearch, TextGrid, text corpora, TigerXML, and WeblichtXML
- Language:
- German, English, French, Italian, Dutch, Spanish, Polish, Arabic, Chinese, and Portuguese
- Description:
- Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging are completely automated. The simple interface supports the use in university teaching and leads users/students to fast and substantial results. The CorpusExplorer is open for many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK). Source code available at https://github.com/notesjor/corpusexplorer2.0
- Rights:
- Not specified
4. JRC-Acquis
- Publisher:
- Joint Research Centre of the EU
- Type:
- corpus
- Language:
- Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Hungarian, Italian, Latvian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, and Swedish
- Description:
- The largest parallel corpus, contains EU law, the Acquis Communautaire in 22 languages.
- Rights:
- Not specified
5. Neologismos económicos en las lenguas románicas a través de la prensa
- Publisher:
- Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
- Type:
- lexicalConceptualResource
- Subject:
- terminology database
- Language:
- Catalan, French, Galician, Italian, Portuguese, Romanian, and Spanish
- Description:
- Multilingual terminological resource containing 3.875 entries from the Economics, Finance and Banking domains.
- Rights:
- Not specified
6. PALIC
- Publisher:
- Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
- Type:
- toolService
- Language:
- Catalan, French, Portuguese, and Spanish
- Description:
- A package of tools for the processing of the Corpus Tècnic in Catalan and Spanish. It includes a preprocessor, a PoSTagger and a linguistic disambiguator.
- Rights:
- Not specified
7. Project Gutenberg
- Type:
- corpus
- Language:
- Danish, Dutch, English, Finnish, French, German, Italian, Latin, Portuguese, Russian, Spanish, Swedish, and Telugu
- Description:
- Possibility to download or to browse free electronic books; Angebot: Download von und Online-Zugang zu frei verfügbaren E-Books; deutschsprachige Literatur stellt nur einen Teilbereich der verfügbaren E-Books dar
- Rights:
- Not specified
8. Speecon databases
- Type:
- corpus
- Language:
- Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Polish, Portuguese, Russian, Spanish, Swedish, Turkish, Chinese, Hebrew, Japanese, Korean, and Thai
- Description:
- 28 speech databases containing broadband recordings from 550 adults and 50 children per language. Contains interesting phonetically rich material. All orthographically transcribed. Speaker information included for gender, age, accent. Including pronunciation lexicon.
- Rights:
- Not specified
9. TreeTagger
- Publisher:
- University of Stuttgart
- Type:
- toolService
- Subject:
- POS tagger
- Language:
- Bulgarian, Dutch, English, French, German, Modern Greek (1453-), Italian, Portuguese, Russian, Spanish, and Swahili (macrolanguage)
- Description:
- A part-of-speech tagger and lemmatizer for several languages.
- Rights:
- Not specified