Search Results
2. Basic vocabulary on the Human Genome
- Publisher:
- Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
- Type:
- lexicalConceptualResource
- Language:
- Catalan, English, French, Galician, Italian, Portuguese, and Spanish
- Description:
- A vocabulary, produced through cooperation among the groups of the REALITER network, that collects the basic terminology most often used in texts about genomics. It contains equivalents in English, Peninsular and Latin American Spanish, French, Italian, Galician, Portuguese, and Catalan.
- Rights:
- Not specified
3. CorpusExplorer
- Creator:
- Rüdiger, Jan Oliver
- Publisher:
- Jan Oliver Rüdiger
- Type:
- tool and toolService
- Subject:
- Corpus Linguistics, NLP, conll, tei, XML, nlp, Natural Language Processing, linguistics, Linguistics, Computational Linguistics, corpus processing, tagger, POS tagger, lemmatization, text cleaning, CommonCrawl, epub, JSON, Twitter, Pandoc, Wikipedia, digital data, DTA, DSpin, MySQL, ElasticSearch, TextGrid, text corpora, TigerXML, and WeblichtXML
- Language:
- German, English, French, Italian, Dutch, Spanish, Polish, Arabic, Chinese, and Portuguese
- Description:
- Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning, or tagging are completely automated. The simple interface supports use in university teaching and leads users and students to fast, substantial results. The CorpusExplorer supports many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK). Source code is available at https://github.com/notesjor/corpusexplorer2.0
- Rights:
- Not specified
4. FreeLing
- Publisher:
- Centro de Tecnologías y Aplicaciones del Lenguaje y del Habla (TALP)
- Type:
- toolService
- Language:
- Catalan, English, Galician, Italian, Portuguese, and Welsh
- Description:
- Open-source language analysis tool suite: tokenizer, stemmer/lemmatizer, named entity recognizer, chunker/segmenter, morphosyntactic tagger, syntactic tagger, corpus processor, morphological tagger, semantic tagger, analyzer, word sense disambiguator.
- Rights:
- Not specified
5. JIRS
- Publisher:
- Grid and High Performance Computing Group, ITACA, Universidad Politécnica de Valencia and Universidad de Alicante
- Type:
- toolService
- Language:
- Arabic, English, French, Italian, Oromo, and Urdu
- Description:
- JIRS is a passage retrieval system especially suited to question answering. It can be adapted to other languages very easily. Task (written language): information retrieval. Applications: question answering. Environment: OS-independent. Access: GPLv3.
- Rights:
- Not specified
6. JRC-Acquis
- Publisher:
- Joint Research Centre of the EU
- Type:
- corpus
- Language:
- Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Hungarian, Italian, Latvian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, and Swedish
- Description:
- The largest parallel corpus; it contains EU law (the Acquis Communautaire) in 22 languages.
- Rights:
- Not specified
7. Project Gutenberg
- Type:
- corpus
- Language:
- Danish, Dutch, English, Finnish, French, German, Italian, Latin, Portuguese, Russian, Spanish, Swedish, and Telugu
- Description:
- Free electronic books to download or browse online. German-language literature makes up only part of the available e-books.
- Rights:
- Not specified
8. SenTube
- Publisher:
- Machine Learning and NLP group at Trento
- Type:
- corpus
- Subject:
- sentiment analysis
- Language:
- English and Italian
- Description:
- Sentiment analysis of YouTube videos with joint models of text and speech.
- Rights:
- Not specified
9. SpeechDat-Car databases
- Type:
- corpus
- Language:
- Danish, Dutch, English, Finnish, French, German, Modern Greek (1453-), Italian, and Spanish
- Description:
- Nine speech databases for training and testing multilingual speech recognition applications in the car environment. Contains parallel four-channel in-car recordings and a GSM channel, with phonetically rich material. All recordings are orthographically transcribed. Speaker information covers gender, age, and accent. Includes a pronunciation lexicon.
- Rights:
- Not specified
10. Speecon databases
- Type:
- corpus
- Language:
- Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Polish, Portuguese, Russian, Spanish, Swedish, Turkish, Chinese, Hebrew, Japanese, Korean, and Thai
- Description:
- 28 speech databases containing broadband recordings from 550 adults and 50 children per language, with phonetically rich material. All recordings are orthographically transcribed. Speaker information covers gender, age, and accent. Includes a pronunciation lexicon.
- Rights:
- Not specified