Search Results
2. CorpusExplorer
- Creator:
- Rüdiger, Jan Oliver
- Publisher:
- Jan Oliver Rüdiger
- Type:
- tool and toolService
- Subject:
- Corpus Linguistics, NLP, conll, tei, XML, Natural Language Processing, Linguistics, Computational Linguistics, corpus processing, tagger, POS tagger, lemmatization, text cleaning, CommonCrawl, epub, JSON, Twitter, Pandoc, Wikipedia, digital data, DTA, DSpin, MySQL, ElasticSearch, TextGrid, text corpora, TigerXML, and WeblichtXML
- Language:
- German, English, French, Italian, Dutch, Spanish, Polish, Arabic, Chinese, and Portuguese
- Description:
- Software for corpus linguists and text/data mining enthusiasts. CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning, and tagging are fully automated. The simple interface supports use in university teaching and leads users and students quickly to substantial results. CorpusExplorer supports many standards and formats (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK). Source code is available at https://github.com/notesjor/corpusexplorer2.0
- Rights:
- Not specified
3. JRC-Acquis
- Publisher:
- Joint Research Centre of the EU
- Type:
- corpus
- Language:
- Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Hungarian, Italian, Latvian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, and Swedish
- Description:
- The largest parallel corpus; it contains EU law (the Acquis Communautaire) in 22 languages.
- Rights:
- Not specified
4. L1 & L2 Acquisition Marzena Watorek French Project
- Publisher:
- Max Planck Institute for Psycholinguistics
- Type:
- corpus
- Subject:
- language acquisition corpus
- Language:
- French and Polish
- Description:
- Language Acquisition corpus
- Rights:
- Not specified
5. L2 Acquisition P-Moll Norbert Dittmar
- Publisher:
- Max Planck Institute for Psycholinguistics
- Type:
- corpus
- Language:
- German, Italian, and Polish
- Description:
- Language Acquisition corpus
- Rights:
- Not specified
6. Morfeusz
- Publisher:
- Institute of Computer Science, Polish Academy of Sciences
- Type:
- toolService
- Language:
- Polish
- Description:
- Morfeusz is a morphological analyser (not a stemmer, not a tagger) for Polish, without a guesser, so it is effectively a morphological dictionary. Note that it is a library, not a ready-to-run program. Modules developed by external authors allow Morfeusz to be used from Java and Python.
- Rights:
- Not specified
7. National Corpus of Polish
- Publisher:
- Shared initiative of the Institute of Computer Science at the Polish Academy of Sciences (IPI PAN), the Institute of Polish Language at the Polish Academy of Sciences, Polish Scientific Publishers PWN, and the Department of Computational and Corpus Linguistics at the University of Łódź
- Type:
- corpus
- Language:
- Polish
- Description:
- In (advanced) preparation: a reference corpus of the Polish language containing hundreds of millions of words.
- Rights:
- Not specified
8. SpeechDat-East databases
- Type:
- corpus
- Subject:
- These databases serve as an important resource for the performance of voice-driven teleservice systems in practical implementations
- Language:
- Czech, Hungarian, Polish, Russian, and Slovak
- Description:
- Five telephone databases recorded over the PSTN. Contains interesting phonetically rich material. All recordings are orthographically transcribed. Speaker information is included for gender, age, and accent. A pronunciation lexicon is included.
- Rights:
- Not specified
9. Speecon databases
- Type:
- corpus
- Language:
- Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Polish, Portuguese, Russian, Spanish, Swedish, Turkish, Chinese, Hebrew, Japanese, Korean, and Thai
- Description:
- 28 speech databases containing broadband recordings from 550 adults and 50 children per language. Contains interesting phonetically rich material. All recordings are orthographically transcribed. Speaker information is included for gender, age, and accent. A pronunciation lexicon is included.
- Rights:
- Not specified
10. Świgra
- Publisher:
- Institute of Computer Science, Polish Academy of Sciences
- Type:
- toolService
- Language:
- Polish
- Description:
- Implementation of Świdziński's formal grammar of Polish. Requires a parser (the Birnam parser is available as a separate tool) and a morphological analyser. No free analyser for Polish is available; Morfeusz can be used with restrictions, in which case the whole set is available for academic and non-commercial use only.
- Rights:
- Not specified