« Previous |
1 - 10 of 22
|
Next »
Number of results to display per page
Search Results
2. CorPipe 23 multilingual CorefUD 1.1 model (corpipe23-corefud1.1-231206)
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- coreference resolution, CorPipe, and CorefUD
- Language:
- Catalan, Czech, German, English, Spanish, French, Hungarian, Lithuanian, Norwegian Bokmål, Norwegian Nynorsk, Polish, Russian, and Turkish
- Description:
- The `corpipe23-corefud1.1-231206` is a `mT5-large`-based multilingual model for coreference resolution usable in CorPipe 23 (https://github.com/ufal/crac2023-corpipe). It is released under the CC BY-NC-SA 4.0 license. The model is language agnostic (no _corpus id_ on input), so it can be used to predict coreference in any `mT5` language (for zero-shot evaluation, see the paper). However, note that the empty nodes must be present already on input, they are not predicted (the same settings as in the CRAC23 shared task).
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
3. CorpusExplorer
- Creator:
- Rüdiger, Jan Oliver
- Publisher:
- Jan Oliver Rüdiger
- Type:
- tool and toolService
- Subject:
- Corpus Linguisitics, NLP, conll, tei, XML, nlp, Natural Language Processing, linguistics, Linguistics, Computational Linguistics, corpus processing, tagger, POS tagger, lemmatization, text cleaning, CommonCrawl, epub, JSON, Twitter, Pandoc, Wikipedia, digital data, DTA, DSpin, MySQL, ElasticSearch, TextGrid, text corpora, TigerXML, and WeblichtXML
- Language:
- German, English, French, Italian, Dutch, Spanish, Polish, Arabic, Chinese, and Portuguese
- Description:
- Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging are completely automated. The simple interface supports the use in university teaching and leads users/students to fast and substantial results. The CorpusExplorer is open for many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK). Source code available at https://github.com/notesjor/corpusexplorer2.0
- Rights:
- Not specified
4. CST's lemmatiser
- Publisher:
- Center for Sprogteknologi, University of Copenhagen
- Type:
- toolService
- Language:
- Danish, Dutch, English, German, Modern Greek (1453-), Icelandic, Norwegian, Russian, Slovenian, and Swedish
- Description:
- 1) Fully automatic rule based lemmatization of inflected languages 2) Fully automatic training of lemmatization rules based on full form-lemma list
- Rights:
- Not specified
5. Cyril Belica : Kookkurrenzdatenbank CCDB
- Publisher:
- Institut für Deutsche Sprache
- Type:
- toolService
- Language:
- German
- Description:
- A co-occurrence database, developed by the Institut fuer Deutsche Sprache, for research in the field of collocation analysis in modern German. The database holds over 200,000 analysed words that can be browsed or searched and shown in context.
- Rights:
- Not specified
6. Gesprächanalytisches Informationssystem (GAIS)
- Publisher:
- Institut für Deutsche Sprache
- Type:
- toolService
- Language:
- German
- Description:
- web-based information system on scientific community (news, events, persons, job market, mailing list, database on research projects and corpora, bibliography, glossary and links) and recording equipment/software; disciplinary scope: research on conversation and discourse analysis and spoken language
- Rights:
- Not specified
7. KAMOKO-Digitalizer
- Creator:
- Rüdiger, Jan Oliver
- Publisher:
- Rüdiger, Jan Oliver
- Type:
- tool and toolService
- Subject:
- learner corpus, corpus, and annotation
- Language:
- German
- Description:
- This editor was developed especially for the needs of the KAMOKO project (https://lindat.mff.cuni.cz/repository/xmlui/handle/11372/LRT-3261). The editor allows the quick entry of example sentences and sentence variants as well as the corresponding speaker ratings.
- Rights:
- Affero General Public License 3 (AGPL-3.0), http://opensource.org/licenses/AGPL-3.0, and PUB
8. Lingua::Interset 2.026
- Creator:
- Zeman, Daniel
- Publisher:
- Charles University, Faculty of Mathematics and Physics
- Type:
- tool and toolService
- Subject:
- morphology, part of speech, conversion, and tagset
- Language:
- Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Japanese, Multiple languages, and Portuguese
- Description:
- Lingua::Interset is a universal morphosyntactic feature set to which all tagsets of all corpora/languages can be mapped. Version 2.026 covers 37 different tagsets of 21 languages. Limited support of the older drivers for other languages (which are not included in this package but are available for download elsewhere) is also available; these will be fully ported to Interset 2 in future. Interset is implemented as Perl libraries. It is also available via CPAN.
- Rights:
- Artistic License (Perl) 1.0, http://opensource.org/licenses/Artistic-Perl-1.0, and PUB
9. MCSQ Translation Models (en-de) (v1.0)
- Creator:
- Variš, Dušan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- machine translation, transformer, and neural machine translation
- Language:
- English and German
- Description:
- En-De translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). The models were trained using the MCSQ social surveys dataset (available at https://repo.clarino.uib.no/xmlui/bitstream/handle/11509/142/mcsq_v3.zip). Their main use should be in-domain translation of social surveys. Models are compatible with Tensor2tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on MCSQ test set (BLEU): en->de: 67.5 (train: genuine in-domain MCSQ data only) de->en: 75.0 (train: additional in-domain backtranslated MCSQ data) (Evaluated using multeval: https://github.com/jhclark/multeval)
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
10. Moses Web Demo
- Creator:
- Bojar, Ondřej, Cífka, Ondřej, Pecina, Pavel, and Tamchyna, Aleš
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- toolService and tool
- Subject:
- machine translation, web service, and demo
- Language:
- Czech, English, Russian, Ukrainian, French, and German
- Description:
- An interactive web demo of selected ÚFAL MT systems. and FP7-ICT-2011-7-288487 MosesCore
- Rights:
- Not specified
- « Previous
- Next »
- 1
- 2
- 3