Search Results
232. tfidf
- Publisher:
- Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
- Type:
- toolService
- Description:
- It calculates the Term Frequency and the Inverse Document Frequency of a word in a given corpus (a statistical measure used to evaluate how important a word is to a document in a collection or corpus).
- Rights:
- Not specified
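The record does not say which TF-IDF variant the service implements. As a rough illustration only, the following sketch uses one common formulation (relative term frequency and an unsmoothed logarithmic inverse document frequency); the function name `tf_idf` and the toy corpus are assumptions made for this example and are not part of the tool.

```python
import math
from collections import Counter


def tf_idf(term, doc, corpus):
    """Return (tf, idf, tf * idf) for `term` in `doc`.

    `doc` is a list of tokens and `corpus` is a list of such token lists.
    Assumed formulation (not taken from the catalogued tool):
      tf  = occurrences of term in doc / number of tokens in doc
      idf = log(number of documents / number of documents containing term)
    """
    counts = Counter(doc)
    tf = counts[term] / len(doc) if doc else 0.0
    # Document frequency: how many documents contain the term at least once.
    df = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf, idf, tf * idf


# Hypothetical toy corpus, already tokenized on whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

tf, idf, score = tf_idf("cat", corpus[0], corpus)
print(f"tf={tf:.3f} idf={idf:.3f} tf-idf={score:.3f}")
```

A word that occurs in every document, such as "the" in this toy corpus, gets an inverse document frequency of zero under this formulation, so its score is zero regardless of how often it appears in a single document.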
233. The Internet Language Reference Book
- Creator:
- Šmerk, Pavel, Pravdová, Markéta, Beneš, Martin, Černá, Anna, Hlaváčková, Dana, Chromý, Jan, Konečná, Hana, Kopecký, Jakub, Mžourková, Hana, Pala, Karel, Prokšová, Hana, Prošek, Martin, Smejkalová, Kamila, Svobodová, Ivana, and Uhlířová, Ludmila
- Publisher:
- Institute of Czech Language, Czech Academy of Sciences and Masaryk University, NLP Centre
- Type:
- toolService and service
- Subject:
- literature
- Language:
- Czech and English
- Description:
- The ILRB was created by two cooperating teams: the team of the Institute of Czech Language, Czech Academy of Sciences, and the team of the NLP Centre at the Faculty of Informatics, Masaryk University (2004-2008). The tool consists of two sections: a wordlist section and a reference (explanatory) section. Comments and remarks are welcome and should be sent to poradna@ujc.cas.cz. 1. Wordlist section: It contains more than 60 000 dictionary entries and is based on the glossary of the School Rules of Czech Orthography, the Dictionary of the Literary Czech, and selected entries from the New Dictionary of Words of Foreign Origin and the Dictionary of Neologisms. The entries typically include the information that users ask about most frequently. Inflectional forms of individual words are also offered in tables, generated by the morphological analyzer ajka created at the Faculty of Informatics, MU. The wordlist section is linked to the reference section through hypertext links. 2. Reference section: It explains linguistic phenomena described in the Rules of Czech Orthography and contemporary Czech grammars that are frequently and repeatedly asked about by users of the Linguistic Advisory Line at the Institute of Czech Language. The explanations deal with typical spelling problems and include appropriate recommendations. The ILRB is regularly updated and completed; new expressions are added and entries are made more precise. Supported by the Academy of Sciences of the Czech Republic (project 1ET200610406) and the Ministry of Education, Youth and Sports (projects LM2010013, LC536 and 2C06009).
- Rights:
- Not specified
234. The Model latinpipe-evalatin24-240520 for LatinPipe 2024
- Creator:
- Straka, Milan, Straková, Jana, and Gamba, Federica
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- LatinPipe, EvaLatin 2024, POS tagging, lemmatization, and dependency parsing
- Language:
- Latin
- Description:
- The latinpipe-evalatin24-240520 is a PhilBerta-based model for LatinPipe 2024 <https://github.com/ufal/evalatin2024-latinpipe>, performing tagging, lemmatization, and dependency parsing of Latin, based on the winning entry to the EvaLatin 2024 <https://circse.github.io/LT4HALA/2024/EvaLatin> shared task. It is released under the CC BY-NC-SA 4.0 license.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
235. The World Atlas of Language Structures Online
- Publisher:
- Max Planck Digital Library, http://wals.info/author
- Type:
- toolService
- Description:
- WALS is a large database of structural (phonological, grammatical, lexical) properties of languages gathered from descriptive materials (such as reference grammars) by a team of more than 40 authors (many of them the leading authorities on the subject).
- Rights:
- Creative Commons Attribution-Noncommercial-No Derivative Works 2.0 Germany - http://creativecommons.org/licenses/by-nc-nd/2.0/de/deed.en and http://wals.info/about/legal
236. THEaiTRobot 1.0
- Creator:
- Rosa, Rudolf, Dušek, Ondřej, Kocmi, Tom, Mareček, David, Musil, Tomáš, Schmidtová, Patrícia, Jurko, Dominik, Bojar, Ondřej, Hrbek, Daniel, Košťák, David, Kinská, Martina, Nováková, Marie, Doležal, Josef, and Vosecká, Klára
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), The Švanda Theatre in Smíchov, and The Academy of Performing Arts in Prague, Theatre Faculty (DAMU)
- Type:
- tool and toolService
- Subject:
- theatre and natural language generation
- Language:
- English and Czech
- Description:
- The THEaiTRobot 1.0 tool allows the user to interactively generate scripts for individual theatre play scenes. The tool is based on the GPT-2 XL generative language model, used without any fine-tuning, as we found that with a prompt formatted as part of a theatre play script, the model usually generates a continuation that retains the format. We encountered numerous problems when generating scripts in this way. We managed to tackle some of them with various adjustments, but others remain to be solved in a future version. THEaiTRobot 1.0 was used to generate the first THEaiTRE play, "AI: Když robot píše hru" ("AI: When a robot writes a play").
- Rights:
- The MIT License (MIT), http://opensource.org/licenses/mit-license.php, and PUB
237. THEaiTRobot 2.0
- Creator:
- Rosa, Rudolf, Dušek, Ondřej, Kocmi, Tom, Mareček, David, Musil, Tomáš, Schmidtová, Patrícia, Jurko, Dominik, Bojar, Ondřej, Hrbek, Daniel, Košťák, David, Kinská, Martina, Nováková, Marie, Doležal, Josef, Vosecká, Klára, Zakhtarenko, Alisa, and Obaid, Saad
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), The Švanda Theatre in Smíchov, and The Academy of Performing Arts in Prague, Theatre Faculty (DAMU)
- Type:
- tool and toolService
- Subject:
- theatre and natural language generation
- Language:
- English and Czech
- Description:
- The THEaiTRobot 2.0 tool allows the user to interactively generate scripts for individual theatre play scenes. The previous version of the tool (http://hdl.handle.net/11234/1-3507) was based on the GPT-2 XL generative language model, used without any fine-tuning, as we found that with a prompt formatted as part of a theatre play script, the model usually generates a continuation that retains the format. The current version also uses vanilla GPT-2 by default, but can instead use a GPT-2 medium model fine-tuned on theatre play scripts (as well as film and TV series scripts). Apart from the basic "flat" generation, which uses a theatrical starting prompt and the script model, the tool also features a second, hierarchical variant. In its first step, a play synopsis is generated from the play's title using a synopsis model (GPT-2 medium fine-tuned on synopses of theatre plays, as well as film, TV series and book synopses). The synopsis is then used as input for the second step, which uses the script model. The models to use are selected by setting the MODEL variable in start_server.sh and start_syn_server.sh. THEaiTRobot 2.0 was used to generate the second THEaiTRE play, "Permeation/Prostoupení".
- Rights:
- The MIT License (MIT), http://opensource.org/licenses/mit-license.php, and PUB
238. Tilburg Memory-Based Learner
- Publisher:
- University of Antwerp, University of Tilburg
- Type:
- toolService
- Description:
- An elegantly simple and robust machine-learning method, based on the combination of ideas from a number of MBL implementations, resulting in a useful tool for NLP research.
- Rights:
- Not specified
239. Tilde English-Latvian SMT system
- Publisher:
- Tilde
- Type:
- toolService
- Language:
- English
- Description:
- English-Latvian factored SMT system trained on different parallel texts
- Rights:
- Not specified
240. TMODS:ENG-CZE -- query translation
- Creator:
- Tamchyna, Aleš and Bojar, Ondřej
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- suiteOfTools and toolService
- Subject:
- machine translation and query translation
- Language:
- Czech and English
- Description:
- AMALACH project component TMODS:ENG-CZE; machine translation of queries from Czech to English. This archive contains models for the Moses decoder (binarized and pruned to allow for real-time translation) and configuration files for the MTMonkey toolkit. The aim of this package is to provide a full service for Czech->English translation that can easily be used as a component in a larger software solution. (The required tools are freely available and an installation guide is included in the package.) The translation models were trained on the CzEng 1.0 corpus and Europarl. The monolingual data for LM estimation additionally contains WMT news crawls until 2013.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB