Subject: morphological dictionary - LINDAT/CLARIAH-CZ Catalog Search Results

1. Italian Function Words

Creator:: Grella, Matteo
Publisher:: Matteo Grella
Type:: text, machineReadableDictionary, and lexicalConceptualResource
Subject:: morphological dictionary and function words
Language:: Italian
Description:: This dictionary is a curated list of Italian function words, distributed as a text file in JSON Lines format. It is particularly useful for tasks such as POS tagging or syntactic parsing. It contains 999 single-word forms and 2501 multi-word forms. Each entry may have the following grammatical features: lemma, pos, mood, tense, person, number, gender, case, degree.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
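
A minimal sketch of how such a JSON Lines file could be read, assuming one JSON object per line. The feature names come from the description above; the "form" key and the two sample records are hypothetical illustrations, not taken from the actual resource.

```python
import json

# Hypothetical sample lines: the "form" key and all values are assumptions
# made for illustration, not copied from the real dictionary.
SAMPLE = (
    '{"form": "il", "lemma": "il", "pos": "DET", "number": "sing", "gender": "masc"}\n'
    '{"form": "di", "lemma": "di", "pos": "ADP"}\n'
)

# Feature names listed in the catalog description; any of them may be absent.
FEATURES = ("lemma", "pos", "mood", "tense", "person",
            "number", "gender", "case", "degree")


def parse_jsonl(text):
    """Parse JSON Lines text into dicts with every feature key present."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        entry = {"form": record.get("form")}
        # Missing features become None so downstream code can rely on the keys.
        entry.update({name: record.get(name) for name in FEATURES})
        entries.append(entry)
    return entries


entries = parse_jsonl(SAMPLE)
```

Filling absent features with None keeps every entry the same shape, which is convenient when the entries feed a tagger or parser lookup table.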

2. MorfFlex CZ 160310

Creator:: Hajič, Jan and Hlaváčová, Jaroslava
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicalConceptualResource, and computationalLexicon
Subject:: morphological dictionary, morphology, and Czech
Language:: Czech
Description:: Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. Currently it contains full morphological information for each covered wordform, as well as some derivational, semantic and named entity information.
Rights:: Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), http://creativecommons.org/licenses/by-nc-sa/3.0/, and PUB

3. MorfFlex CZ 161115

Creator:: Hajič, Jan and Hlaváčová, Jaroslava
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicalConceptualResource, and computationalLexicon
Subject:: morphological dictionary, morphology, and Czech
Language:: Czech
Description:: Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. Currently it contains full morphological information for each covered wordform, as well as some derivational, semantic and named entity information.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

4. MorfFlex CZ 2.0

Creator:: Hajič, Jan, Hlaváčová, Jaroslava, Mikulová, Marie, Straka, Milan, and Štěpánková, Barbora
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicalConceptualResource, and computationalLexicon
Subject:: morphological dictionary, morphology, and Czech
Language:: Czech
Description:: MorfFlex CZ 2.0 is the Czech morphological dictionary developed originally by Jan Hajič as a spelling checker and lemmatization dictionary. MorfFlex is a flat list of lemma-tag-wordform triples. For each wordform, full inflectional information is coded in a positional tag. Wordforms are organized into entries (paradigm instances, or paradigms for short) according to their formal morphological behavior. Each paradigm (a set of wordforms) is identified by a unique lemma. Apart from traditional morphological categories, the description also contains some semantic, stylistic and derivational information. For more details, see the comprehensive specification of the Czech morphological annotation: http://ufal.mff.cuni.cz/techrep/tr64.pdf
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
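
The triple structure described above can be sketched as follows. This is an illustration under stated assumptions: the tab-separated layout, the column order (lemma, tag, wordform), the sample rows, and the names of the first five tag positions are not confirmed by this record and should be checked against the specification linked in the description.

```python
from collections import defaultdict

# Hypothetical sample rows: lemma <TAB> positional tag <TAB> wordform.
# Separator, column order and tag values are assumptions for illustration.
SAMPLE = (
    "žena\tNNFS1-----A----\tžena\n"
    "žena\tNNFS2-----A----\tženy\n"
)

# Assumed meaning of the first five tag positions only; the remaining
# positions are deliberately left undecoded here.
LEADING_POSITIONS = ("pos", "subpos", "gender", "number", "case")


def read_triples(text):
    """Group (wordform, tag) pairs into paradigms keyed by lemma."""
    paradigms = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        lemma, tag, form = line.split("\t")
        paradigms[lemma].append((form, tag))
    return dict(paradigms)


def decode_leading(tag):
    """Map the first characters of a positional tag to category names."""
    return dict(zip(LEADING_POSITIONS, tag))


paradigms = read_triples(SAMPLE)
features = decode_leading("NNFS2-----A----")
```

Keying the paradigms by lemma mirrors the description's statement that each paradigm is identified by a unique lemma.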

5. MorfFlex SK 170914

Creator:: Hajič, Jan and Hric, Jan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, computationalLexicon, and lexicalConceptualResource
Subject:: Slovak and morphological dictionary
Language:: Slovak
Description:: Slovak morphological dictionary modeled after the Czech one. It consists of (word form, lemma, POS tag) triples, reusing the Czech morphological system for POS tags and lemma descriptions.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

6. MorfoCzech

Creator:: Pelegrinová, Kateřina, Elšík, Viktor, Čech, Radek, and Mačutek, Ján
Publisher:: University of Ostrava
Type:: text, lexicon, and lexicalConceptualResource
Subject:: word segmentation, morphology, and morphological dictionary
Language:: Czech
Description:: A dictionary of morphologically segmented word forms in Czech. Rules of manual segmentation are described in Pelegrinová, K., Mačutek, J., Čech, R. (2021). The Menzerath-Altmann law as the relation between lengths of words and morphemes in Czech. Jazykovedný časopis, 72, 405-414. The dictionary is based on short stories, fairy tales, letters and studies written by Karel Čapek.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

7. MorfoCzech 1.1

Creator:: Pelegrinová, Kateřina, Elšík, Viktor, Čech, Radek, and Mačutek, Ján
Publisher:: University of Ostrava
Type:: text, lexicon, and lexicalConceptualResource
Subject:: word segmentation, morphology, and morphological dictionary
Language:: Czech
Description:: A dictionary of morphologically segmented word forms in Czech. Rules of manual segmentation are described in Pelegrinová, K., Mačutek, J., Čech, R. (2021). The Menzerath-Altmann law as the relation between lengths of words and morphemes in Czech. Jazykovedný časopis, 72, 405-414. The dictionary is based on short stories, fairy tales, letters and studies written by Karel Čapek.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

8. Open morphology of Finnish

Creator:: Pirinen, Tommi A, Listenmaa, Inari, Johnson, Ryan, Tyers, Francis M., and Kuokkala, Juha
Publisher:: University of Helsinki
Type:: tool and toolService
Subject:: morphological analysis and morphological dictionary
Language:: Finnish
Description:: Omorfi is a free and open-source project containing various tools and data for handling Finnish texts in a linguistically motivated manner. The main components of this repository are: 1) a lexical database containing hundreds of thousands of words (cf. lexical statistics), 2) a collection of scripts to convert the lexical database into formats used by upstream NLP tools (cf. lexical processing), 3) an autotools setup to build and install (or package, or deploy) the scripts, the database, and simple APIs / convenience processing tools, and 4) a collection of relatively simple APIs for a selection of languages and scripts to apply the NLP tools and access the database.
Rights:: GNU General Public Licence, version 3, http://opensource.org/licenses/GPL-3.0, and PUB

9. Universal Segmentations 1.0 (UniSegments 1.0)

Creator:: Žabokrtský, Zdeněk, Bafna, Nyati, Bodnár, Jan, Kyjánek, Lukáš, Svoboda, Emil, Ševčíková, Magda, Vidra, Jonáš, Angle, Sachi, Ansari, Ebrahim, Arkhangelskiy, Timofey, Batsuren, Khuyagbaatar, Bella, Gábor, Bertinetto, Pier Marco, Bonami, Olivier, Celata, Chiara, Daniel, Michael, Fedorenko, Alexei, Filko, Matea, Giunchiglia, Fausto, Haghdoost, Hamid, Hathout, Nabil, Khomchenkova, Irina, Khurshudyan, Victoria, Levonian, Dmitri, Litta, Eleonora, Medvedeva, Maria, Muralikrishna, S. N., Namer, Fiammetta, Nikravesh, Mahshid, Padó, Sebastian, Passarotti, Marco, Plungian, Vladimir, Polyakov, Alexey, Potapov, Mihail, Pruthwik, Mishra, Rao B, Ashwath, Rubakov, Sergei, Samar, Husain, Sharma, Dipti Misra, Šnajder, Jan, Šojat, Krešimir, Štefanec, Vanja, Talamo, Luigi, Tribout, Delphine, Vodolazsky, Daniil, Vydrin, Arseniy, Zakirova, Aigul, and Zeller, Britta
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, lexicon, and lexicalConceptualResource
Subject:: universal segmentations, morphological segmentation, word segmentation, segmentation, morphology, morphemes, morphological dictionary, unisegments, morph, and multilingual
Language:: Czech, Catalan, German, English, Persian, Finnish, French, Serbo-Croatian, Croatian, Hungarian, Italian, Komi-Zyrian, Latin, Moksha, Mari (Russia), Mongolian, Erzya, Polish, Portuguese, Russian, Spanish, Swedish, Tajik, Udmurt, Armenian, Bengali, Hindi, Malayalam, Marathi, and Kannada
Description:: Universal Segmentations (UniSegments) is a collection of lexical resources capturing morphological segmentations harmonised into a cross-linguistically consistent annotation scheme for many languages. The annotation scheme consists of simple tab-separated columns that store a word and its morphological segmentations, along with information about the word and the segmented units, e.g., part-of-speech categories and types of morphs/morphemes. The current public version of the collection contains 38 harmonised segmentation datasets covering 30 different languages.
Rights:: Universal Segmentations 1.0 License Terms, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-unisegs-1.0, and PUB
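
As a rough illustration of working with tab-separated segmentation data of the kind described above, the sketch below reads a simplified three-column layout and computes the mean number of morphs per word. The column layout (word, part of speech, segmentation), the " + " morph separator, and the sample rows are hypothetical; the actual UniSegments files may be organised differently.

```python
# Hypothetical sample rows: word <TAB> POS <TAB> morphs joined by " + ".
# Layout, separator and values are assumptions for illustration only.
SAMPLE = (
    "unhappiness\tNOUN\tun + happi + ness\n"
    "cats\tNOUN\tcat + s\n"
)


def read_segmentations(text, sep=" + "):
    """Parse tab-separated rows into dicts holding a list of morphs."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        word, pos, segmentation = line.split("\t")
        rows.append({"word": word, "pos": pos,
                     "morphs": segmentation.split(sep)})
    return rows


rows = read_segmentations(SAMPLE)
# Average number of morphs per word across the parsed rows.
mean_morphs = sum(len(row["morphs"]) for row in rows) / len(rows)
```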

10. Word representations for multiple languages

Creator:: Müller, Thomas and Schütze, Hinrich
Publisher:: Center for Information and Language Processing, University of Munich
Type:: text and corpus
Subject:: morphological dictionary, morphological analysis, and PoS tagging
Language:: English, German, Latin, Hungarian, Spanish, and Czech
Description:: Dictionaries with different representations for various languages. Representations include Brown clusters of different sizes and morphological dictionaries extracted using different morphological analyzers. All representations cover the 250,000 most frequent word types in the Wikipedia version of the respective language. Analyzers used: MAGYARLANC (Hungarian, Zsibrita et al. (2013)), FREELING (English and Spanish, Padro and Stanilovsky (2012)), SMOR (German, Schmid et al. (2004)), a morphological analyzer from Charles University (Czech, Hajic (2001)), and LATMOR (Latin, Springmann et al. (2014)).
Rights:: Creative Commons - Attribution 3.0 Unported (CC BY 3.0), http://creativecommons.org/licenses/by/3.0/, and PUB
