dc.contributor.author |
Žabokrtský, Zdeněk |
dc.contributor.author |
Bafna, Nyati |
dc.contributor.author |
Bodnár, Jan |
dc.contributor.author |
Kyjánek, Lukáš |
dc.contributor.author |
Svoboda, Emil |
dc.contributor.author |
Ševčíková, Magda |
dc.contributor.author |
Vidra, Jonáš |
dc.contributor.author |
Angle, Sachi |
dc.contributor.author |
Ansari, Ebrahim |
dc.contributor.author |
Arkhangelskiy, Timofey |
dc.contributor.author |
Batsuren, Khuyagbaatar |
dc.contributor.author |
Bella, Gábor |
dc.contributor.author |
Bertinetto, Pier Marco |
dc.contributor.author |
Bonami, Olivier |
dc.contributor.author |
Celata, Chiara |
dc.contributor.author |
Daniel, Michael |
dc.contributor.author |
Fedorenko, Alexei |
dc.contributor.author |
Filko, Matea |
dc.contributor.author |
Giunchiglia, Fausto |
dc.contributor.author |
Haghdoost, Hamid |
dc.contributor.author |
Hathout, Nabil |
dc.contributor.author |
Khomchenkova, Irina |
dc.contributor.author |
Khurshudyan, Victoria |
dc.contributor.author |
Levonian, Dmitri |
dc.contributor.author |
Litta, Eleonora |
dc.contributor.author |
Medvedeva, Maria |
dc.contributor.author |
Muralikrishna, S. N. |
dc.contributor.author |
Namer, Fiammetta |
dc.contributor.author |
Nikravesh, Mahshid |
dc.contributor.author |
Padó, Sebastian |
dc.contributor.author |
Passarotti, Marco |
dc.contributor.author |
Plungian, Vladimir |
dc.contributor.author |
Polyakov, Alexey |
dc.contributor.author |
Potapov, Mihail |
dc.contributor.author |
Pruthwik, Mishra |
dc.contributor.author |
Rao B, Ashwath |
dc.contributor.author |
Rubakov, Sergei |
dc.contributor.author |
Samar, Husain |
dc.contributor.author |
Sharma, Dipti Misra |
dc.contributor.author |
Šnajder, Jan |
dc.contributor.author |
Šojat, Krešimir |
dc.contributor.author |
Štefanec, Vanja |
dc.contributor.author |
Talamo, Luigi |
dc.contributor.author |
Tribout, Delphine |
dc.contributor.author |
Vodolazsky, Daniil |
dc.contributor.author |
Vydrin, Arseniy |
dc.contributor.author |
Zakirova, Aigul |
dc.contributor.author |
Zeller, Britta |
dc.date.accessioned |
2022-01-24T15:25:57Z |
dc.date.available |
2022-01-24T15:25:57Z |
dc.date.issued |
2022-01-17 |
dc.identifier.uri |
http://hdl.handle.net/11234/1-4629 |
dc.description |
Universal Segmentations (UniSegments) is a collection of lexical resources capturing morphological segmentations harmonised into a cross-linguistically consistent annotation scheme for many languages. The annotation scheme consists of simple tab-separated columns that store a word and its morphological segmentation, together with information about the word and the segmented units, e.g., part-of-speech categories, types of morphs/morphemes, etc. The current public version of the collection contains 38 harmonised segmentation datasets covering 30 different languages. |
dc.language.iso |
ces |
dc.language.iso |
cat |
dc.language.iso |
deu |
dc.language.iso |
eng |
dc.language.iso |
fas |
dc.language.iso |
fin |
dc.language.iso |
fra |
dc.language.iso |
hbs |
dc.language.iso |
hrv |
dc.language.iso |
hun |
dc.language.iso |
ita |
dc.language.iso |
kpv |
dc.language.iso |
lat |
dc.language.iso |
mdf |
dc.language.iso |
chm |
dc.language.iso |
mon |
dc.language.iso |
myv |
dc.language.iso |
pol |
dc.language.iso |
por |
dc.language.iso |
rus |
dc.language.iso |
spa |
dc.language.iso |
swe |
dc.language.iso |
tgk |
dc.language.iso |
udm |
dc.language.iso |
hye |
dc.language.iso |
ben |
dc.language.iso |
hin |
dc.language.iso |
mal |
dc.language.iso |
mar |
dc.language.iso |
kan |
dc.publisher |
Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation.isreferencedby |
https://ufal.mff.cuni.cz/techrep/tr69.pdf |
dc.rights |
Universal Segmentations 1.0 License Terms |
dc.rights.uri |
https://lindat.mff.cuni.cz/repository/xmlui/page/licence-unisegs-1.0 |
dc.source.uri |
https://ufal.mff.cuni.cz/universal-segmentations |
dc.subject |
universal segmentations |
dc.subject |
morphological segmentation |
dc.subject |
word segmentation |
dc.subject |
segmentation |
dc.subject |
morphology |
dc.subject |
morphemes |
dc.subject |
morphological dictionary |
dc.subject |
unisegments |
dc.subject |
morph |
dc.subject |
multilingual |
dc.title |
Universal Segmentations 1.0 (UniSegments 1.0) |
dc.type |
lexicalConceptualResource |
metashare.ResourceInfo#ContentInfo.mediaType |
text |
metashare.ResourceInfo#ContentInfo.detailedType |
lexicon |
dc.rights.label |
PUB |
has.files |
yes |
branding |
LINDAT / CLARIAH-CZ |
contact.person |
Jonáš Vidra vidra@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
contact.person |
Zdeněk Žabokrtský zabokrtsky@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
sponsor |
Grantová agentura České Republiky 19-14534S Popis slovotvorné struktury českých slov na základě jazykových dat nationalFunds |
sponsor |
Charles University START/HUM/010 A data-based approach to competition in word-formation: selected semantic categories across seven languages nationalFunds |
sponsor |
Univerzita Karlova (mimo GAUK) SVV 260 453 Specifický vysokoškolský výzkum nationalFunds |
sponsor |
Ministerstvo školství, mládeže a tělovýchovy České republiky LM2015071 LINDAT/CLARIN: Institut pro analýzu, zpracování a distribuci lingvistických dat nationalFunds |
sponsor |
Ministerstvo školství, mládeže a tělovýchovy České republiky LM2018101 LINDAT/CLARIAH-CZ: Digitální výzkumná infrastruktura pro jazykové technologie, umění a humanitní vědy nationalFunds |
size.info |
38 files |
files.size |
136889577 |
files.count |
1 |