dc.contributor.author | Vysušilová, Petra |
dc.contributor.author | Straka, Milan |
dc.date.accessioned | 2021-11-18T15:58:05Z |
dc.date.available | 2021-11-18T15:58:05Z |
dc.date.issued | 2021 |
dc.identifier.uri | http://hdl.handle.net/11234/1-4613 |
dc.description | Model for Czech POS tagging and lemmatization built on RobeCzech, a Czech version of the BERT model. The model was trained on data from the Prague Dependency Treebank 3.5. It is part of the master's thesis Czech NLP with Contextualized Embeddings and achieved state-of-the-art performance as of the thesis submission date. A demo Jupyter notebook is available in the project's GitHub repository. |
dc.language.iso | ces |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.relation.isreferencedby | https://dspace.cuni.cz/handle/20.500.11956/147648 |
dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ |
dc.subject | BERT |
dc.subject | PoS tagging |
dc.subject | lemmatization |
dc.title | POS Tagging and Lemmatization (Czech model) |
dc.type | languageDescription |
metashare.ResourceInfo#ContentInfo.mediaType | text |
metashare.ResourceInfo#ContentInfo.detailedType | mlmodel |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
demo.uri | https://github.com/flower-go/DiplomaThesis |
contact.person | Petra Vysušilová vysusilova@ktiml.mff.cuni.cz Charles University, Faculty of Mathematics and Physics |
files.size | 1823072785 |
files.count | 4 |
Files in this item
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Name | Size | Format | Description | MD5 |
ch18.index | 16.79 KB | Unknown | TensorFlow checkpoint data index | 99a09ee9ba3531fdba323db57a4554c8 |
mappings.pickle | 40.81 MB | Unknown | Mappings | 363f9a3b8d82610fcb99773c2eb5e856 |
ch18.data-00000-of-00001 | 847.49 MB | Unknown | TensorFlow checkpoint data | 273672b0bb2f180a6ad6e223f696d58d |
forms.vectors-w5-d300-ns5.16b.npz | 850.3 MB | Unknown | Pretrained embeddings needed for the model construction | 1691478ca44620a734dff58c8bd6b7fd |
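Since the listing above gives an MD5 digest for each of the four files, a download can be checked before the model is loaded. The following is a minimal sketch using only the Python standard library. The file names and digests are copied from the listing. The helper names `md5_of` and `verify_downloads`, and the assumption that all four files sit in one local directory, are illustrative and not part of the record.

```python
import hashlib
from pathlib import Path

# File names and MD5 digests copied from the file listing of this item.
EXPECTED_MD5 = {
    "ch18.index": "99a09ee9ba3531fdba323db57a4554c8",
    "mappings.pickle": "363f9a3b8d82610fcb99773c2eb5e856",
    "ch18.data-00000-of-00001": "273672b0bb2f180a6ad6e223f696d58d",
    "forms.vectors-w5-d300-ns5.16b.npz": "1691478ca44620a734dff58c8bd6b7fd",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory):
    """Map each expected file name to True if it exists and its MD5 matches.

    `directory` is a hypothetical local folder holding the downloaded files.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results
```

Calling `verify_downloads("model")` on a folder named `model` (an assumed location) returns a dictionary such as `{"ch18.index": True, ...}`. A `False` entry means the file is missing or its digest differs from the one listed. Reading in chunks matters here because two of the files are over 800 MB.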