Search Results
22. EvaLatin 2020 models for UDPipe 2 (2020-08-31)
- Creator:
- Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- POS tagger, lemmatization, and tagger
- Language:
- Latin
- Description:
- POS tagger and lemmatizer models for the EvaLatin 2020 data (https://github.com/CIRCSE/LT4HALA). See https://ufal.mff.cuni.cz/udpipe/2/models#evalatin20_models for the model documentation, including performance figures. To use these models, you need UDPipe 2.0 or later (available from https://ufal.mff.cuni.cz/udpipe/2).
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
23. HamleDT 2.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, Stanford dependencies, Prague dependencies, harmonization, common annotation style, and Interset
- Language:
- Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, Ancient Greek (to 1453), Hindi, Hungarian, Italian, Japanese, Latin, Dutch, Portuguese, Romanian, Russian, Slovak, Slovenian, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT 2.0 is a collection of 30 existing treebanks harmonized into a common annotation style, the Prague Dependencies, and further transformed into Stanford Dependencies, a treebank annotation style that has recently become popular. We use the newest basic Universal Stanford Dependencies, without added language-specific subtypes.
- Rights:
- HamleDT 2.0 Licence Agreement, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-2.0, and ACA
24. HamleDT 3.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University
- Type:
- text and corpus
- Subject:
- annotated corpus, morphology, syntax, dependency, treebank, harmonized annotation, and common annotation style
- Language:
- Arabic, Basque, Bengali, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Ancient Greek (to 1453), Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT (HArmonized Multi-LanguagE Dependency Treebank) is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. This version uses Universal Dependencies as the common annotation style. Update (November 2017): for a current collection of harmonized dependency treebanks, we recommend using the Universal Dependencies (UD). All of the corpora that are distributed in HamleDT in full are also part of the UD project; only some corpora from the Patch group (where HamleDT provides only the harmonizing scripts but not the full corpus data) are available in HamleDT but not in UD.
- Rights:
- HamleDT 3.0 License Terms, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-3.0, and PUB
25. Kant-Korpus (Daten des Projekts Bereitstellung und Pflege von Immanuel Kants Werken in elektronischer Form)
- Publisher:
- Korpora.org and Fakultät Geisteswissenschaften, Universität Duisburg-Essen
- Type:
- corpus
- Subject:
- Germanistik
- Language:
- German and Latin
- Description:
- Philosophical texts of the 18th century: full text of the authoritative "Akademie-Ausgabe" (excluding most footnotes and editorial notes) and reference texts such as A.G. Baumgarten's "Metaphysica".
- Rights:
- Not specified
26. LatinISE corpus
- Creator:
- McGillivray, Barbara
- Publisher:
- Lexical Computing
- Type:
- text and corpus
- Subject:
- latin corpus
- Language:
- Latin
- Description:
- The LatinISE corpus is a text corpus collected from the LacusCurtius, Intratext and Musisque Deoque websites. Corpus texts have rich metadata containing information such as genre, title, century, or specific date. This Latin corpus was built by Barbara McGillivray.
- Rights:
- Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
27. LatinISE corpus (version 4)
- Creator:
- McGillivray, Barbara
- Publisher:
- Lexical Computing
- Type:
- text and corpus
- Subject:
- latin corpus
- Language:
- Latin
- Description:
- The LatinISE corpus is a text corpus collected from the LacusCurtius, Intratext and Musisque Deoque websites. Corpus texts have rich metadata containing information such as genre, title, century, or specific date. This Latin corpus was built by Barbara McGillivray. In version 4 of the corpus, the high-frequency lemmas have been manually corrected and sentence boundaries have been added.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
28. Mannheimer Texte Online (MATEO)
- Publisher:
- Universität Mannheim
- Type:
- corpus
- Subject:
- Germanistik
- Language:
- German and Latin
- Description:
- As a sub-section of MATEO, MARABU (Mannheimer Reihe Altes Buch) includes illustrated books, manuscripts and rarissima, sources on the history of the Electoral Palatinate, and contributions on women of Humanism.
- Rights:
- Not specified
29. Medieval Charter Sections Corpus
- Creator:
- Galuščáková, Petra and Neužilová, Lucie
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- section detection, segmentation, and information retrieval
- Language:
- Latin and Czech
- Description:
- This package provides an evaluation framework and training and test data for semi-automatic recognition of sections of historical diplomatic manuscripts. The data collection consists of 57 Latin charters of 7 different types issued by the Royal Chancellery. The documents were created in the era of John the Blind, King of Bohemia (1310–1346) and Count of Luxembourg. The manuscripts were digitized and transcribed, and typical sections of medieval charters ('corroboratio', 'datatio', 'dispositio', 'inscriptio', 'intitulatio', 'narratio', and 'publicatio') were manually tagged. The manuscripts also contain additional metadata, such as manually marked named entities and short Czech abstracts. Recognition models are first trained on the manually marked sections in the training documents; the trained model can then be used to recognize the sections in the test data. The parsing script supports methods based on cosine distance, TF-IDF weighting, and an adapted Viterbi algorithm.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
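The train-then-recognize workflow described for the Medieval Charter Sections Corpus (entry 29) can be illustrated with a minimal sketch. This is not the package's parsing script: the function names, the naive whitespace tokenizer, the smoothed IDF formula, and the toy training phrases are all assumptions made for illustration. It builds one TF-IDF vector per section label and assigns a new text fragment to the label with the highest cosine similarity (equivalently, the lowest cosine distance). The adapted Viterbi step mentioned in the description is not covered.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (section label, text fragment) pairs.
TOY_TRAINING = [
    ("intitulatio", "nos johannes dei gracia boemie rex"),
    ("datatio", "datum prage anno domini millesimo"),
    ("corroboratio", "in cuius rei testimonium sigillum nostrum"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(labeled_segments):
    """Build one TF-IDF vector per section label from (label, text) pairs."""
    term_counts = defaultdict(Counter)
    for label, text in labeled_segments:
        term_counts[label].update(tokenize(text))

    # Document frequency: in how many section labels does each term occur?
    n_labels = len(term_counts)
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    # Smoothed IDF, so terms shared by every label still keep a small weight.
    idf = {t: math.log((1 + n_labels) / (1 + df)) + 1.0
           for t, df in doc_freq.items()}
    vectors = {
        label: {t: c * idf[t] for t, c in counts.items()}
        for label, counts in term_counts.items()
    }
    return vectors, idf


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, vectors, idf):
    """Return the section label whose vector is most similar to the text."""
    counts = Counter(tokenize(text))
    # Terms never seen in training have no IDF weight and are ignored.
    query = {t: c * idf[t] for t, c in counts.items() if t in idf}
    return max(vectors, key=lambda label: cosine_similarity(query, vectors[label]))


vectors, idf = train(TOY_TRAINING)
print(classify("datum brune anno domini", vectors, idf))  # prints: datatio
```

Classifying each fragment independently, as here, ignores the typical order of sections within a charter; a sequence model such as the Viterbi-based method named in the description is one way to take that order into account.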
30. On-line Dictionary of Medieval Latin in the Czech Lands
- Creator:
- Ctibor, Jan and Nývlt, Pavel
- Publisher:
- Institute of Philosophy of the Czech Academy of Sciences
- Type:
- text, lexicon, and lexicalConceptualResource
- Subject:
- dictionary, latin, Medieval, digital humanities, lexicography, and Medieval Latin
- Language:
- Latin and Czech
- Description:
- The Dictionary of Medieval Latin in the Czech Lands registers and explains the vocabulary of Medieval Latin as used in the Czech lands from the beginnings of Latin writing in this area (about 1000 CE) to 1500 CE, so far covering the letters A-M. For more information about the Dictionary, see the webpage of the Department of Medieval Lexicography of the Institute of Philosophy of the Czech Academy of Sciences. The uploaded data comprise the on-line version of the dictionary (API and XML data), making it possible to run the application on a localhost.
- Rights:
- Dictionary of Medieval Latin in the Czech Lands - digital version 2.2 License Agreement, https://lindat.mff.cuni.cz/repository/xmlui/page/license-lb, and ACA