Harvested from: LINDAT/CLARIAH-CZ repository - LINDAT/CLARIAH-CZ Catalog Search Results

254. Damen Conversations Lexikon

Type:: lexicalConceptualResource
Subject:: Germanistik
Language:: German
Description:: Newly typeset text and facsimile of the ten-volume edition (Leipzig, 1834-1838); word-exact page concordance with the printed edition; coverage of the subject areas of social conversation (aimed specifically at a female audience)
Rights:: Not specified

255. DaMuEL 1.0: A Large Multilingual Dataset for Entity Linking

Creator:: Kubeša, David and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: entity linking, NEL, NER, dataset, and knowledge base
Language:: Afrikaans, Arabic, Armenian, Basque, Belarusian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Korean, Latin, Latvian, Lithuanian, Maltese, Marathi, Modern Greek (1453-), Northern Sami, Norwegian Nynorsk, Persian, Polish, Portuguese, Romanian, Russian, Scottish Gaelic, Serbian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, Uighur, Ukrainian, Urdu, Vietnamese, and Wolof
Description:: We present DaMuEL, a large Multilingual Dataset for Entity Linking containing data in 53 languages. DaMuEL consists of two components: a knowledge base that contains language-agnostic information about entities, including their claims from Wikidata and named entity types (PER, ORG, LOC, EVENT, BRAND, WORK_OF_ART, MANUFACTURED); and Wikipedia texts with entity mentions linked to the knowledge base, along with language-specific text from Wikidata such as labels, aliases, and descriptions, stored separately for each language. The Wikidata QID is used as a persistent, language-agnostic identifier, enabling the combination of the knowledge base with language-specific texts and information for each entity. Wikipedia documents deliberately annotate only a single mention for every entity present; we further automatically detect all mentions of named entities linked from each document. The dataset contains 27.9M named entities in the knowledge base and 12.3G tokens from Wikipedia texts. The dataset is published under the CC BY-SA licence.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

256. Danish Fungi 2020

Creator:: Picek, Lukáš, Šulc, Milan, Matas, Jiří, Jeppesen, Thomas S., Heilmann-Clausen, Jacob, Læssøe, Thomas, and Frøslev, Tobias
Publisher:: IEEE/CVF
Type:: IMAGE and corpus
Subject:: Fungi, image processing, Classification, and Fine-grained
Description:: Danish Fungi 2020 (DF20) is a fine-grained dataset and benchmark. The dataset, constructed from observations submitted to the Danish Fungal Atlas, is unique in its taxonomy-accurate class labels, small number of errors, highly unbalanced long-tailed class distribution, rich observation metadata, and well-defined class hierarchy. DF20 has zero overlap with ImageNet, allowing unbiased comparison of models fine-tuned from publicly available ImageNet checkpoints. The dataset has 1,604 different classes, with 248,466 training images and 27,608 test images.
Rights:: GNU Library or "Lesser" General Public License 3.0 (LGPL-3.0), http://opensource.org/licenses/LGPL-3.0, and PUB

257. DanNet

Type:: lexicalConceptualResource
Language:: Danish
Rights:: Not specified

258. Das virtuelle Preußische Urkundenbuch

Publisher:: Universität Hamburg
Type:: corpus
Subject:: Germanistik
Language:: German
Description:: Register of decrees as well as texts on the history of Prussia and the Teutonic Order
Rights:: Not specified

259. Database of Bavarian Dialects (BayDat)

Creator:: Zimmermann, Ralf, Raaf, Manuel, König, Werner, Eichinger, Ludwig M., Eroms, Hans-Werner, Wolf, Norbert Richard, Munske, Horst Haider, and Hinderling, Robert
Publisher:: Bayerische Akademie der Wissenschaften
Type:: text and corpus
Subject:: Bavarian, Swabian, Germanistik, Dialektologie, dialect variation, dialectology, Bairisch, Fränkisch, Schwäbisch, Bayern, Sprachatlas von Unterfranken, Sprachatlas von Mittelfranken, Sprachatlas von Bayerisch-Schwaben, Sprachatlas von Oberbayern, Bayerischer Sprachatlas, BSA, Sprachatlas von Nordostbayern, and Sprachatlas von Niederbayern
Language:: Bavarian, Swabian, Frankish, and German
Description:: The database contains about 5 million items of dialectal linguistic evidence collected in different projects within the Free State of Bavaria, covering the Bavarian, Frankish, and Swabian dialects. In 1984, linguists at the University of Augsburg began to collect dialect data for the research and documentation project "Linguistic Map of Swabia" (German: "Sprachatlas von Bayerisch-Schwaben (SBS)"). In 1986, the University of Bayreuth followed with preparations for the "Linguistic Map of North- and East-Bavaria" (German: "Sprachatlas von Nordostbayern (SNOB)"). In the following years, partner projects in the other regions also started to collect data in their respective regions. All six language projects then formed the "Research Association of the Bavarian Linguistic Map" (German: "Bayerischer Sprachatlas (BSA)"), which was funded by the DFG and the Bavarian State Ministry of Science, Research and the Arts. The first digital publication of BayDat by Ralf Zimmermann in 2007 at the University of Würzburg (see linked paper) was re-designed in 2019 by Manuel Raaf at the Bavarian Academy of Sciences and Humanities. For detailed information, please see https://baydat.badw.de/info
Rights:: Not specified

260. Database of Estonian Multi-word Verbs

Type:: lexicalConceptualResource
Language:: Estonian
Description:: 17 500 entries
Rights:: Not specified
