Search Results
92. HamleDT 2.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- treebank, Stanford dependencies, Prague dependencies, harmonization, common annotation style, and Interset
- Language:
- Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, Ancient Greek (to 1453), Hindi, Hungarian, Italian, Japanese, Latin, Dutch, Portuguese, Romanian, Russian, Slovak, Slovenian, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT 2.0 is a collection of 30 existing treebanks harmonized into a common annotation style, the Prague Dependencies, and further transformed into Stanford Dependencies, a treebank annotation style that became popular recently. We use the newest basic Universal Stanford Dependencies, without added language-specific subtypes.
- Rights:
- HamleDT 2.0 Licence Agreement, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-2.0, and ACA
93. HamleDT 3.0
- Creator:
- Zeman, Daniel, Mareček, David, Mašek, Jan, Popel, Martin, Ramasamy, Loganathan, Rosa, Rudolf, Štěpánek, Jan, and Žabokrtský, Zdeněk
- Publisher:
- Charles University
- Type:
- text and corpus
- Subject:
- annotated corpus, morphology, syntax, dependency, treebank, harmonized annotation, and common annotation style
- Language:
- Arabic, Basque, Bengali, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Ancient Greek (to 1453), Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Latin, Persian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, and Turkish
- Description:
- HamleDT (HArmonized Multi-LanguagE Dependency Treebank) is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. This version uses Universal Dependencies as the common annotation style. Update (November 2017): for a current collection of harmonized dependency treebanks, we recommend using Universal Dependencies (UD). All of the corpora that are distributed in HamleDT in full are also part of the UD project; only some corpora from the Patch group (where HamleDT provides only the harmonizing scripts but not the full corpus data) are available in HamleDT but not in UD.
- Rights:
- HamleDT 3.0 License Terms, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-hamledt-3.0, and PUB
94. Hana Vítová (actress)
- Creator:
- Aktualita and Veselý, Bohumil
- Publisher:
- Národní filmový archiv
- Type:
- video and clip
- Subject:
- film Valentin Dobrotivý ukázka, Galerie osobností, People::Vítová Hana (1914-1987), People::Nový Oldřich (1899-1983), and Český zvukový týdeník Aktualita::1942/49
- Language:
- German and Czech
- Description:
- Actress Hana Vítová in an unidentified German film (sound). Vítová with actor Oldřich Nový in Valentin Dobrotivý (Valentin the Good, dir. Martin Frič, 1942). Vítová with her husband, critic Bedřich Rádl, in a segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1942, issue no. 49.
- Rights:
- http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)
95. Herders Conversations-Lexikon
- Type:
- lexicalConceptualResource
- Subject:
- Germanistik
- Language:
- German
- Description:
- 1st edition, 1854-1857; an interdisciplinary presentation of the subject areas of social conversation
- Rights:
- Not specified
96. HetWiK: Heterogene Widerstandskulturen
- Creator:
- Schuster, Britt-Marie, Markewitz, Friedrich, Wilk, Nicole M., Schröder, Sarah, and Rüdiger, Jan Oliver
- Publisher:
- Universität Paderborn
- Type:
- text and corpus
- Subject:
- corpus, annotated corpus, Widerstand, Widerstandskorpus, Jüngere Sprachgeschichte, Kommunikationsgeschichte, Nationalsozialismus, Sprachliche Praktiken, Soziale Identität, Beziehungskonstitution, Faktizitätsherstellung, Argumentieren, Direktiva, resistance, resistance corpus, recent language history, communication history, National Socialism, linguistic practices, social identity, relationship formation, creating facticity, argumentation, directive speech acts, and speech act
- Language:
- German
- Description:
- The representative, full-text digitized HetWiK corpus comprises 140 manually annotated texts of the German Resistance from 1933 to 1945. It includes both well-known and relatively unknown documents: public writings such as pamphlets and memoranda, as well as private texts such as letters, journal or prison entries, and biographies. The corpus thus represents the diverse groups and the heterogeneity of verbal resistance, and allows resistance to be studied in relation to language use. The HetWiK corpus can be used free of charge. A detailed register of the individual texts and further information about the tagset can be found on the project homepage (in German). In addition to the CATMA5 XML format, we provide a standoff JSON format and CEC6 files (CorpusExplorer), so the HetWiK corpus can be exported in different formats.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
97. HWC2023 – Hamburg.de Website Corpus 2023
- Creator:
- Rüdiger, Jan Oliver
- Publisher:
- Leibniz-Institut für Deutsche Sprache
- Type:
- text and corpus
- Subject:
- corpus, Web corpus, web corpora, Germanistik, German, websites, crawling corpus, and CorpusExplorer
- Language:
- German
- Description:
- A petition for a referendum (titled "Schluss mit Gendersprache in Verwaltung und Bildung", in English: "abolition of gender language in administration and education") was launched in Hamburg in February 2023. The project "Empirical Gender Linguistics" at the Leibniz Institute for the German Language took this as an opportunity to scrape the entire "https://www.hamburg.de" website (except the list of ships in the Port of Hamburg and the yellow pages). The Hamburg.de website is the central digital contact point for citizens. The scraped texts were cleaned, processed and annotated using http://www.CorpusExplorer.de (TreeTagger for POS and lemma information). We use the corpus to analyze the use of words with gender signs.
- Rights:
- Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0), PUB, and http://creativecommons.org/licenses/by-nc-sa/3.0/
98. JRC-Acquis
- Publisher:
- Joint Research Centre of the EU
- Type:
- corpus
- Language:
- Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Hungarian, Italian, Latvian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, and Swedish
- Description:
- A large parallel corpus containing EU law, the Acquis Communautaire, in 22 languages.
- Rights:
- Not specified
99. Juilland-D-Korpus
- Publisher:
- Berlin-Brandenburg Academy of Sciences and Humanities
- Format:
- application/tei+xml
- Type:
- corpus
- Language:
- German
- Description:
- Written German from 1920-39. 500,000 tokens, 392 texts. Annotated with POS and lemma; TEI XML. Part of Das Digitale Wörterbuch der deutschen Sprache des 20. Jahrhunderts.
- Rights:
- Not specified
100. Kali-Korpus
- Publisher:
- Leibniz Universität Hannover
- Type:
- corpus
- Subject:
- Germanistik
- Language:
- German
- Description:
- Diachronic corpus with a focus on the annotation and lemmatization of verbal categories
- Rights:
- Not specified