Harvested from: LINDAT/CLARIAH-CZ repository - LINDAT/CLARIAH-CZ Catalog Search Results

821. GECCC Grammar Error Correction Corpus for Czech (2022-09-28)

Creator:: Náplava, Jakub, Straka, Milan, Straková, Jana, and Rosen, Alexandr
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: gec, grammatical error correction, and dataset
Language:: Czech
Description:: Grammar Error Correction Corpus for Czech (GECCC) consists of 83 058 sentences and covers four diverse domains: essays written by native students, informal website texts, essays written by Romani ethnic minority children and teenagers, and essays written by nonnative speakers. All domains are professionally annotated for GEC errors in a unified manner, and errors were automatically categorized with a Czech-specific version of ERRANT (released at https://github.com/ufal/errant_czech). The dataset was introduced in the paper "Czech Grammar Error Correction with a Large and Diverse Corpus", which was accepted to TACL. Until it is published in TACL, see the arXiv version (https://arxiv.org/pdf/2201.05590.pdf). This version fixes double annotation errors in the train and dev M2 files and also contains more metadata information.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), PUB, and http://creativecommons.org/licenses/by-sa/4.0/

822. Gender-fair language on the websites of German, Austrian, Swiss and South Tyrolean cities

Creator:: Müller-Spitzer, Carolin and Ochs, Samira
Publisher:: IDS Mannheim
Type:: text and corpus
Subject:: gender-fair language, websites, personal designations, gender-inclusive language, and gender linguistics
Language:: German
Description:: Annotated dataset consisting of personal designations found on websites of 42 German, Austrian, Swiss and South Tyrolean cities. Our goal is to re-evaluate the websites every year in order to see how the use of gender-fair language develops over time. The dataset contains coordinates for the creation of map material.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

823. General Syrový Addresses Citizens

Creator:: Aktualita
Publisher:: Národní filmový archiv
Type:: video and clip
Subject:: speech by Jan Syrový (projev Syrový Jan), Munich Agreement (Mnichovská dohoda), and People::Syrový Jan (1888-1970)
Language:: Czech
Description:: The segment of Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) from late September 1938 captures the recording of a radio speech given by General Jan Syrový to accept his appointment to the office of Prime Minister on 22 September 1938, in which he responds to the national demonstration for the unity of Czechoslovakia held in front of the Parliament building in Prague. He urges the demonstrators, as well as all citizens, to remain calm and sensible and to return to work.
Rights:: http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

824. Generator of Czech lyrics according to structure

Creator:: Štěpánková, Barbora
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: Song lyrics generation
Language:: Czech
Description:: Two fine-tuned models, Czech TinyLlama (https://huggingface.co/BUT-FIT/CSTinyLlama-1.2B) and Czech GPT2 small (https://huggingface.co/lchaloupsky/czech-gpt2-oscar), that generate lyrics of song sections based on the provided syllable counts, keywords and rhyme scheme. The TinyLlama-based model yields better results; however, the GPT2-based model can run locally. Both models are discussed in the bachelor thesis "Generation of Czech Lyrics to Cover Songs".
Rights:: The MIT License (MIT), http://opensource.org/licenses/mit-license.php, and PUB

825. GerManC : A representative historical corpus of German 1650-1800

Type:: corpus
Language:: German
Description:: The ultimate aim of the project is to compile a representative historical corpus of written German for the years 1650-1800. The complete GerManC corpus will contain 2000-word samples from nine genres.
Rights:: Not specified

826. Gesprächanalytisches Informationssystem (GAIS)

Publisher:: Institut für Deutsche Sprache
Type:: toolService
Language:: German
Description:: Web-based information system on the scientific community (news, events, persons, job market, mailing list, database on research projects and corpora, bibliography, glossary and links) and on recording equipment/software. Disciplinary scope: research on conversation and discourse analysis and spoken language.
Rights:: Not specified

830. Glossa corpus search system

Creator:: Nøklestad, Anders
Publisher:: Department of Linguistics and Nordic Studies, University of Oslo
Type:: toolService
Description:: Glossa is a web-based system for corpus search and results management. It comes with built-in support for CLARIN federated content search as well as corpora encoded with the IMS Corpus Workbench. It also has a plugin architecture that enables other search engines to be used once a wrapper has been created. Glossa can be freely downloaded and installed on the user's server. It currently supports only monolingual written corpora, but support for multilingual corpora is under development, as is support for spoken corpora with audio, video and maps.
Rights:: Not specified

827. Gestor de diccionaris

828. Géza Včelička, true name Antonín Eduard Včelička (writer)

829. Giuseppe Dalla Torre on Czechoslovakia
