Language: Czech / Rights: PUB - LINDAT/CLARIAH-CZ Catalog Search Results

371. Otakar Švec (sculptor)

Creator:: Krátký film and Veselý, Bohumil
Publisher:: Národní filmový archiv
Type:: video and clip
Subject:: ateliér sochařský, sochaři při práci, model pomníku, pomník Stalin Josif Vissarionovič model, Galerie osobností, Places::Praha::Dejvice::ateliér Bohumila Kafky, People::Švec Otakar (1892-1955), People::Košík Arnošt (1920-1990), and Československý filmový týdeník 1953/22
Language:: Czech
Description:: Sculptor Otakar Švec with his colleagues František Žemlička and Arnošt Košík working on a model for the Stalin Monument in a segment from Československý filmový týdeník (Czechoslovak Film Weekly Newsreel) 1953, issue no. 22.
Rights:: http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

372. Otakar Vávra (director)

Creator:: Aktualita and Veselý, Bohumil
Publisher:: Národní filmový archiv
Type:: video and clip
Subject:: ateliér filmový, režisér filmový, kameraman filmový, film Humoreska natáčení, Galerie osobností, Places::Praha::Nové Město::Školská::pavlač domu, Places::Praha::Barrandov::filmové ateliéry /int./, People::Vávra Otakar (1911-2011), People::Roth Jan (1899-1972), and Český zvukový týdeník Aktualita::1939/41B
Language:: Czech
Description:: Director Otakar Vávra with cinematographer Jan Roth during the shooting of Humoreska (Humoresque, dir. Otakar Vávra, 1939) in a segment from Československý zvukový týdeník Aktualita (Czechoslovak Aktualita Sound Newsreel) 1939, issue no. 41B.
Rights:: http://creativecommons.org/licenses/by-nc-nd/4.0/, PUB, and Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)

373. OVM – Otázky Václava Moravce

Creator:: Šmídl, Luboš and Pražák, Aleš
Publisher:: University of West Bohemia, Department of Cybernetics
Type:: audio and corpus
Subject:: speech corpus, acoustic model, speaker identification, and speaker verification
Language:: Czech
Description:: The corpus consists of transcribed recordings from the Czech political discussion broadcast “Otázky Václava Moravce”. It contains 35 hours of speech and corresponding word-by-word transcriptions, including transcriptions of some non-speech events. Speakers’ names are also assigned to the corresponding segments. The resulting corpus is suitable both for training acoustic models for ASR and for training speaker identification and/or verification systems. The archive contains 16 sound files (WAV PCM, 16-bit, 48 kHz, mono) and transcriptions in the XML-based standard Transcriber format (http://trans.sourceforge.net).
Rights:: Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0), http://creativecommons.org/licenses/by-nc/3.0/, and PUB
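
The record above says that the transcriptions use the XML-based Transcriber format and that speakers' names are assigned to segments. The following is a minimal sketch of how such a file might be read with the Python standard library. The embedded sample document, the speaker names, and the helper `read_turns` are assumptions made for illustration and are not taken from the corpus; real files may also list several speaker ids in one turn (overlapping speech), which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample in the general shape of a Transcriber file.
# It is NOT an excerpt from the OVM corpus.
SAMPLE = """<Trans>
  <Speakers>
    <Speaker id="spk1" name="Speaker One"/>
    <Speaker id="spk2" name="Speaker Two"/>
  </Speakers>
  <Episode>
    <Section type="report" startTime="0" endTime="7.5">
      <Turn speaker="spk1" startTime="0" endTime="3.2">
        <Sync time="0"/>
        dobrý den
      </Turn>
      <Turn speaker="spk2" startTime="3.2" endTime="7.5">
        <Sync time="3.2"/>
        dobrý den
        <Sync time="5.0"/>
        děkuji za pozvání
      </Turn>
    </Section>
  </Episode>
</Trans>"""


def read_turns(xml_text):
    """Return (speaker name, start, end, text) for every <Turn> element."""
    root = ET.fromstring(xml_text)
    # Map speaker ids to the names declared in the <Speakers> block.
    names = {s.get("id"): s.get("name") for s in root.iter("Speaker")}
    turns = []
    for turn in root.iter("Turn"):
        # Text sits between <Sync/> markers; join it and normalise whitespace.
        text = " ".join("".join(turn.itertext()).split())
        speaker_id = turn.get("speaker")
        speaker = names.get(speaker_id, speaker_id)
        turns.append((speaker, float(turn.get("startTime")),
                      float(turn.get("endTime")), text))
    return turns


for speaker, start, end, text in read_turns(SAMPLE):
    print(f"{speaker} [{start:.1f}-{end:.1f}]: {text}")
```

Grouping the resulting tuples by speaker name would give the per-speaker segments that speaker identification or verification training needs.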

374. Package of word embeddings of Czech from a large corpus

Creator:: Kyjánek, Lukáš and Bonami, Olivier
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, computationalLexicon, and lexicalConceptualResource
Subject:: word embeddings, word vectors, large corpus, word2vec, skipgram, and cbow
Language:: Czech
Description:: This package comprises eight models of Czech word embeddings trained by applying word2vec (Mikolov et al. 2013) to the currently most extensive corpus of Czech, namely SYN v9 (Křen et al. 2022). The minimum frequency threshold for including a word in the model was 10 occurrences in the corpus. The original lemmatisation and tagging included in the corpus were used for disambiguation. In the word-form models, each unit consists of a word form and its tag from a positional tagset (cf. https://wiki.korpus.cz/doku.php/en:pojmy:tag) separated by '>', e.g., kočka>NNFS1-----A----. The published package provides models trained on both tokens and lemmas. In addition, the models combine two training algorithms (CBOW and Skipgram) with two vector dimensions (100 or 500), which gives the eight models in total, while the training window and negative sampling remained the same during the training. The package also includes files with frequencies of word forms (vocab-frequencies.forms) and lemmas (vocab-frequencies.lemmas).
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
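
The record above describes word-form units as a word form and a positional tag joined by '>'. A minimal sketch of splitting such a unit back into its two parts follows. The names `Unit` and `split_unit` are hypothetical helpers, not part of the package, and the handling of units without a tag (as might occur in the lemma models) is an assumption.

```python
from typing import NamedTuple, Optional


class Unit(NamedTuple):
    form: str
    tag: Optional[str]


def split_unit(unit: str, sep: str = ">") -> Unit:
    """Split 'form>TAG' into its parts.

    rpartition splits on the LAST separator, so a form that itself
    contains '>' (for example the punctuation token '>') stays intact.
    """
    form, found, tag = unit.rpartition(sep)
    if not found:
        # No separator at all: treat the whole string as an untagged unit.
        return Unit(unit, None)
    return Unit(form, tag)


unit = split_unit("kočka>NNFS1-----A----")
print(unit.form, unit.tag)
```

Splitting from the right is the design choice that matters here: the tag never contains '>' in the example given, whereas a word form might.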

375. ParaCrawl Corpus version 1.0

Creator:: Koehn, Philipp, Heafield, Kenneth, Forcada, Mikel L., Esplà-Gomis, Miquel, Ortiz-Rojas, Sergio, Sánchez, Gema Ramírez, Cartagena, Víctor M. Sánchez, Haddow, Barry, Bañón, Marta, Střelec, Marek, Samiotou, Anna, and Kamran, Amir
Publisher:: ParaCrawl
Type:: text and corpus
Subject:: ParaCrawl, parallel corpus, CommonCrawl, machine translation, and text corpora
Language:: English, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Czech, Romanian, Finnish, Latvian, Russian, and Estonian
Description:: The January 2018 release of ParaCrawl is the first version of the corpus. It contains parallel corpora for 11 languages paired with English, crawled from a large number of websites. The selection of websites is based on CommonCrawl, but ParaCrawl is extracted from a brand new crawl which has much higher coverage of these selected websites than CommonCrawl. Since the data is fairly raw, it is released with two quality metrics that can be used for corpus filtering. An official "clean" version of each corpus uses one of the metrics. For more details and raw data download, please visit: http://paracrawl.eu/releases.html
Rights:: Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB

376. ParaDi 2.0

Creator:: Barančíková, Petra and Kettnerová, Václava
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, machineReadableDictionary, and lexicalConceptualResource
Subject:: multiword expressions, light verb construction, paraphrases, and idioms
Language:: Czech
Description:: ParaDi 2.0 is a dictionary of single-verb paraphrases of Czech verbal multiword expressions: light verb constructions and idiomatic verb constructions. Moreover, it provides an elaborated set of morphological, syntactic and semantic features, including information on aspectual counterparts of verbs and paraphrasability conditions of given verbs. The format of ParaDi has been designed with respect to both human and machine readability: the dictionary is represented as a plain table in TSV format, as it is a flexible and language-independent data format.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

377. ParaDi 2.0 (2018-01-24)

Creator:: Barančíková, Petra and Kettnerová, Václava
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, machineReadableDictionary, and lexicalConceptualResource
Subject:: multiword expressions, light verb construction, paraphrases, and idioms
Language:: Czech
Description:: ParaDi 2.0 is a dictionary of single-verb paraphrases of Czech verbal multiword expressions: light verb constructions and idiomatic verb constructions. Moreover, it provides an elaborated set of morphological, syntactic and semantic features, including information on aspectual counterparts of verbs and paraphrasability conditions of given verbs. The format of ParaDi has been designed with respect to both human and machine readability: the dictionary is represented as a plain table in TSV format, as it is a flexible and language-independent data format.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

378. ParaDi: Dictionary of Paraphrases of Czech Complex Predicates with Light Verbs

Creator:: Barančíková, Petra and Kettnerová, Václava
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text, machineReadableDictionary, and lexicalConceptualResource
Subject:: light verb construction and paraphrases
Language:: Czech
Description:: Dictionary of single verb paraphrases of Czech light verb constructions.
Rights:: Public Domain Mark (PD), http://creativecommons.org/publicdomain/mark/1.0/, and PUB

379. Parallel Global Voices, Czech-English NER+NEL

Creator:: Nevěřilová, Zuzana and Žižková, Hana
Publisher:: Masaryk University, Brno
Type:: text, other, and lexicalConceptualResource
Subject:: named entity recognition, named entities, named entity, named entity corpus, named entity linking, named entity disambiguation, and wikidata
Language:: English and Czech
Description:: Named entity annotations added to the existing resource Parallel Global Voices (ces-eng language pair). The named entity annotations distinguish four classes: Person, Organization, Location, Misc. The annotation uses the IOB schema (one tag per token, marking the beginning and the inside of a multi-word annotation). The NEL annotation contains Wikidata Qnames.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
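
The record above states that the annotation follows the IOB schema, with one tag per token marking the beginning and inside of a multi-word entity. A minimal sketch of turning such per-token tags into entity spans is shown below. The tag spellings (`B-PER`, `I-LOC`, `O`), the sample sentence, and the function `iob_to_spans` are assumptions for illustration; the actual label names and file layout of the resource may differ.

```python
def iob_to_spans(tokens, tags):
    """Collect (label, start, end, text) spans from per-token IOB tags.

    `end` is exclusive. An I- tag that does not continue the current
    entity is treated leniently as the start of a new one.
    """
    spans = []
    start = label = None
    # A trailing "O" sentinel closes an entity that runs to the last token.
    for i, tag in enumerate(list(tags) + ["O"]):
        prefix, _, tag_label = tag.partition("-")
        continues = prefix == "I" and tag_label == label
        if label is not None and not continues:
            spans.append((label, start, i, " ".join(tokens[start:i])))
            label = None
        if prefix in ("B", "I") and not continues:
            start, label = i, tag_label
    return spans


# Hypothetical example sentence, not taken from the corpus.
tokens = ["Václav", "Havel", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
for span in iob_to_spans(tokens, tags):
    print(span)
```

Because each span carries token offsets, a linked identifier such as a Wikidata Qname could be attached to the span rather than repeated on every token.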

380. ParCzech 3.0

Creator:: Kopp, Matyáš, Stankov, Vladislav, Bojar, Ondřej, Hladká, Barbora, and Straňák, Pavel
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: audio and corpus
Subject:: Parliament of the Czech Republic, Chamber of Deputies, stenographic protocols, TEI encoding, and speech corpus
Language:: Czech
Description:: The ParCzech 3.0 corpus is the third version of ParCzech consisting of stenographic protocols that record the Chamber of Deputies’ meetings held in the 7th term (2013-2017) and the current 8th term (2017-Mar 2021). The protocols are provided in their original HTML format, Parla-CLARIN TEI format, and the format suitable for Automatic Speech Recognition. The corpus is automatically enriched with the morphological, syntactic, and named-entity annotations using the procedures UDPipe 2 and NameTag 2. The audio files are aligned with the texts in the annotated TEI files.
Rights:: Public Domain Dedication (CC Zero), http://creativecommons.org/publicdomain/zero/1.0/, and PUB
