Coverage: Czech Republic - LINDAT/CLARIAH-CZ Catalog Search Results

Publisher:: Charles University
Type:: corpus
Language:: Czech
Description:: The Prague family of annotated corpora has a new member, the Czech Academic Corpus version 2.0 (CAC 2.0). CAC 2.0 consists of 650,000 words from various 1970s and 1980s newspapers, magazines, and radio and television broadcast transcripts, manually annotated for morphology and syntax.
Rights:: LDC Licence and LDC Catalog No.: LDC2008T22

Publisher:: Masaryk University, Brno
Type:: corpus
Language:: Czech and English
Description:: Parallel corpus, 3,297,283 words. The idea was to create a small parallel corpus that would make it possible to work with entire texts in translation analysis rather than short extracts. At the same time, it aimed at acquiring experience that could be used in creating a larger parallel corpus of English and Czech in the future. Although the main part of the work has been completed, and the aims of the KACENKA grant met, we keep improving and enlarging KACENKA gradually. Currently, it contains 3,297,283 words (of which 1,689,513 were acquired by scanning). Most of the English texts for KACENKA were retrieved from Internet resources. The rest, and nearly all the Czech texts, had to be scanned using an OCR programme. KACENKA is stored on a single CD-ROM; its use is limited by copyright restrictions.
Rights:: Not specified

Creator:: Žabokrtský, Zdeněk
Publisher:: Charles University
Type:: toolService
Description:: TectoMT is a highly modular NLP (Natural Language Processing) software system implemented in the Perl programming language under Linux. It is primarily aimed at Machine Translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is also intended to significantly facilitate and accelerate the development of software solutions for many other NLP tasks, especially through the re-usability of its numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces.
Rights:: Not specified

Publisher:: University of West Bohemia, Pilsen and Charles University
Type:: toolService
Description:: The TrEdVoice module is an add-on for the TrEd annotation editor that enables voice control of the editor's functions.
Rights:: Not specified

Publisher:: Germanic Lexicon Project
Type:: toolService
Description:: A dictionary of reconstructed Proto-Germanic, organized by reconstructed lemmata, with each entry including the attested reflexes in the daughter Germanic languages, as well as cognates in the other Indo-European branches.
Rights:: Not specified
