Search Results
592. Lifestory Archive (lifestories data base)
- Publisher:
- Institute of Philosophy and Sociology of the University of Latvia
- Type:
- corpus
- Language:
- Latvian
- Description:
- The lifestory is a source for qualitative research. The most basic component of the collection is the written or recorded document of personal history, a short or lengthy story of a person's life and observations. National Oral History Project (Nacionālās mutvārdu vēstures projekts (NMV)) has been financed by the Science Council of Latvia (Latvijas Zinātnes Padome) since 1992. Its approach is multidisciplinary, employing sociological and philosophical theories in particular.
- Rights:
- Not specified
593. LiFR-Law. Corpus of Paraphrased Czech Administrative Texts with Reading Comprehension for Readability Studies
- Creator:
- Cinková, Silvie, Chromý, Jan, Šamánková, Jana, Hořeňovská, Karolína, Kettnerová, Václava, Kolářová, Veronika, Kubištová, Hana, and Panevová, Jarmila
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- readability, legal texts, legal domain, reading comprehension, corpus, and survey
- Language:
- Czech
- Description:
- LiFR-Law is a corpus of Czech legal and administrative texts with measured reading comprehension and a subjective expert annotation of diverse textual properties based on the Hamburg Comprehensibility Concept (Langer, Schulz von Thun, Tausch, 1974). It has been built as a pilot data set to explore the Linguistic Factors of Readability (hence the LiFR acronym) in Czech administrative and legal texts, modeling their correlation with actually observed reading comprehension. The corpus comprises 18 documents in total: six different texts from the legal/administrative domain, each in three versions (the original and two paraphrases). Each such document triple shares one reading-comprehension test administered to at least thirty readers of random gender, educational background, and age. The data set also captures basic demographic information about each reader, their familiarity with the topic, and their subjective assessment of the stylistic properties of the given document, roughly corresponding to the key text properties identified by the Hamburg Comprehensibility Concept.
- Rights:
- Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
594. LiFR-Law. Corpus of Paraphrased Czech Administrative Texts with Reading Comprehension for Readability Studies (2023-10-08)
- Creator:
- Cinková, Silvie, Chromý, Jan, Šamánková, Jana, Hořeňovská, Karolína, Kettnerová, Václava, Kolářová, Veronika, Kubištová, Hana, and Panevová, Jarmila
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- readability, legal texts, legal domain, reading comprehension, corpus, and survey
- Language:
- Czech
- Description:
- LiFR-Law is a corpus of Czech legal and administrative texts with measured reading comprehension and a subjective expert annotation of diverse textual properties based on the Hamburg Comprehensibility Concept (Langer, Schulz von Thun, Tausch, 1974). It has been built as a pilot data set to explore the Linguistic Factors of Readability (hence the LiFR acronym) in Czech administrative and legal texts, modeling their correlation with actually observed reading comprehension. The corpus comprises 18 documents in total: six different texts from the legal/administrative domain, each in three versions (the original and two paraphrases). Each such document triple shares one reading-comprehension test administered to at least thirty readers of random gender, educational background, and age. The data set also captures basic demographic information about each reader, their familiarity with the topic, and their subjective assessment of the stylistic properties of the given document, roughly corresponding to the key text properties identified by the Hamburg Comprehensibility Concept.
- Changes to the previous version and helpful comments:
  - File names of the comprehension test results (self-explanatory)
  - Corrected one erroneous automatic evaluation rule in the multiple-choice evaluation (zahradnici_3, TRUE and FALSE had been swapped)
  - Evaluation protocols for both question types added to the folder lifr_formr_study_design
  - Data has been cleaned: empty responses to multiple-choice questions were re-inserted. All surveys in which the reader's subjective text evaluation is complete are now considered complete (these evaluations were placed at the very end of each survey).
  - Only complete surveys (all 7 content questions answered) are represented. We dropped the replies of six users who did not complete their surveys.
  - A few missing responses to open questions have been detected and re-inserted.
  - The demographic data contain all respondents who filled in the informed consent and the demographic details, with respondents who did not complete any test survey (but provided their demographic details) in a separate file. All other data have been cleaned to contain only responses by the regular respondents (at least one completed survey).
- Rights:
- Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
595. LiFR-Lite
- Creator:
- Cinková, Silvie, Chromý, Jan, Hořeňovská, Karolína, Kettnerová, Václava, Kolářová, Veronika, Panevová, Jarmila, and Ševčíková, Magda
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- education, readability, reading comprehension, and text corpora
- Language:
- Czech
- Description:
- Corpus of Czech educational texts for readability studies, with paraphrases, measured reading comprehension, and a multi-annotator subjective rating of selected text features based on the Hamburg Comprehensibility Concept
- Rights:
- Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
596. LiFR-Lite (2021-11-05)
- Creator:
- Cinková, Silvie, Chromý, Jan, Hořeňovská, Karolína, Kettnerová, Václava, Kolářová, Veronika, Kubištová, Hana, Panevová, Jarmila, and Ševčíková, Magda
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- text and corpus
- Subject:
- education, readability, reading comprehension, and text corpora
- Language:
- Czech
- Description:
- Corpus of Czech educational texts for readability studies, with paraphrases, measured reading comprehension, and a multi-annotator subjective rating of selected text features based on the Hamburg Comprehensibility Concept
- Rights:
- Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB
597. LIMAS Corpus
- Publisher:
- Korpora.org and Fakultät Geisteswissenschaften, Universität Duisburg-Essen
- Type:
- corpus
- Subject:
- German studies (Germanistik)
- Language:
- German
- Description:
- 1970s "representative" corpus of German created by the research group "Linguistik und Maschinelle Sprachbearbeitung" (linguistics and machine language processing); a time-slice corpus of written German from 1970; a cross-section of various text types
- Rights:
- Not specified
598. LINDAT Translation service
- Creator:
- Košarko, Ondřej, Variš, Dušan, and Popel, Martin
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- service and toolService
- Subject:
- machine translation and frontend
- Description:
- Source code of the LINDAT Translation service frontend. The service provides a UI and a simple REST API that accesses machine translation models served by TensorFlow Serving. The most recent version of the code is available at https://github.com/ufal/lindat_translation.
- Rights:
- BSD 2-Clause "Simplified" or "FreeBSD" license, http://opensource.org/licenses/BSD-2-Clause, and PUB
599. Lingua::Interset 2.026
- Creator:
- Zeman, Daniel
- Publisher:
- Charles University, Faculty of Mathematics and Physics
- Type:
- tool and toolService
- Subject:
- morphology, part of speech, conversion, and tagset
- Language:
- Arabic, Bulgarian, Bengali, Catalan, Czech, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Japanese, Multiple languages, and Portuguese
- Description:
- Lingua::Interset is a universal morphosyntactic feature set to which all tagsets of all corpora/languages can be mapped. Version 2.026 covers 37 different tagsets of 21 languages. Limited support for the older drivers for other languages (which are not included in this package but are available for download elsewhere) is also available; these will be fully ported to Interset 2 in the future. Interset is implemented as Perl libraries. It is also available via CPAN.
- Rights:
- Artistic License (Perl) 1.0, http://opensource.org/licenses/Artistic-Perl-1.0, and PUB
600. LiStr: Linguistic Structure Induction Toolkit
- Creator:
- Mareček, David and Straka, Milan
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- tool and toolService
- Subject:
- parsing, unsupervised machine learning, machine translation, and grammar induction
- Language:
- English
- Description:
- This toolkit comprises the tools and supporting scripts for unsupervised induction of dependency trees from raw texts or texts with already assigned part-of-speech tags. There are also scripts for simple machine translation based on unsupervised parsing and scripts for minimally supervised parsing into Universal-Dependencies style.
- Rights:
- GNU General Public Licence, version 3, http://opensource.org/licenses/GPL-3.0, and PUB