Search Results
2. A / Věda a výzkum
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech
- Description:
- 1
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
3. Annotation of Dramatic Situations in Theater Play Scripts (2023)
- Creator:
- Mareček, David, Nováková, Marie, Vosecká, Klára, Doležal, Josef, and Rosa, Rudolf
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) and The Academy of Performing Arts in Prague, Theatre Faculty (DAMU)
- Type:
- text and corpus
- Subject:
- theatre, play script, and dramatic situation
- Language:
- Czech
- Description:
- We defined 58 dramatic situations and annotated them in 19 play scripts. We then selected only 5 well-recognized dramatic situations and annotated a further 33 play scripts. In the previous (first) version, we released 9 play scripts that could be freely distributed. In this (second) version of the data, we are adding another 10 plays for which we have obtained licenses from the authors. In total, 19 play scripts are available, and one of them is annotated three times, independently by three annotators.
- Rights:
- THEAITRE AI research only license, https://lindat.mff.cuni.cz/repository/xmlui/page/theaitre-license, and ACA
4. Československá psychologie: časopis pro psychologickou teorii a praxi
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, Slovak, and English
- Description:
- 1
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
5. Dostojevského Deník spisovatele v kontextech a konfrontacích
- Creator:
- Odehnalová, Lenka
- Type:
- text and monograph
- Subject:
- Russian literature (works about it), Dostojevskij, Fedor Michajlovič, Russian writers, Russian literature, diaries, Russia, world history 1789-1918, and literature, writers
- Language:
- Czech
- Rights:
- unknown
6. E-psychologie: elektronický časopis ČMPS
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech
- Description:
- 1
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
7. English gustatory adjectives and lexical synaesthesia - data analysis
- Creator:
- Jurčević, Jana
- Publisher:
- Faculty of Humanities and Social Sciences, University of Rijeka
- Type:
- text, wordList, and lexicalConceptualResource
- Subject:
- lexical synaesthesia, metaphorical collocations, metonymy, cross-modal mapping, and embodiment
- Language:
- English
- Description:
- Data were collected using the Sketch Engine program and extracted from the annotated English web corpus enTenTen20. Data collection and analysis were carried out over two months, April and May 2023. The enTenTen20 corpus has since been updated to a newer version, enTenTen21. The older version is nevertheless still available, can be worked on, and can be compared with the newer one. The differences between the two versions of the English web corpus did not affect the results of this study. The only apparent difference was slightly different frequency values for specific collocations. This was expected, since the older version of the web corpus consists of 36 billion words, while the new version contains 52 billion words. As noted above, these frequency deviations were not large enough to refute the hypotheses; rather, they confirmed them once again. This study is one of the results of work on a larger research project called "Metaphorical collocations - syntagmatic relations between semantics and pragmatics". More information about the project is available online (https://metakol.uniri.hr/en/opis-projekta/). The study was financed by the Croatian Science Foundation.
  Working with the data/replicating the study: The data collected for this study are available in CSV format. Data for each gustatory adjective (collocate) are presented in a separate CSV file. Upon opening each file, widen the columns for better visibility of the data. The tables show the different collocational bases (nouns) found in the corpus in combination with a specific gustatory adjective, their collocate. These nouns are listed by their score (the Mutual Information score expresses the extent to which words co-occur compared to the number of times they appear separately). The tables also show what type of mapping is present in a given collocation (e.g., intra-modal or cross-modal) and what type of meaning or cognitive process underlies the meaning formation (e.g., metonymic or metaphoric). For every analyzed collocation, we provide a contextualized example of its use from the corpus, along with the hyperlink where it can be found.
- Rights:
- Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
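The Mutual Information score mentioned in the description of record 7 can be illustrated with a short sketch. It assumes the commonly used formulation log2(f_xy * N / (f_x * f_y)); the record itself does not state the exact formula used, and the function name and all counts below are hypothetical, not taken from the dataset.

```python
import math


def mi_score(pair_freq, base_freq, collocate_freq, corpus_size):
    """Mutual Information score for a (base, collocate) pair.

    Assumed formulation: log2(f_xy * N / (f_x * f_y)), where f_xy is the
    co-occurrence count, f_x and f_y are the separate counts of the two
    words, and N is the corpus size in tokens.
    """
    if min(pair_freq, base_freq, collocate_freq, corpus_size) <= 0:
        raise ValueError("all counts must be positive")
    return math.log2(pair_freq * corpus_size / (base_freq * collocate_freq))


# Hypothetical counts for an adjective + noun collocation.
score = mi_score(pair_freq=50, base_freq=1000, collocate_freq=2000,
                 corpus_size=1_000_000)

# A pair that co-occurs exactly as often as chance predicts scores 0.
chance_score = mi_score(pair_freq=2, base_freq=1000, collocate_freq=2000,
                        corpus_size=1_000_000)
```

Under this formulation, a higher score means the two words appear together more often than their separate frequencies alone would predict.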
8. Filosofický časopis
- Type:
- model:periodicalitem and TEXT
- Language:
- Czech, English, Slovak, and French
- Description:
- 1
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
9. Journal of hydrology and hydromechanics
- Type:
- model:periodicalitem and TEXT
- Language:
- Slovak, English, and Czech
- Description:
- 1
- Rights:
- http://creativecommons.org/licenses/by-nc-sa/4.0/ and policy:public
10. MLASK: Multimodal Summarization of Video-based News Articles
- Creator:
- Krubiński, Mateusz and Pecina, Pavel
- Publisher:
- Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
- Type:
- video and corpus
- Subject:
- Multimodal Summarization, Summarization, Video, and Image
- Language:
- Czech
- Description:
- The MLASK corpus consists of 41,243 multimodal documents, video-based news articles in the Czech language, collected from Novinky.cz (https://www.novinky.cz/) and Seznam Zprávy (https://www.seznamzpravy.cz/). It was introduced in "MLASK: Multimodal Summarization of Video-based News Articles" (Krubiński & Pecina, EACL 2023). The articles' publication dates range from September 2016 to February 2022. The intended use case of the dataset is to model the task of multimodal summarization with multimodal output: based on a pair of a textual article and a short video, a textual summary is generated, and a single frame from the video is chosen as a pictorial summary.
  Each document consists of the following:
  - a .mp4 video
  - a single image (cover picture)
  - the article's text
  - the article's summary
  - the article's title
  - the article's publication date
  All videos are re-sampled to 25 fps and resized to the same resolution of 1280x720. The longest video is 5 minutes and the shortest is 7 seconds. The average video duration is 86 seconds. The quantitative statistics of the lengths of titles, abstracts, and full texts (measured in the number of tokens) are below. Q1 and Q3 denote the first and third quartiles, respectively.
             Mean              Q1    Median   Q3
  Title      11.16 ± 2.78      9     11       13
  Abstract   33.40 ± 13.86     22    32       43
  Article    276.96 ± 191.74   154   231      343
  The proposed training/dev/test split follows the chronological ordering based on publication date. We use the articles published in the first half (Jan-Jun) of 2021 for validation (2,482 instances) and the ones published in the second half (Jul-Dec) of 2021 and the beginning (Jan-Feb) of 2022 for testing (2,652 instances). The remaining data is used for training (36,109 instances). The textual data is shared as a single .tsv file. The visual data (video+image) is shared as a single archive for the validation and test splits, and the data from the training split is partitioned based on the publication date.
- Rights:
- Seznam Dataset Licence, https://lindat.mff.cuni.cz/repository/xmlui/page/szn-dataset-licence, and RES
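The chronological split described in record 10 can be sketched as a simple date rule. This is only an illustration: the function name and the split labels are hypothetical, and the actual column names and date format in the released .tsv file are not given in the record, so the sketch works on plain `date` values.

```python
from datetime import date

# Boundaries taken from the description: Jan-Jun 2021 is validation,
# Jul 2021 through Feb 2022 is test, everything earlier is training.
DEV_START = date(2021, 1, 1)
TEST_START = date(2021, 7, 1)


def assign_split(published: date) -> str:
    """Return 'train', 'dev', or 'test' for a publication date.

    Assumption: any date on or after TEST_START counts as test, since the
    corpus is described as ending in February 2022.
    """
    if published < DEV_START:
        return "train"
    if published < TEST_START:
        return "dev"
    return "test"


# Hypothetical publication dates, one per split.
examples = [date(2016, 9, 1), date(2021, 3, 15), date(2022, 2, 10)]
splits = [assign_split(d) for d in examples]
```

Splitting by date rather than at random keeps the validation and test articles strictly later than the training articles, which matches the ordering the description proposes.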