dc.contributor.author | Poláková, Lucie |
dc.contributor.author | Zikánová, Šárka |
dc.contributor.author | Mírovský, Jiří |
dc.contributor.author | Hajičová, Eva |
dc.date.accessioned | 2023-07-03T09:26:36Z |
dc.date.available | 2023-07-03T09:26:36Z |
dc.date.issued | 2023-06-30 |
dc.identifier.uri | http://hdl.handle.net/11234/1-5174 |
dc.description | The Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0) is a dataset of 54 Czech journalistic texts manually annotated using Rhetorical Structure Theory (RST). Each text document in the treebank is represented as a single tree-like structure whose nodes (discourse units) are interconnected through hierarchical rhetorical relations. The dataset also contains concurrent annotations of five double-annotated documents. The original texts are a part of the data annotated in the Prague Dependency Treebank, although the two projects are independent. |
dc.language.iso | ces |
dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ |
dc.source.uri | https://ufal.mff.cuni.cz/czrst-dt1.0 |
dc.subject | discourse |
dc.subject | discourse annotation |
dc.subject | annotated corpus |
dc.title | Czech RST Discourse Treebank 1.0 |
dc.type | corpus |
metashare.ResourceInfo#ContentInfo.mediaType | text |
dc.rights.label | PUB |
has.files | yes |
branding | LINDAT / CLARIAH-CZ |
contact.person | Lucie Poláková; polakova@ufal.mff.cuni.cz; Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
sponsor | The Grant Agency of the Czech Republic; 20-09853S; Global Coherence of Czech Texts in the Corpus-Based Perspective; nationalFunds |
size.info | 54 articles |
size.info | 901 sentences |
size.info | 14514 tokens |
files.size | 212249 |
files.count | 2 |
Files in this item
Download all files in this item (207.27 KB)
Licence category:
Licence: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Publicly Available
- Name: CzRST-DT_1.0.zip
- Size: 203.38 KB
- Format: application/zip
- Description: CzRST-DT 1.0 distribution
- MD5: 93b2a2beab1ff13f7dd652fa5de74bfb
- CzRST-DT_1.0
- README.TXT (3 kB)
- data
- IAA
- RS3
- ln94207_39.rs3 (3 kB)
- mf920925_021.rs3 (4 kB)
- lnd94103_003.rs3 (2 kB)
- lnd94103_063.rs3 (11 kB)
- cmpr9413_017.rs3 (7 kB)
- ln95048_056.rs3 (8 kB)
- ln95049_086.rs3 (5 kB)
- ln94202_49.rs3 (3 kB)
- ln94200_8.rs3 (2 kB)
- mf930713_099.rs3 (5 kB)
- ln94207_83.rs3 (11 kB)
- mf930713_055.rs3 (6 kB)
- ln95047_134.rs3 (5 kB)
- ln94200_112.rs3 (5 kB)
- ln95048_140.rs3 (4 kB)
- ln94203_145.rs3 (7 kB)
- ln94202_135.rs3 (4 kB)
- mf920922_138.rs3 (3 kB)
- cmpr9415_032.rs3 (4 kB)
- cmpr9410_047.rs3 (12 kB)
- ln94200_84.rs3 (3 kB)
- ln95048_055.rs3 (4 kB)
- ln94207_54.rs3 (11 kB)
- cmpr9413_004.rs3 (4 kB)
- lnd94103_129.rs3 (3 kB)
- ln94200_167.rs3 (3 kB)
- mf920925_087.rs3 (4 kB)
- mf930713_110.rs3 (11 kB)
- lnd94103_013.rs3 (4 kB)
- lnd94103_145.rs3 (6 kB)
- lnd94103_033.rs3 (3 kB)
- ln95048_058.rs3 (5 kB)
- mf930709_087.rs3 (8 kB)
- mf920922_105.rs3 (9 kB)
- mf920925_018.rs3 (5 kB)
- ln94203_100.rs3 (5 kB)
- lnd94103_053.rs3 (3 kB)
- ln95049_100.rs3 (8 kB)
- ln94210_147.rs3 (7 kB)
- mf920925_114.rs3 (4 kB)
- mf930709_083.rs3 (3 kB)
- ln95048_122.rs3 (3 kB)
- ln95049_019.rs3 (3 kB)
- mf920922_133.rs3 (3 kB)
- mf930713_013.rs3 (4 kB)
- ln94209_45.rs3 (9 kB)
- mf930709_058.rs3 (4 kB)
- ln94207_16.rs3 (8 kB)
- ln94203_43.rs3 (4 kB)
- ln94200_170.rs3 (11 kB)
- cmpr9413_026.rs3 (3 kB)
- ln94208_143.rs3 (3 kB)
- cmpr9413_034.rs3 (8 kB)
- ln94206_47.rs3 (7 kB)
- TXT
- lnd94103_013.txt (1 kB)
- lnd94103_145.txt (1 kB)
- lnd94103_033.txt (733 B)
- mf930709_087.txt (2 kB)
- ln95048_058.txt (1 kB)
- mf920922_105.txt (2 kB)
- mf920925_018.txt (1 kB)
- ln94203_100.txt (1 kB)
- lnd94103_053.txt (1008 B)
- ln95049_100.txt (1 kB)
- ln94210_147.txt (2 kB)
- mf930709_083.txt (944 B)
- mf920925_114.txt (997 B)
- ln95048_122.txt (622 B)
- ln95049_019.txt (821 B)
- mf930713_013.txt (1 kB)
- mf920922_133.txt (595 B)
- ln94209_45.txt (3 kB)
- mf930709_058.txt (883 B)
- ln94207_16.txt (2 kB)
- ln94203_43.txt (1 kB)
- ln94200_170.txt (3 kB)
- cmpr9413_026.txt (709 B)
- ln94208_143.txt (1 kB)
- cmpr9413_034.txt (2 kB)
- ln94206_47.txt (1 kB)
- ln94207_39.txt (940 B)
- lnd94103_003.txt (684 B)
- mf920925_021.txt (1 kB)
- lnd94103_063.txt (4 kB)
- cmpr9413_017.txt (2 kB)
- ln95049_086.txt (1 kB)
- ln94202_49.txt (909 B)
- ln95048_056.txt (2 kB)
- ln94200_8.txt (735 B)
- mf930713_099.txt (967 B)
- ln94207_83.txt (3 kB)
- mf930713_055.txt (2 kB)
- ln95047_134.txt (1 kB)
- ln94200_112.txt (1 kB)
- ln95048_140.txt (1 kB)
- ln94203_145.txt (2 kB)
- ln94202_135.txt (1 kB)
- mf920922_138.txt (801 B)
- cmpr9415_032.txt (1 kB)
- cmpr9410_047.txt (4 kB)
- ln94200_84.txt (1 kB)
- ln94207_54.txt (3 kB)
- ln95048_055.txt (960 B)
- cmpr9413_004.txt (1 kB)
- lnd94103_129.txt (568 B)
- mf920925_087.txt (1 kB)
- ln94200_167.txt (1 kB)
- mf930713_110.txt (3 kB)
- Name: README.TXT
- Size: 3.9 KB
- Format: Text file
- Description: CzRST-DT 1.0 README
- MD5: 04ec03e6206fb2ba96141b2c1967eabc
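The MD5 values listed for the two files can be used to check that a download arrived intact. The following is a minimal sketch, not part of the distribution: the function names `md5_of_file` and `verify_download` are hypothetical, and it assumes the files are saved locally under the names shown in this record.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from this record's file listing.
EXPECTED_MD5 = {
    "CzRST-DT_1.0.zip": "93b2a2beab1ff13f7dd652fa5de74bfb",
    "README.TXT": "04ec03e6206fb2ba96141b2c1967eabc",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Compare a local file's MD5 with the value listed in the record.

    Hypothetical helper: the file is looked up by its base name, so it
    must keep the name used in the record.
    """
    path = Path(path)
    expected = EXPECTED_MD5.get(path.name)
    if expected is None:
        raise KeyError(f"No checksum listed for {path.name}")
    return md5_of_file(path) == expected
```

For example, `verify_download("CzRST-DT_1.0.zip")` should return `True` for an intact copy of the archive and `False` for a corrupted or altered one.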
===============================================
Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0)
===============================================

Authors
=======
Lucie Poláková (Charles University, Faculty of Mathematics and Physics),
Šárka Zikánová (Charles University, Faculty of Mathematics and Physics),
Jiří Mírovský (Charles University, Faculty of Mathematics and Physics),
Eva Hajičová (Charles University, Faculty of Mathematics and Physics)

Introduction
============
The Czech RST Discourse Treebank 1.0 (CzRST-DT 1.0, Poláková et al., 2023) is a dataset of 54 Czech journalistic texts manually annotated using Rhetorical Structure Theory (RST; Mann and Thompson, 1988). Each text document in the treebank is represented as a single tree-like structure whose nodes (discourse units) are interconnected through hierarchical rhetorical relations. The dataset also contains concurrent annotations of five double-annotated documents. The original texts are a part of the data annotated . . .