This package contains data used in the IWPT 2021 shared task. It contains training, development and test (evaluation) datasets. The data is based on a subset of Universal Dependencies release 2.7 (http://hdl.handle.net/11234/1-3424) but some treebanks contain additional enhanced annotations. Moreover, not all of these additions became part of Universal Dependencies release 2.8 (http://hdl.handle.net/11234/1-3687), which makes the shared task data unique and worth a separate release to enable later comparison with new parsing algorithms. The package also contains a number of Perl and Python scripts that have been used to process the data during preparation and during the shared task. Finally, the package includes the official primary submission of each team participating in the shared task.
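As an illustration of what the enhanced annotations look like in practice, here is a minimal sketch of reading them. It assumes the treebank files use the standard 10-column CoNLL-U format of Universal Dependencies, where the enhanced graph is stored in the ninth column (DEPS) as `head:relation` pairs separated by `|`. The function names `parse_deps` and `read_enhanced_graphs` and the sample sentence are hypothetical and are not taken from the package or its scripts.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (head id, relation label)


def parse_deps(deps: str) -> List[Edge]:
    """Split a CoNLL-U DEPS value into (head, relation) pairs."""
    if deps in ("_", ""):
        return []
    edges = []
    for item in deps.split("|"):
        # Split on the first colon only: relations such as "conj:and"
        # contain colons themselves.
        head, _, relation = item.partition(":")
        edges.append((head, relation))
    return edges


def read_enhanced_graphs(text: str) -> List[Dict[str, List[Edge]]]:
    """Return one {node id: incoming enhanced edges} dict per sentence."""
    sentences: List[Dict[str, List[Edge]]] = []
    current: Dict[str, List[Edge]] = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = {}
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}")
        node_id = cols[0]
        if "-" in node_id:
            continue  # multiword token range, carries no dependencies
        current[node_id] = parse_deps(cols[8])
    if current:
        sentences.append(current)
    return sentences


# Made-up example sentence (not from the shared task data).
_ROWS = [
    ["1", "She", "she", "PRON", "_", "_", "2", "nsubj", "2:nsubj|4:nsubj", "_"],
    ["2", "reads", "read", "VERB", "_", "_", "0", "root", "0:root", "_"],
    ["3", "and", "and", "CCONJ", "_", "_", "4", "cc", "4:cc", "_"],
    ["4", "writes", "write", "VERB", "_", "_", "2", "conj", "2:conj:and", "_"],
]
SAMPLE = "# text = She reads and writes\n" + "\n".join("\t".join(r) for r in _ROWS) + "\n\n"

if __name__ == "__main__":
    for node, edges in read_enhanced_graphs(SAMPLE)[0].items():
        print(node, edges)
```

Node ids are kept as strings because enhanced graphs may contain empty nodes with decimal ids such as `8.1`.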
"Large Scale Colloquial Persian Dataset" (LSCP) is hierarchically organized in asemantic taxonomy that focuses on multi-task informal Persian language understanding as a comprehensive problem. LSCP includes 120M sentences from 27M casual Persian tweets with its dependency relations in syntactic annotation, Part-of-speech tags, sentiment polarity and automatic translation of original Persian sentences in five different languages (EN, CS, DE, IT, HI).
Partially translated from Czech. The collective monograph is based on materials from the international conference "Ludvík Salvátor Toskánský (1847-1915). Vědec a cestovatel" (Ludwig Salvator of Tuscany (1847-1915): Scientist and Traveller), held in Brandýs nad Labem on 13-15 October 2017. It also includes chronological overviews.