Harvested from: LINDAT/CLARIAH-CZ repository - LINDAT/CLARIAH-CZ Catalog Search Results

481. Italian Function Words

Creator:: Grella, Matteo
Publisher:: Matteo Grella
Type:: text, machineReadableDictionary, and lexicalConceptualResource
Subject:: morphological dictionary and function words
Language:: Italian
Description:: This dictionary is a curated list of Italian function words, distributed as a text file in JSON Lines format. It is particularly useful for tasks such as POS tagging and syntactic parsing. It contains 999 single-word forms and 2501 multi-word forms. Each entry may have the following grammatical features: lemma, pos, mood, tense, person, number, gender, case, degree.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB
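
A minimal sketch of reading such a JSON Lines file in Python, assuming one JSON object per line. The feature names `lemma`, `pos`, `number` and `gender` come from the description above; the `form` key, the feature values and the sample records are hypothetical, since this catalog record does not show the actual schema.

```python
import json
from collections import Counter


def read_jsonl(lines):
    """Parse an iterable of JSON Lines strings, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


# Hypothetical sample records; the real file's keys and values may differ.
sample = [
    '{"form": "il", "lemma": "il", "pos": "DET", "number": "sing", "gender": "masc"}',
    '{"form": "a causa di", "lemma": "a causa di", "pos": "ADP"}',
    '',
]

entries = read_jsonl(sample)

# Count entries per part of speech (entries without "pos" are counted under None).
pos_counts = Counter(e.get("pos") for e in entries)

# Multi-word forms are taken here to be forms that contain a space (an assumption).
multiword = [e["form"] for e in entries if " " in e["form"]]
```

For a file on disk, the same function can be given an open file handle, for example `read_jsonl(open(path, encoding="utf-8"))`, where `path` is wherever the downloaded file was saved.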

482. ItalWordNet

Type:: lexicalConceptualResource
Language:: Italian
Description:: 50,000 synsets in XML format
Rights:: Not specified

483. iula_lexicon_lookup

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: toolService
Description:: Lexicon lookup: given a word form, the web service returns the corresponding information from the lexicon.
Rights:: Not specified

484. iula_preprocess

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: toolService
Description:: Text preprocessing. The service requires the input to be a plain-text file (.txt) encoded in UTF-8. It carries out: (i) segmentation of the text into smaller structural units (titles, paragraphs, sentences, etc.); (ii) detection of entities not found in dictionaries (numbers, abbreviations, URLs, emails, proper nouns, etc.); and (iii) grouping of sequences of two or more words into a single block (dates, phrases, proper nouns, etc.).
Rights:: Not specified

485. iula_tagger

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: toolService
Description:: POS tagger. The input must be a plain-text file (.txt) encoded in UTF-8. Disambiguation is done by a TreeTagger instance trained by the IULA.
Rights:: Not specified

486. iula_tokenizer

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: toolService
Description:: Text tokenizer. The input must be a plain-text file (.txt) encoded in UTF-8.
Rights:: Not specified

487. Iwaidja corpus

Type:: corpus
Description:: Documentation of the Iwaidja project (DoBeS project)
Rights:: Code of conduct

488. IWPT 2020 Shared Task Data and System Outputs

Creator:: Zeman, Daniel, Bouma, Gosse, and Seddah, Djamé
Publisher:: Universal Dependencies Consortium
Type:: text and corpus
Subject:: treebank, dependency, syntax, enhanced universal dependencies, shared task, and parsing
Language:: Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, French, Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish, Tamil, and Ukrainian
Description:: This package contains data used in the IWPT 2020 shared task. It contains training, development and test (evaluation) datasets. The data is based on a subset of Universal Dependencies release 2.5 (http://hdl.handle.net/11234/1-3105) but some treebanks contain additional enhanced annotations. Moreover, not all of these additions became part of Universal Dependencies release 2.6 (http://hdl.handle.net/11234/1-3226), which makes the shared task data unique and worth a separate release to enable later comparison with new parsing algorithms. The package also contains a number of Perl and Python scripts that have been used to process the data during preparation and during the shared task. Finally, the package includes the official primary submission of each team participating in the shared task.
Rights:: Licence Universal Dependencies v2.5, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.5, and PUB

489. IWPT 2021 Shared Task Data and System Outputs

Creator:: Zeman, Daniel, Bouma, Gosse, and Seddah, Djamé
Publisher:: Universal Dependencies Consortium
Type:: text and corpus
Subject:: treebank, dependency, syntax, enhanced universal dependencies, shared task, and parsing
Language:: Arabic, Bulgarian, Czech, Dutch, English, Estonian, Finnish, French, Italian, Latvian, Lithuanian, Polish, Russian, Slovak, Swedish, Tamil, and Ukrainian
Description:: This package contains data used in the IWPT 2021 shared task. It contains training, development and test (evaluation) datasets. The data is based on a subset of Universal Dependencies release 2.7 (http://hdl.handle.net/11234/1-3424) but some treebanks contain additional enhanced annotations. Moreover, not all of these additions became part of Universal Dependencies release 2.8 (http://hdl.handle.net/11234/1-3687), which makes the shared task data unique and worth a separate release to enable later comparison with new parsing algorithms. The package also contains a number of Perl and Python scripts that have been used to process the data during preparation and during the shared task. Finally, the package includes the official primary submission of each team participating in the shared task.
Rights:: Licence Universal Dependencies v2.7, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.7, and PUB
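
A minimal sketch of extracting the enhanced dependencies mentioned in the IWPT records above, assuming the data files follow the standard CoNLL-U layout used by Universal Dependencies (ten tab-separated columns, with the enhanced graph in the ninth column, DEPS). The function names and the sample sentence are hypothetical and are not taken from the packages themselves.

```python
def parse_deps(deps_field):
    """Split a DEPS value such as '2:nsubj|4:nsubj' into (head, relation) pairs."""
    if deps_field == "_":
        return []
    pairs = []
    for item in deps_field.split("|"):
        # Split on the first colon only: relations may contain colons (e.g. conj:and).
        head, _, rel = item.partition(":")
        pairs.append((head, rel))
    return pairs


def enhanced_edges(conllu_text):
    """Return (token_id, form, head, relation) for every enhanced edge in the text."""
    edges = []
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip sentence separators and comment lines
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        token_id, form, deps = cols[0], cols[1], cols[8]
        for head, rel in parse_deps(deps):
            edges.append((token_id, form, head, rel))
    return edges


# Hypothetical sample sentence, written column by column for readability.
rows = [
    ["1", "She", "she", "PRON", "_", "_", "2", "nsubj", "2:nsubj|4:nsubj", "_"],
    ["2", "reads", "read", "VERB", "_", "_", "0", "root", "0:root", "_"],
    ["3", "and", "and", "CCONJ", "_", "_", "4", "cc", "4:cc", "_"],
    ["4", "writes", "write", "VERB", "_", "_", "2", "conj", "2:conj:and", "_"],
]
sample = "# text = She reads and writes\n" + "\n".join("\t".join(r) for r in rows) + "\n"

edges = enhanced_edges(sample)
```

Token IDs and heads are kept as strings because enhanced graphs may refer to empty nodes with decimal IDs such as `5.1`.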

490. Jaguar

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: toolService
Description:: A tool for statistical corpus exploitation. It offers concordances, counts ngrams, extracts collocations and gives association, distribution and similarity measures.
Rights:: Not specified
