Rights: PUB / Type: tool - LINDAT/CLARIAH-CZ Catalog Search Results

31. MCSQ Translation Models (en-de) (v1.0)

Creator:: Variš, Dušan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: machine translation, transformer, and neural machine translation
Language:: English and German
Description:: En-De translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). The models were trained using the MCSQ social surveys dataset (available at https://repo.clarino.uib.no/xmlui/bitstream/handle/11509/142/mcsq_v3.zip). Their main use should be in-domain translation of social surveys. Models are compatible with Tensor2tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on MCSQ test set (BLEU): en->de: 67.5 (train: genuine in-domain MCSQ data only) de->en: 75.0 (train: additional in-domain backtranslated MCSQ data) (Evaluated using multeval: https://github.com/jhclark/multeval)
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

32. MCSQ Translation Models (en-ru) (v1.0)

Creator:: Variš, Dušan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: machine translation, neural machine translation, and transformer
Language:: English and Russian
Description:: En-Ru translation models, exported via TensorFlow Serving, available in the Lindat translation service (https://lindat.mff.cuni.cz/services/translation/). The models were trained using the MCSQ social surveys dataset (available at https://repo.clarino.uib.no/xmlui/bitstream/handle/11509/142/mcsq_v3.zip). Their main use should be in-domain translation of social surveys. Models are compatible with Tensor2tensor version 1.6.6. For details about the model training (data, model hyper-parameters), please contact the archive maintainer. Evaluation on MCSQ test set (BLEU): en->ru: 64.3 (train: genuine in-domain MCSQ data) ru->en: 74.7 (train: additional backtranslated in-domain MCSQ data) (Evaluated using multeval: https://github.com/jhclark/multeval)
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

33. Morphological Analyzer for Shipibo-Konibo

Creator:: Cardenas Acosta, Ronald and Zeman, Daniel
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: morphological analyzer and finite state transducer
Language:: Shipibo-Conibo
Description:: This tool is the first morphological analyzer ever for this language. The analyzer is a FST that produces all possible segmentations and tagging sequences in a word-by-word fashion.
Rights:: GNU General Public Licence, version 3, http://opensource.org/licenses/GPL-3.0, and PUB

34. MSTperl parser

Creator:: Rosa, Rudolf
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService and tool
Subject:: parser, NLP, Treex, parsing, and dependency
Language:: Czech and English
Description:: MSTperl is a Perl reimplementation of the MST parser of Ryan McDonald (http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html). MST parser (Maximum Spanning Tree parser) is a state-of-the-art natural language dependency parser -- a tool that takes a sentence and returns its dependency tree. In MSTperl, only some functionality was implemented; the limitations include the following: the parser is a non-projective one, curently with no possibility of enforcing the requirement of projectivity of the parse trees; only first-order features are supported, i.e. no second-order or third-order features are possible; the implementation of MIRA is that of a single-best MIRA, with a closed-form update instead of using quadratic programming. On the other hand, the parser supports several advanced features: parallel features, i.e. enriching the parser input with word-aligned sentence in other language; adding large-scale information, i.e. the feature set enriched with features corresponding to pointwise mutual information of word pairs in a large corpus (CzEng). The MSTperl parser is tuned for parsing Czech. Trained models are available for Czech, English and German. We can train the parser for other languages on demand, or you can train it yourself -- the guidelines are part of the documentation. The parser, together with detailed documentation, is avalable on CPAN (http://search.cpan.org/~rur/Treex-Parser-MSTperl/). and The research has been supported by the EU Seventh Framework Programme under grant agreement 247762 (Faust), and by the grants GAUK116310 and GA201/09/H057.
Rights:: Artistic License 2.0, http://opensource.org/licenses/Artistic-2.0, and PUB

35. MSTperl parser (2015-05-19)

Creator:: Rosa, Rudolf
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService and tool
Subject:: parser, NLP, Treex, parsing, and dependency
Language:: Czech and English
Description:: MSTperl is a Perl reimplementation of the MST parser of Ryan McDonald (http://www.seas.upenn.edu/~strctlrn/MSTParser/MSTParser.html). MST parser (Maximum Spanning Tree parser) is a state-of-the-art natural language dependency parser -- a tool that takes a sentence and returns its dependency tree. In MSTperl, only some functionality was implemented; the limitations include the following: the parser is a non-projective one, curently with no possibility of enforcing the requirement of projectivity of the parse trees; only first-order features are supported, i.e. no second-order or third-order features are possible; the implementation of MIRA is that of a single-best MIRA, with a closed-form update instead of using quadratic programming. On the other hand, the parser supports several advanced features: parallel features, i.e. enriching the parser input with word-aligned sentence in other language; adding large-scale information, i.e. the feature set enriched with features corresponding to pointwise mutual information of word pairs in a large corpus (CzEng); weighted/unweighted parser model interpolation; combination of several instances of the MSTperl parser (through MST algorithm); combination of several existing parses from any parsers (through MST algorithm). The MSTperl parser is tuned for parsing Czech. Trained models are available for Czech, English and German. We can train the parser for other languages on demand, or you can train it yourself -- the guidelines are part of the documentation. The parser, together with detailed documentation, is avalable on CPAN (http://search.cpan.org/~rur/Treex-Parser-MSTperl/). and The research has been supported by the EU Seventh Framework Programme under grant agreement 247762 (Faust), and by the grants GAUK116310 and GA201/09/H057.
Rights:: Artistic License 2.0, http://opensource.org/licenses/Artistic-2.0, and PUB

36. MUSCIMarker

Creator:: Hajič, Jan Jr
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: image annotation, Python, and music notation
Description:: MUSCIMarker is an open-source tool for annotating visual objects and their relationships in binary images. It is implemented in Python, known to run on Windows, Linux and OS X, and supports working offline. MUSCIMarker is being used for creating a dataset of musical notation symbols, but can support any object set. The user documentation online is currently (12.2016) incomplete, as it is continually changing to reflect annotators' comments and incorporate new features. This version of the software is *not* the final one, and it is under continuous development (we're currently working on adding grayscale image support with auto-binarization, and Android support for touch-based annotation). However, the current version (1.1) has already been used to annotate more than 100 pages of sheet music, over all the major desktop OSes, and I believe it is already in a state where it can be useful beyond my immediate music notation data gathering use case.
Rights:: Apache License 2.0, http://opensource.org/licenses/Apache-2.0, and PUB

37. NameTag 2

Creator:: Straková, Jana
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: tool and toolService
Subject:: named entity recognition and named entity recognizer
Description:: NameTag 2 is a named entity recognition tool. It recognizes named entities (e.g., names, locations, etc.) and can recognize both flat and embedded (nested) entities. NameTag 2 can be used either as a commandline tool or by requesting the NameTag webservice. NameTag webservice can be found at: https://lindat.mff.cuni.cz/services/nametag/ NameTag commandline tool can be downloaded from NameTag GitHub repository, branch nametag2: git clone https://github.com/ufal/nametag -b nametag2 Latest models and documentation can be found at: https://ufal.mff.cuni.cz/nametag/2 This software subject to the terms of the Mozilla Public License, v. 2.0 (http://mozilla.org/MPL/2.0/). The associated models are distributed under CC BY-NC-SA license. Please cite as: Jana Straková, Milan Straka, Jan Hajič (2019): Neural Architectures for Nested NER through Linearization. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5326-5331, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-48-2 (https://aclweb.org/anthology/papers/P/P19/P19-1527/)
Rights:: Mozilla Public License 2.0, http://opensource.org/licenses/MPL-2.0, and PUB

38. Open morphology of Finnish

Creator:: Pirinen, Tommi A, Listenmaa, Inari, Johnson, Ryan, Tyers, Francis M., and Kuokkala, Juha
Publisher:: University of Helsinki
Type:: tool and toolService
Subject:: morphological analysis and morphological dictionary
Language:: Finnish
Description:: Omorfi is free and open source project containing various tools and data for handling Finnish texts in a linguistically motivated manner. The main components of this repository are: 1) a lexical database containing hundreds of thousands of words (c.f. lexical statistics), 2) a collection of scripts to convert lexical database into formats used by upstream NLP tools (c.f. lexical processing), 3) an autotools setup to build and install (or package, or deploy): the scripts, the database, and simple APIs / convenience processing tools, and 4) a collection of relatively simple APIs for a selection of languages and scripts to apply the NLP tools and access the database
Rights:: GNU General Public Licence, version 3, http://opensource.org/licenses/GPL-3.0, and PUB

39. Parsito

Creator:: Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService and tool
Subject:: parser and dependency parser
Language:: English
Description:: Parsito is a fast open-source dependency parser written in C++. Parsito is based on greedy transition-based parsing, it has very high accuracy and achieves a throughput of 30K words per second. Parsito can be trained on any input data without feature engineering, because it utilizes artificial neural network classifier. Trained models for all treebanks from Universal Dependencies project are available (37 treebanks as of Dec 2015). Parsito is a free software under Mozilla Public License 2.0 (http://www.mozilla.org/MPL/2.0/) and the linguistic models are free for non-commercial use and distributed under CC BY-NC-SA (http://creativecommons.org/licenses/by-nc-sa/4.0/) license, although for some models the original data used to create the model may impose additional licensing conditions. Parsito website http://ufal.mff.cuni.cz/parsito contains download links of both the released packages and trained models, hosts documentation and offers online demo. Parsito development repository http://github.com/ufal/parsito is hosted on GitHub.
Rights:: Mozilla Public License 2.0, http://opensource.org/licenses/MPL-2.0, and PUB

40. PML Tree Query

Creator:: Pajas, Petr, Štěpánek, Jan, and Sedlák, Michal
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService and tool
Subject:: treebank, query, and search
Language:: English
Description:: System for querying annotated treebanks in PML format. The querying uses it own query language with graphical representation. It has two different implementations (SQL and Perl) and several clients (TrEd, browser-based, command line interface).
Rights:: GNU General Public License, version 2, http://www.gnu.org/licenses/gpl-2.0.html, and PUB

31. MCSQ Translation Models (en-de) (v1.0)

32. MCSQ Translation Models (en-ru) (v1.0)

33. Morphological Analyzer for Shipibo-Konibo

34. MSTperl parser

35. MSTperl parser (2015-05-19)

36. MUSCIMarker

37. NameTag 2

38. Open morphology of Finnish

39. Parsito

40. PML Tree Query

Limit your search

Show values starting with

Show values starting with

Show values starting with

Show values starting with

Show values starting with

Search

Search Constraints

Search Results

Limit your search

Contributor

Show values starting with

Creator

Show values starting with

Language

Show values starting with

Publisher

Rights

Show values starting with

Subject

Show values starting with

Type

Original context has metadata only

Harvested from