Language: Swedish - LINDAT/CLARIAH-CZ Catalog Search Results

41. SAL (Svenskt associationslexikon)

Type:: lexicalConceptualResource
Language:: Swedish
Description:: appr. 72,000 entries (including many proper names and MWUs), various, including RDF/XML
Rights:: Not specified

42. SALDO

Publisher:: Språkbanken, Dept. of Swedish Language, Göteborg University
Type:: toolService
Language:: Swedish
Description:: SALDO (Swedish Associative Thesaurus version 2) is an extensive lexicon resource for modern written Swedish, created for language technology research and for the development of language technology applications. SALDO may be viewed as a basic lexical resource for a Swedish BLARK. SALDO builds on the Swedish Associative Thesaurus, a semantic lexicon for Swedish.
Rights:: Not specified

43. SLäNDa

Creator:: Stymne, Sara and Östman, Carin
Publisher:: Uppsala University
Type:: text and corpus
Subject:: literature, literary fiction, dialogue, narrative, and cited materials
Language:: Swedish
Description:: SLäNDa, the Swedish literature corpus of narrative and dialogue, is a corpus made up of eight Swedish literary novels from the late 19th and early 20th centuries, manually annotated mainly for different aspects of dialogue. The full annotation also contains other cited materials, such as thoughts, signs and letters. The main motivation for including these categories as well is to be able to identify the main narrative, which is all remaining unannotated text.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

44. SLäNDa 2.0

Creator:: Stymne, Sara and Östman, Carin
Publisher:: Uppsala University
Type:: text and corpus
Subject:: literature, literary fiction, dialogue, narrative, and cited materials
Language:: Swedish
Description:: SLäNDa, the Swedish literature corpus of narrative and dialogue, is a corpus made up of eight Swedish literary novels from the 19th and early 20th centuries, manually annotated mainly for different aspects of dialogue. The full annotation also contains other cited materials, such as thoughts, signs and letters. The main motivation for including these categories as well is to be able to identify the main narrative, which is all remaining unannotated text. SLäNDa version 2.0 extends version 1.0 mainly by adding more data, but also through additional quality control and a slight modification of the annotation scheme. In addition, the data is organized into test sets with different types of speech marking: quotation marks, dashes, and no marking.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

45. Slavic Forest, Norwegian Wood (scripts)

Creator:: Rosa, Rudolf, Zeman, Daniel, Mareček, David, and Žabokrtský, Zdeněk
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: suiteOfTools and toolService
Subject:: parsing, dependency parser, universal dependencies, and cross-lingual parsing
Language:: Czech, Slovak, Slovenian, Croatian, Danish, Swedish, and Norwegian
Description:: Tools and scripts used to create the cross-lingual parsing models submitted to VarDial 2017 shared task (https://bitbucket.org/hy-crossNLP/vardial2017), as described in the linked paper. The trained UDPipe models themselves are published in a separate submission (https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-1971).
For each source (SS, e.g. sl) and target (TT, e.g. hr) language, you need to add the following into this directory:
- treebanks (Universal Dependencies v1.4): SS-ud-train.conllu, TT-ud-predPoS-dev.conllu
- parallel data (OpenSubtitles from Opus): OpenSubtitles2016.SS-TT.SS, OpenSubtitles2016.SS-TT.TT
  !!! If they are originally called ...TT-SS... instead of ...SS-TT..., you need to symlink them (or move, or copy) !!!
- target tagging model: TT.tagger.udpipe
All of these can be obtained from https://bitbucket.org/hy-crossNLP/vardial2017
You also need to have:
- Bash
- Perl 5
- Python 3
- word2vec (https://code.google.com/archive/p/word2vec/); we used rev 41 from 15th Sep 2014
- udpipe (https://github.com/ufal/udpipe); we used commit 3e65d69 from 3rd Jan 2017
- Treex (https://github.com/ufal/treex); we used commit d27ee8a from 21st Dec 2016
The most basic setup is the sl-hr one (train_sl-hr.sh):
- normalization of deprels
- 1:1 word-alignment of parallel data with Monolingual Greedy Aligner
- simple word-by-word translation of source treebank
- pre-training of target word embeddings
- simplification of morpho feats (use only Case)
- and finally, training and evaluating the parser
Both da+sv-no (train_ds-no.sh) and cs-sk (train_cs-sk.sh) add some cross-tagging, which seems to be useful only in specific cases (see paper for details). Moreover, cs-sk also adds more morpho features, selecting those that seem to be very often shared in parallel data.
The whole pipeline takes tens of hours to run, and uses several GB of RAM, so make sure to use a powerful computer.
Rights:: GNU General Public License 2 or later (GPL-2.0), http://opensource.org/licenses/GPL-2.0, and PUB
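The input-file requirements in the description above can be checked before starting the long-running pipeline. The following is a minimal sketch in Python 3, not part of the published scripts: the function names `required_inputs`, `link_reversed_parallel_data` and `missing_inputs` are hypothetical, and the only details taken from the description are the file-name patterns and the advice to symlink parallel data that is named ...TT-SS... instead of ...SS-TT....

```python
from pathlib import Path


def required_inputs(ss: str, tt: str) -> list[str]:
    """File names the description asks for, for source SS and target TT."""
    return [
        f"{ss}-ud-train.conllu",             # source treebank (UD v1.4)
        f"{tt}-ud-predPoS-dev.conllu",       # target dev treebank
        f"OpenSubtitles2016.{ss}-{tt}.{ss}",  # parallel data, source side
        f"OpenSubtitles2016.{ss}-{tt}.{tt}",  # parallel data, target side
        f"{tt}.tagger.udpipe",               # target tagging model
    ]


def link_reversed_parallel_data(workdir: Path, ss: str, tt: str) -> list[str]:
    """Symlink ...TT-SS... parallel files to the expected ...SS-TT... names.

    Hypothetical helper: only creates a link when the expected name is
    absent and the reversed name is present. Returns the names it created.
    """
    created = []
    for lang in (ss, tt):
        expected = workdir / f"OpenSubtitles2016.{ss}-{tt}.{lang}"
        reversed_name = workdir / f"OpenSubtitles2016.{tt}-{ss}.{lang}"
        if not expected.exists() and reversed_name.exists():
            # Relative link, so the directory can be moved as a whole.
            expected.symlink_to(reversed_name.name)
            created.append(expected.name)
    return created


def missing_inputs(workdir: Path, ss: str, tt: str) -> list[str]:
    """Names from required_inputs() that are not present in workdir."""
    return [n for n in required_inputs(ss, tt) if not (workdir / n).exists()]
```

For the basic sl-hr setup, one might call `link_reversed_parallel_data(Path("."), "sl", "hr")` followed by `missing_inputs(Path("."), "sl", "hr")` and only launch train_sl-hr.sh once the second call returns an empty list.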

46. Språkbanken (Swedish Language Bank)

Type:: corpus
Language:: Faroese, Icelandic, Spanish, and Swedish
Description:: Mainly written Swedish corpora (all time periods except Runic Swedish; various genres, including learner corpora) and lexicons; some non-Swedish corpora (Faroese, Old Icelandic, Latin, Spanish); Swedish corpora (appr. 200 MW); Swedish lexicons (appr. 220,000 entries total); non-Swedish corpora (appr. 15 MW)
Rights:: Not specified

47. SVANTE (SVenska ANdraspråksTExter)

Type:: corpus
Language:: Swedish
Description:: Interlanguage/Learner corpus (essays written by SL Swedish learners with many native languages); appr. 200 kW; POS tags; base forms of words (in TEI/XCES XML format)
Rights:: Not specified

48. Svenska ord/Lexin

Type:: lexicalConceptualResource
Language:: Swedish
Description:: appr. 20,000 entries, XML
Rights:: Not specified

49. Swedish NE annotator

Type:: languageDescription
Language:: Swedish
Description:: Swedish Named Entity annotator
Rights:: Not specified

50. Syntag

Type:: corpus
Language:: Swedish
Description:: appr. 100 kW, functional/dependency (one token per line plus its POS and syntactic annotation[s])
Rights:: Not specified
