Language: Portuguese - LINDAT/CLARIAH-CZ Catalog Search Results

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: lexicalConceptualResource
Language:: Catalan, English, French, Galician, Italian, Portuguese, and Spanish
Description:: A vocabulary, produced through cooperation among the groups of the REALITER network, that collects the basic terminology most commonly used in texts about genomics. It contains equivalents in English, Peninsular and Latin American Spanish, French, Italian, Galician, Portuguese and Catalan.
Rights:: Not specified

32. Bedřich Katzer 05. 06. 1861, Rokycany (República Tcheca) - 03. 02. 1925, Sarajevo (Bósnia e Herzegovina). Geólogo e viajante /

Creator:: Martínek, Jiří
Type:: text and study
Subject:: Earth sciences. Geological sciences, Katzer, Bedřich, geologists, travellers, Czech-Brazilian relations, Czech lands 1848-1918, sciences of inanimate nature, natural environment, astronomy, Brazil, Bosnia and Herzegovina, world history 1789-1918, and Habsburg monarchy
Language:: Portuguese
Rights:: unknown

33. Bohemio-alemanes en Chile. Entre el olvido y la asimilación /

Creator:: Witker, Ivan
Subject:: German emigration, Bohemian Germans, Chilean Germans, world history from 1918 to the present, Chile, migration, emigration, colonization, and Czech lands 1848-1918
Language:: Portuguese
Rights:: unknown

34. Brasileiros ilegais em Portugal: uma reflexão sobre as fronteiras nacionais /

Creator:: Oliveira, Sergio P.
Subject:: immigration, Brazilian-Portuguese relations, state borders, world history from 1918 to the present, Brazil, Portugal, and migration, emigration, colonization
Language:: Portuguese
Rights:: unknown

35. C4Corpus (CC BY-NC part)

Creator:: Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
Publisher:: Technische Universität Darmstadt
Type:: text and corpus
Subject:: CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
Language:: Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Panjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Vietnamese, and Chinese
Description:: A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Rights:: Creative Commons - Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), http://creativecommons.org/licenses/by-nc/4.0/, and PUB

36. C4Corpus (CC BY-NC-ND part)

Creator:: Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
Publisher:: Technische Universität Darmstadt
Type:: text and corpus
Subject:: CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
Language:: Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
Description:: A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Rights:: Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), http://creativecommons.org/licenses/by-nc-nd/4.0/, and PUB

37. C4Corpus (CC BY-NC-SA part)

Creator:: Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
Publisher:: Technische Universität Darmstadt
Type:: text and corpus
Subject:: CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
Language:: Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
Description:: A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Rights:: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0), http://creativecommons.org/licenses/by-nc-sa/4.0/, and PUB

38. C4Corpus (CC BY-ND part)

Creator:: Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
Publisher:: Technische Universität Darmstadt
Type:: text and corpus
Subject:: CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
Language:: Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malayalam, Macedonian, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Vietnamese, and Chinese
Description:: A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Rights:: Creative Commons - Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0), http://creativecommons.org/licenses/by-nd/4.0/, and PUB

39. C4Corpus (CC BY-SA part)

Creator:: Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
Publisher:: Technische Universität Darmstadt
Type:: text and corpus
Subject:: CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
Language:: Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Panjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
Description:: A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

40. C4Corpus (CC-BY part)

Creator:: Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
Publisher:: Technische Universität Darmstadt
Type:: text and corpus
Subject:: CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
Language:: Afrikaans, Arabic, Bengali, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Gujarati, Hebrew, Hindi, Croatian, Hungarian, Indonesian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Marathi, Macedonian, Nepali (macrolanguage), Dutch, Norwegian, Panjabi, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Somali, Spanish, Albanian, Swahili (macrolanguage), Swedish, Tamil, Telugu, Tagalog, Thai, Turkish, Ukrainian, Undetermined, Urdu, Vietnamese, and Chinese
Description:: A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

41. C4Corpus (publicdomain part)

Creator:: Gurevych, Iryna, Habernal, Ivan, and Zayed, Omnia
Publisher:: Technische Universität Darmstadt
Type:: text and corpus
Subject:: CommonCrawl, Creative Commons, Web corpus, and Amazon Web Services
Language:: Afrikaans, Arabic, Bulgarian, Czech, Danish, German, Modern Greek (1453-), English, Estonian, Persian, Finnish, French, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Dutch, Norwegian, Polish, Portuguese, Russian, Slovenian, Somali, Spanish, Swahili (macrolanguage), Swedish, Tagalog, Thai, Turkish, Ukrainian, Undetermined, and Vietnamese
Description:: A large web corpus (over 10 billion tokens) in 50+ languages, licensed under the Creative Commons license family and extracted from CommonCrawl, the largest publicly available general Web crawl to date, with about 2 billion crawled URLs.
Rights:: Public Domain Mark (PD), http://creativecommons.org/publicdomain/mark/1.0/, and PUB

42. Cartas: Recordaçoes e testemunhos do vivenciado /

Creator:: Piccolo, Helga Iracema Landgraf
Subject:: German emigration, German-Brazilian relations, correspondence, world history 1789-1918, Brazil, Germany, and migration, emigration, colonization
Language:: Portuguese
Rights:: unknown

43. Čestmír Loukotka 12. 11. 1895, Chrášťany (República Tcheca) - 13. 04. 1966, Praga (República Tcheca). Linguista e antropólogo /

Creator:: Křížová, Markéta
Type:: text and study
Subject:: History of civilization. Cultural history, Loukotka, Čestmír, linguists, anthropologists, Czechoslovakia 1918-1992, and history of science, art, culture and technology, cultural relations
Language:: Portuguese
Rights:: unknown

44. Cien años de rearme - la emigración checo-austro-húngara a Guatemala entre 1880 y 1980 /

Creator:: Dietrich, Wolfgang
Subject:: Czech emigration, economic emigration, Guatemalan Czechs, Guatemala, migration, emigration, colonization, Czech lands 1848-1918, and Czechoslovakia 1918-1992
Language:: Portuguese
Rights:: unknown

45. Com uma imensa alegria :

Creator:: Jorge, Joaquim Pires
Type:: text and autobiography
Subject:: Politics, Jorge, Joaquim Pires, Portuguese communists, anti-fascism, Portugal, world history from 1918 to the present, resistance, opposition, anti-fascism, anti-communism, and political history, politicians
Language:: Portuguese
Rights:: unknown

46. Comenius :

Creator:: Covello, Sergio Carlos
Type:: text and study
Subject:: Organization of teaching and education, Komenský, Jan Amos, pedagogical thought, Czech lands 1526-1792, and education system, pedagogy, teachers, youth care
Language:: Portuguese
Rights:: unknown

47. Comenius :

Creator:: Kulesza, Wojciech Andrzej
Type:: text and study
Subject:: Organization of teaching and education, Komenský, Jan Amos, pedagogical thought, pedagogy, Czech lands 1526-1792, and education system, pedagogy, teachers, youth care
Language:: Portuguese
Rights:: unknown

48. Comenius no Brasil :

Creator:: Araújo Sampaio, Bohumila de
Type:: text and study
Subject:: Organization of teaching and education, Komenský, Jan Amos, theologians, philosophers, Czech-Brazilian relations, world history 1789-1918, world history from 1918 to the present, Brazil, Czech lands 1526-1792, and history of science, art, culture and technology, cultural relations
Language:: Portuguese
Rights:: unknown

49. Comenius, o fundador de Pedagogia Moderna e o seu legado para a humanidade /

Creator:: Pánek, Jaroslav
Type:: text and study
Subject:: Upbringing and education, Komenský, Jan Amos, pedagogical thought, Czech philosophers, Czech lands 1526-1792, and education system, pedagogy, teachers, youth care
Language:: Portuguese
Rights:: unknown

50. COMPARA : Portuguese - English parallel translation corpus

Type:: corpus
Language:: English and Portuguese
Description:: A bi-directional parallel corpus based on an open-ended collection of Portuguese-English and English-Portuguese source texts and translations. Searchable via the IMS Corpus Query Processor and the DISPARA interface.
Rights:: Not specified

51. CoNLL 2017 and 2018 Shared Task Blind and Preprocessed Test Data

Creator:: Zeman, Daniel and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: tokenization, word segmentation, morphology, tagging, syntax, parsing, and universal dependencies
Language:: Afrikaans, Arabic, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Persian, Finnish, French, Old French (842-ca. 1400), Irish, Galician, Gothic, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Latin, Latvian, Dutch, Norwegian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Thai, Turkish, Uighur, Ukrainian, Urdu, Vietnamese, and Chinese
Description:: CoNLL 2017 and 2018 shared tasks: Multilingual Parsing from Raw Text to Universal Dependencies. This package contains the test data in the form in which they were presented to the participating systems: raw text files and files preprocessed by UDPipe. The metadata.json files contain lists of files to process and to output; README files in the respective folders describe the syntax of metadata.json. For full training, development and gold standard test data, see Universal Dependencies 2.0 (CoNLL 2017) and Universal Dependencies 2.2 (CoNLL 2018); the download links are at http://universaldependencies.org/. For more information on the shared tasks, see http://universaldependencies.org/conll17/ and http://universaldependencies.org/conll18/. Contents: conll17-ud-test-2017-05-09 (CoNLL 2017 test data); conll18-ud-test-2018-05-06 (CoNLL 2018 test data); conll18-ud-test-2018-05-06-for-conll17 (CoNLL 2018 test data with metadata and filenames modified so that it is digestible by the 2017 systems).
Rights:: Licence Universal Dependencies v2.2, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.2, and PUB

52. CoNLL 2017 Shared Task System Outputs

Creator:: Zeman, Daniel, Potthast, Martin, Straka, Milan, Popel, Martin, Dozat, Timothy, Qi, Peng, Manning, Christopher, Shi, Tianze, Wu, Felix G., Chen, Xilun, Cheng, Yao, Björkelund, Anders, Falenska, Agnieszka, Yu, Xiang, Kuhn, Jonas, Che, Wanxiang, Guo, Jiang, Wang, Yuxuan, Zheng, Bo, Zhao, Huaipeng, Liu, Yang, Teng, Dechuan, Liu, Ting, Lim, Kyungtae, Poibeau, Thierry, Sato, Motoki, Manabe, Hitoshi, Noji, Hiroshi, Matsumoto, Yuji, Kırnap, Ömer, Önder, Berkay Furkan, Yuret, Deniz, Straková, Jana, Vania, Clara, Zhang, Xingxing, Lopez, Adam, Heinecke, Johannes, Asadullah, Munshi, Kanerva, Jenna, Luotolahti, Juhani, Ginter, Filip, Kuan, Yu, Sofroniev, Pavel, Schill, Erik, Hinrichs, Erhard, Nguyen, Dat Quoc, Dras, Mark, Johnson, Mark, Qian, Xian, Vilares, David, Gómez-Rodríguez, Carlos, Aufrant, Lauriane, Wisniewski, Guillaume, Yvon, François, Dumitrescu, Stefan Daniel, Boroş, Tiberiu, Tufiş, Dan, Das, Ayan, Zaffar, Affan, Sarkar, Sudeshna, Wang, Hao, Zhao, Hai, Zhang, Zhisong, Hornby, Ryan, Taylor, Clark, Park, Jungyeul, de Lhoneux, Miryam, Shao, Yan, Basirat, Ali, Kiperwasser, Eliyahu, Stymne, Sara, Goldberg, Yoav, Nivre, Joakim, Akkuş, Burak Kerim, Azizoglu, Heval, Cakici, Ruket, Moor, Christophe, Merlo, Paola, Henderson, James, Wang, Haozhou, Ji, Tao, Wu, Yuanbin, Lan, Man, de la Clergerie, Eric, Sagot, Benoît, Seddah, Djamé, More, Amir, Tsarfaty, Reut, Kanayama, Hiroshi, Muraoka, Masayasu, Yoshikawa, Katsumasa, Garcia, Marcos, and Gamallo, Pablo
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: dependency parser and parsebank
Language:: Arabic, Bulgarian, Russia Buriat, Czech, Catalan, Church Slavic, Danish, German, Modern Greek (1453-), English, Spanish, Estonian, Basque, Persian, Finnish, French, Irish, Galician, Gothic, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Latin, Latvian, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Northern Sami, Swedish, Turkish, Uighur, Ukrainian, Urdu, Vietnamese, and Chinese
Description:: This package contains the system outputs from the CoNLL 2017 Shared Task in Multilingual Parsing from Raw Text to Universal Dependencies.
Rights:: Licence Universal Dependencies v2.0, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.0, and PUB

53. CoNLL 2018 Shared Task System Outputs

Creator:: Zeman, Daniel, Potthast, Martin, Duthoo, Elie, Mesnard, Olivier, Rybak, Piotr, Wróblewska, Alina, Che, Wanxiang, Liu, Yijia, Wang, Yuxuan, Zheng, Bo, Liu, Ting, Li, Zuchao, He, Shexia, Zhang, Zhuosheng, Zhao, Hai, Wu, Yingting, Tong, Jia-Jun, Nguyen, Dat Quoc, Verspoor, Karin, Wan, Hui, Naseem, Tahira, Lee, Young-Suk, Castelli, Vittorio, Ballesteros, Miguel, Hershcovich, Daniel, Abend, Omri, Rappoport, Ari, Smith, Aaron, Bohnet, Bernd, de Lhoneux, Miryam, Nivre, Joakim, Shao, Yan, Stymne, Sara, Kırnap, Ömer, Dayanık, Erenay, Yuret, Deniz, Kanerva, Jenna, Ginter, Filip, Miekka, Niko, Leino, Akseli, Salakoski, Tapio, Lim, KyungTae, Park, Cheoneum, Lee, Changki, Poibeau, Thierry, Bhat, Riyaz Ahmad, Bhat, Irshad, Bangalore, Srinivas, Qi, Peng, Dozat, Timothy, Zhang, Yuhao, Manning, Christopher, Boroș, Tiberiu, Dumitrescu, Stefan Daniel, Burtica, Ruxandra, Arakelyan, Gor, Hambardzumyan, Karen, Khachatrian, Hrant, Rosa, Rudolf, Mareček, David, Straka, Milan, Seker, Amit, More, Amir, Tsarfaty, Reut, Önder, Berkay Furkan, Gümeli, Can, Jawahar, Ganesh, Muller, Benjamin, Fethi, Amal, Martin, Louis, Villemonte de la Clergerie, Eric, Sagot, Benoît, Seddah, Djamé, Özateş, Şaziye Betül, Özgür, Arzucan, Gungor, Tunga, Öztürk, Balkız, Ji, Tao, Liu, Yufang, Wang, Yijun, Wu, Yuanbin, Lan, Man, Chen, Danlu, Lin, Mengxiao, Hu, Zhifeng, and Qiu, Xipeng
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: parsed data, conllu, and universal dependencies
Language:: Afrikaans, Arabic, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Persian, Finnish, French, Old French (842-ca. 1400), Irish, Galician, Gothic, Ancient Greek (to 1453), Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Latin, Latvian, Dutch, Norwegian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Thai, Turkish, Uighur, Ukrainian, Urdu, Vietnamese, and Chinese
Description:: Test data parsed by systems submitted to the CoNLL 2018 UD parsing shared task.
Rights:: Licence Universal Dependencies v2.2, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.2, and PUB
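
The subject keywords of this record mention "conllu", so the parsed outputs are presumably CoNLL-U files. As a rough illustration of how such a file could be read, the following Python sketch assumes the standard ten-column, tab-separated CoNLL-U layout. The function names, the dictionary keys and the two-word sample sentence are invented for illustration and are not taken from the package.

```python
from __future__ import annotations

from typing import Iterator

# Column names of the CoNLL-U format, in order (keys are lower-cased here by choice).
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text: str) -> Iterator[list[dict[str, str]]]:
    """Yield each sentence as a list of token dicts; comment lines are skipped."""
    sentence: list[dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." are ignored in this sketch.
            continue
        cols = line.split("\t")
        if len(cols) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(cols)}: {line!r}")
        sentence.append(dict(zip(FIELDS, cols)))
    if sentence:
        yield sentence


def syntactic_words(sentence: list[dict[str, str]]) -> list[dict[str, str]]:
    """Keep plain integer IDs only, dropping multiword ranges (1-2) and empty nodes (1.1)."""
    return [tok for tok in sentence if tok["id"].isdigit()]


# Invented sample, not taken from the shared task data.
SAMPLE = (
    "# text = Eles leram\n"
    "1\tEles\tele\tPRON\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tleram\tler\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE))
for tok in syntactic_words(sentences[0]):
    print(tok["form"], tok["upos"], tok["head"], tok["deprel"])
```

Reading the files this way keeps every column as a string; converting "head" to an integer or splitting "feats" is left to the caller.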

54. CORP-ORAL Spontaneous Speech Corpus

Publisher:: Instituto de Linguística Teórica e Computacional
Type:: corpus
Language:: Portuguese
Description:: The aim of the CORP-ORAL project is to build a corpus of spontaneous European Portuguese speech available for the training of speech synthesis and recognition systems as well as for phonetic, phonological, lexical, morphological and syntactic studies. The corpus contains 60 hours of recorded conversations, each between two European Portuguese speakers. The entire corpus will be completed with orthographic transcription and the prosodic marking of speech breaks/boundaries as well as phonetic transcription of a selection of chunks. CORP-ORAL is built from scratch with the explicit goal of becoming entirely available on the internet to the scientific community and the public in general.
Rights:: Not specified

55. Corpus CLUVI

Publisher:: TALG Research Group (University of Vigo)
Type:: corpus
Language:: Basque, Catalan, English, French, Galician, German, Portuguese, and Spanish
Description:: Parallel corpus, 22 million words
Rights:: Not specified

56. CorpusExplorer

Creator:: Rüdiger, Jan Oliver
Publisher:: Jan Oliver Rüdiger
Type:: tool and toolService
Subject:: Corpus Linguistics, NLP, conll, tei, XML, nlp, Natural Language Processing, linguistics, Linguistics, Computational Linguistics, corpus processing, tagger, POS tagger, lemmatization, text cleaning, CommonCrawl, epub, JSON, Twitter, Pandoc, Wikipedia, digital data, DTA, DSpin, MySQL, ElasticSearch, TextGrid, text corpora, TigerXML, and WeblichtXML
Language:: German, English, French, Italian, Dutch, Spanish, Polish, Arabic, Chinese, and Portuguese
Description:: Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging are completely automated. The simple interface supports the use in university teaching and leads users/students to fast and substantial results. The CorpusExplorer is open for many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK). Source code available at https://github.com/notesjor/corpusexplorer2.0
Rights:: Not specified

57. Crise e queda dos governos PS.

Creator:: Cunhal, Álvaro
Type:: text and monograph
Subject:: Political parties and movements, History of states and territories on the Iberian Peninsula, Cunhal, Álvaro, economic history, social history, Portugal, political history, politicians, world history from 1945 to the present, and economic history
Language:: Portuguese
Rights:: unknown

58. DaMuEL 1.0: A Large Multilingual Dataset for Entity Linking

Creator:: Kubeša, David and Straka, Milan
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: entity linking, NEL, NER, dataset, and knowledge base
Language:: Afrikaans, Arabic, Armenian, Basque, Belarusian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Hebrew, Hindi, Hungarian, Indonesian, Irish, Italian, Japanese, Korean, Latin, Latvian, Lithuanian, Maltese, Marathi, Modern Greek (1453-), Northern Sami, Norwegian Nynorsk, Persian, Polish, Portuguese, Romanian, Russian, Scottish Gaelic, Serbian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, Uighur, Ukrainian, Urdu, Vietnamese, and Wolof
Description:: We present DaMuEL, a large Multilingual Dataset for Entity Linking containing data in 53 languages. DaMuEL consists of two components: a knowledge base that contains language-agnostic information about entities, including their claims from Wikidata and named entity types (PER, ORG, LOC, EVENT, BRAND, WORK_OF_ART, MANUFACTURED); and Wikipedia texts with entity mentions linked to the knowledge base, along with language-specific text from Wikidata such as labels, aliases, and descriptions, stored separately for each language. The Wikidata QID is used as a persistent, language-agnostic identifier, enabling the combination of the knowledge base with language-specific texts and information for each entity. Wikipedia documents deliberately annotate only a single mention for every entity present; we further automatically detect all mentions of named entities linked from each document. The dataset contains 27.9M named entities in the knowledge base and 12.3G tokens from Wikipedia texts. The dataset is published under the CC BY-SA licence.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB
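
The description above says that the Wikidata QID links the language-agnostic knowledge base to the per-language texts. A minimal Python sketch of that kind of join follows. It assumes both parts have already been loaded into dictionaries keyed by QID; the dictionary layout, the field names and the sample values are hypothetical and do not reflect the actual DaMuEL file format.

```python
from __future__ import annotations

from typing import Any, Optional

# Hypothetical in-memory stand-ins for the two components (illustrative values only).
knowledge_base: dict[str, dict[str, Any]] = {
    "Q42": {"type": "PER"},
}
portuguese_texts: dict[str, dict[str, Any]] = {
    "Q42": {"label": "Douglas Adams", "aliases": ["Douglas Noel Adams"]},
}


def merge_entity(
    qid: str,
    kb: dict[str, dict[str, Any]],
    language_texts: dict[str, dict[str, Any]],
) -> Optional[dict[str, Any]]:
    """Combine language-agnostic and language-specific data for one QID."""
    entity = kb.get(qid)
    if entity is None:
        # The QID is not in the knowledge base, so there is nothing to join.
        return None
    merged: dict[str, Any] = {"qid": qid, **entity}
    # Language-specific fields are optional; an entity may lack text in a given language.
    merged.update(language_texts.get(qid, {}))
    return merged


merged = merge_entity("Q42", knowledge_base, portuguese_texts)
print(merged)
```

Because the QID is the only shared key, the same knowledge-base entry can be combined with the texts of any other language by passing a different per-language dictionary.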

59. De Hislampa a Porthum. De la delimitación del corpus del humanismo en Portugal /

Creator:: Sánchez Tarrio, Ana María
Type:: study
Subject:: Philosophy, Portuguese history, humanism, historiography, Portugal, world history 1492-1648, world history of the Middle Ages (to 1492), history of ideas, ideology, and historiography, research projects
Language:: Portuguese
Rights:: unknown

60. Deep Universal Dependencies 2.4

Creator:: Zeman, Daniel and Droganova, Kira
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: semantic dependency and universal dependencies
Language:: Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, and Galician
Description:: Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-2988). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
Rights:: Licence Universal Dependencies v2.4, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.4, and PUB

61. Deep Universal Dependencies 2.5

Creator:: Zeman, Daniel and Droganova, Kira
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: semantic dependency and universal dependencies
Language:: Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, and Skolt Sami
Description:: Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3105). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
Rights:: Licence Universal Dependencies v2.5, https://lindat.mff.cuni.cz/repository/xmlui/page/licence-UD-2.5, and PUB

62. Deep Universal Dependencies 2.6

Creator:: Zeman, Daniel and Droganova, Kira
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: semantic dependency and universal dependencies
Language:: Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, Skolt Sami, Icelandic, Albanian, and Persian
Description:: Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3226). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
Rights:: Licence Universal Dependencies v2.6, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.6, and PUB

63. Deep Universal Dependencies 2.7

Creator:: Zeman, Daniel and Droganova, Kira
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: semantic dependency and universal dependencies
Language:: Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, Skolt Sami, Icelandic, Albanian, Persian, Akuntsu, Apurinã, Khunsari, Manx, Mundurukú, Nayini, Soi, South Levantine Arabic, and Tupinambá
Description:: Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3424). It contains additional deep-syntactic and semantic annotations. The version of Deep UD corresponds to the version of UD it is based on. Note, however, that some UD treebanks have been omitted from Deep UD.
Rights:: Licence Universal Dependencies v2.7, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.7, and PUB

64. Deep Universal Dependencies 2.8

Creator:: Zeman, Daniel and Droganova, Kira
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: semantic dependency and universal dependencies
Language:: Afrikaans, Assyrian Neo-Aramaic, Akkadian, Amharic, Arabic, Belarusian, Breton, Bulgarian, Russia Buriat, Catalan, Czech, Church Slavic, Mandarin Chinese, Coptic, Welsh, Danish, German, Modern Greek (1453-), English, Estonian, Basque, Faroese, Finnish, French, Irish, Gothic, Ancient Greek (to 1453), Mbyá Guaraní, Hebrew, Hindi, Croatian, Upper Sorbian, Hungarian, Armenian, Indonesian, Italian, Japanese, Kazakh, Northern Kurdish, Korean, Komi-Zyrian, Karelian, Latin, Latvian, Lithuanian, Literary Chinese, Marathi, Erzya, Dutch, Norwegian, Old Russian, Nigerian Pidgin, Polish, Portuguese, Romanian, Russian, Sanskrit, Slovak, Slovenian, Northern Sami, Spanish, Serbian, Swedish, Tamil, Tagalog, Turkish, Ukrainian, Urdu, Vietnamese, Warlpiri, Wolof, Yoruba, Galician, Bhojpuri, Komi-Permyak, Livvi, Moksha, Scottish Gaelic, Skolt Sami, Icelandic, Albanian, Persian, Akuntsu, Apurinã, Khunsari, Manx, Mundurukú, Nayini, Soi, South Levantine Arabic, Tupinambá, Beja, Western Frisian, Urubú-Kaapor, Kangri, K'iche', Low German, Makuráp, Western Armenian, and Central Siberian Yupik
Description:: Deep Universal Dependencies is a collection of treebanks derived semi-automatically from Universal Dependencies (http://hdl.handle.net/11234/1-3687). It contains additional deep-syntactic and semantic annotations. Version of Deep UD corresponds to the version of UD it is based on. Note however that some UD treebanks have been omitted from Deep UD.
Rights:: Licence Universal Dependencies v2.8, https://lindat.mff.cuni.cz/repository/xmlui/page/license-ud-2.8, and PUB

65. Deltacorpus

Creator:: Mareček, David, Yu, Zhiwei, Zeman, Daniel, and Žabokrtský, Zdeněk
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: part of speech, tagging, semi-supervised, and cross-language
Language:: Belarusian, Bosnian, Bulgarian, Czech, Serbo-Croatian, Croatian, Upper Sorbian, Macedonian, Polish, Russian, Slovak, Slovenian, Serbian, Ukrainian, Latvian, Lithuanian, Afrikaans, Danish, German, English, Faroese, Western Frisian, Swiss German, Icelandic, Limburgan, Luxembourgish, Low German, Dutch, Norwegian Nynorsk, Norwegian, Scots, Swedish, Yiddish, Aragonese, Asturian, Catalan, French, Galician, Haitian, Italian, Latin, Lombard, Neapolitan, Piemontese, Portuguese, Romanian, Spanish, Venetian, Walloon, Breton, Welsh, Scottish Gaelic, Irish, Modern Greek (1453-), Armenian, Albanian, Dimli (individual language), Persian, Gilaki, Kurdish, Tajik, Bengali, Bishnupriya, Gujarati, Fiji Hindi, Hindi, Marathi, Nepali (macrolanguage), Urdu, Amharic, Arabic, Egyptian Arabic, Hebrew, Estonian, Finnish, Hungarian, Basque, Georgian, Chuvash, Azerbaijani, Turkish, Uzbek, Kazakh, Tatar, Yakut, Korean, Mongolian, Telugu, Kannada, Malayalam, Tamil, Newari, Vietnamese, Indonesian, Javanese, Malagasy, Maori, Malay (macrolanguage), Pampanga, Sundanese, Tagalog, Waray (Philippines), Swahili (macrolanguage), Esperanto, Ido, Interlingua (International Auxiliary Language Association), and Volapük
Description:: Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia).
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

66. Deltacorpus 1.1

Creator:: Mareček, David, Yu, Zhiwei, Zeman, Daniel, and Žabokrtský, Zdeněk
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: text and corpus
Subject:: part of speech, tagging, semi-supervised, and cross-language
Language:: Belarusian, Bosnian, Bulgarian, Czech, Serbo-Croatian, Croatian, Upper Sorbian, Macedonian, Polish, Russian, Slovak, Slovenian, Serbian, Ukrainian, Latvian, Lithuanian, Afrikaans, Danish, German, English, Faroese, Western Frisian, Swiss German, Icelandic, Limburgan, Luxembourgish, Low German, Dutch, Norwegian Nynorsk, Norwegian, Scots, Swedish, Yiddish, Aragonese, Asturian, Catalan, French, Galician, Haitian, Italian, Latin, Lombard, Neapolitan, Piemontese, Portuguese, Romanian, Spanish, Venetian, Walloon, Breton, Welsh, Scottish Gaelic, Irish, Modern Greek (1453-), Armenian, Albanian, Dimli (individual language), Persian, Gilaki, Kurdish, Tajik, Bengali, Bishnupriya, Gujarati, Fiji Hindi, Hindi, Marathi, Nepali (macrolanguage), Urdu, Amharic, Arabic, Egyptian Arabic, Hebrew, Estonian, Finnish, Hungarian, Basque, Georgian, Chuvash, Azerbaijani, Turkish, Uzbek, Kazakh, Tatar, Yakut, Korean, Mongolian, Telugu, Kannada, Malayalam, Tamil, Newari, Vietnamese, Indonesian, Javanese, Malagasy, Maori, Malay (macrolanguage), Pampanga, Sundanese, Tagalog, Waray (Philippines), Swahili (macrolanguage), Esperanto, Ido, Interlingua (International Auxiliary Language Association), and Volapük
Description:: Texts in 107 languages from the W2C corpus (http://hdl.handle.net/11858/00-097C-0000-0022-6133-9), first 1,000,000 tokens per language, tagged by the delexicalized tagger described in Yu et al. (2016, LREC, Portorož, Slovenia). Changes in version 1.1: 1. Universal Dependencies tagset instead of the older and smaller Google Universal POS tagset. 2. SVM classifier trained on Universal Dependencies 1.2 instead of HamleDT 2.0. 3. Balto-Slavic languages, Germanic languages and Romance languages were tagged by a classifier trained only on the respective group of languages. Other languages were tagged by a classifier trained on all available languages. The "c7" combination from version 1.0 is no longer used.
Rights:: Creative Commons - Attribution-ShareAlike 4.0 International (CC BY-SA 4.0), http://creativecommons.org/licenses/by-sa/4.0/, and PUB

67. Der Subjektivismus der Husserlschen und die Forderung einer asubjektiven Phänomenologie

Creator:: Jan Patočka
Publisher:: Sborník prací filosofické fakulty brněnské university 19–20 (1971), Řada uměnovědná (F), č. 14–15, str. 11–26. Stať. něm.
Type:: Text
Subject:: 1970/6, 1971, 1988/30, 1991/2, 2004/10, 2009/1, cs, es, fr, pt, SS-7/Fen-II, and Stať. něm.
Language:: Czech, French, Portuguese, and Spanish
Rights:: open access and Rights holder: Archiv Jana Patočky, z.s.

68. Der Subjektivismus der Husserlschen und die Möglichkeit einer asubjektiven Phänomenologie

Creator:: Jan Patočka
Publisher:: Philosophische Perspektiven, ein Jahrbuch, sv. 2, ed. R. Berlinger a E. Fink, Frankfurt/M. (v. Klostermann) 1970, str. 317–334. Stať. něm.
Type:: Text
Subject:: 1970, 1988/30, 1991/2, 2004/10, 2009/1, cn, cs, de, es, fr, hu, pt, SS-7/Fen-II, and Stať. něm.
Language:: German, Czech, French, Hungarian, Portuguese, and Spanish
Rights:: open access and Rights holder: Archiv Jana Patočky, z.s.

69. Do 25 de Novembro às Eleições para a Assembleia da República /

Creator:: Cunhal, Álvaro,
Type:: text and projevy
Subject:: Dějiny států a území na Pyrenejském poloostrově, Cunhal, Álvaro,, politici portugalští, projevy politické, strany politické, strany politické komunistické, and Portugalsko
Language:: Portuguese
Rights:: unknown

70. Do Moldava ao Douro :

Creator:: Burmester, Elisabeth,
Type:: text and paměti
Subject:: Dějiny Česka a Slovenska, Ringhofferové (rod), podnikatelé průmysloví, rody a rodiny, Československo 1918-1992, šlechta, buržoazie, měšťanstvo, podnikatelé, and české země 1792-1918
Language:: Portuguese
Rights:: unknown

71. DOESTE v0.5

Creator:: Martins, Mário, Janssen, Maarten, Santos, Taiza, Lopes, Raquel, and Souza, Thiago
Publisher:: Federal Rural University of the Semiarid Region
Type:: text and corpus
Subject:: Developmental corpus, Writing development, and School-age language development
Language:: Portuguese
Description:: DOESTE v0.5 is a set of developmental corpora of texts written by Brazilian and Portuguese school-age children and adolescents. It is a work in progress. The texts written by monolingual children and adolescents in European Portuguese were collected between September 2011 and January 2012, from different public schools in Lisbon (Portugal). It is composed of 244 narrative (n=122) and argumentative (n=122) texts. The subjects (51% female and 49% male) are students enrolled in the 5th grade (n=52; mean age=10.19), in the 7th grade (n=92; mean age=12.33) and in the 10th grade (n=100; mean age=15.16) of Portuguese basic schooling. The subcorpus of Portuguese texts is fully tokenized and morphologically annotated, in addition to presenting the sentence occurrences. The texts written by monolingual children and adolescents in Brazilian Portuguese have been collected since 2017, from different public schools in three cities in Rio Grande do Norte (Brazil). It is currently composed of narrative (n=225) and argumentative (n=225) texts. The subjects (53% female and 47% male) are students enrolled in the 5th grade (n=68; mean age=11.13), in the 9th grade (n=82; mean age=15.32) and in the 12th grade (n=224; mean age=17.96) of Brazilian basic schooling. The subcorpus of Brazilian texts is still being compiled, but a large part is already searchable, being tokenized and morphologically annotated. The Brazilian subcorpus also includes the original transcripts, along with the original images. Portuguese and Brazilian texts were collected from similar tasks: Narrative-based task: Tell a remarkable story (real or imagined) that you and your best friend lived during the last school vacation. Argumentative-based task: Do you think social networks (Facebook, Twitter, Google+, Windows Live Space, etc.) are important today? Write a text to be published on your school's blog where you express your opinion on social networks. In this text, you must say whether you are for or against the existence of social networks. Don't forget to justify your opinion! The next version of DOESTE intends to present semantic annotations and clause and t-unit segmentation. DOESTE v0.5 is developed and maintained by the Educational Linguistics Research Group (LEd), based at the Federal Rural University of the Semiarid Region (UFERSA). DOESTE v0.5 by Mário Martins et al. is licensed under CC BY-NC-ND 4.0.
Rights:: Creative Commons - Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), http://creativecommons.org/licenses/by-nc-nd/4.0/, and PUB

72. Dois movimentos integralistas em Portugal e no Brasil /

Creator:: Vrbata, Aleš,
Subject:: integralismus, ideologie, hnutí politická, politické dějiny, politici, světové dějiny 1789-1918, světové dějiny 1918-1945, Brazílie, and Portugalsko
Language:: Portuguese
Rights:: unknown

73. Don Juan Szychowski, Un pionero polaco, un nuevo modelo de trabajo para Argentina /

Creator:: Kojrowicz, Claudia Stefanetti
Subject:: Szychowski, Juan,, emigrace polská, Poláci argentinští, vztahy argentinsko-polské, světové dějiny od r. 1918 do současnosti, Argentina, Polsko, and migrace, vystěhovalectví, kolonizace
Language:: Portuguese
Rights:: unknown

74. ECI Multilingual Text

Publisher:: HCRC
Type:: corpus
Language:: Portuguese
Description:: Parallel corpus
Rights:: Not specified

75. Economia portuguesa :

Creator:: Pimenta, Carlos
Type:: text and monografie
Subject:: Dějiny států a území na Pyrenejském poloostrově, ekonomové, dějiny hospodářské, Portugalsko, hospodářské dějiny, and světové dějiny od r. 1945 do současnosti
Language:: Portuguese
Rights:: unknown

76. Eduard Ingriš 11. 02. 1905, Zlonice (República Tcheca) - 12. 01. 1991, Reno (EUA). Compositor, maestro, explorador, documentarista, cineasta e fotógrafo /

Creator:: Náplava, Miroslav,
Type:: text and studie
Subject:: Dějiny civilizace. Kulturní dějiny, Ingriš, Eduard,, hudebníci, skladatelé, cestovatelé, fotografové, Československo 1918-1992, světové dějiny od r. 1918 do současnosti, and dějiny vědy, umění, kultury a techniky, kulturní vztahy
Language:: Portuguese
Rights:: unknown

77. El interés por el Brasil en la literatura checa y eslovaca entre las dos guerras mundiales /

Creator:: Binková, Simona,
Subject:: vztahy česko-brazilské, vztahy slovensko-brazilské, dějiny literatury, literatura slovenská, pohled na druhé, přehledná zpracování (tematicky), zahraniční politika, mezinárodní vztahy, světové dějiny od r. 1918 do současnosti, Brazílie, Československo 1918-1945, and literatura, spisovatelé
Language:: Portuguese
Rights:: unknown

78. Emigração alemã para o sul do Brasil no século XIX: propaganda e expectativas. Experiências de imigrantes no Rio Grande do Sul /

Creator:: Piccolo, Helga Iracema Landgraf,
Subject:: emigrace, vystěhovalectví, Němci, propaganda, světové dějiny 1789-1918, Německo, Brazílie, and migrace, vystěhovalectví, kolonizace
Language:: Portuguese
Rights:: unknown

79. Enrique Stanko Vráz 1860 - 20. 02. 1932, Praga (República Tcheca). Viajante e fotógrafo tcheco, autor do livro Através de América Equatorial /

Creator:: Kázecký, Stanislav,
Type:: text and studie
Subject:: Dějiny civilizace. Kulturní dějiny, Vráz, Enrique Stanko,, cestovatelé, fotografové, etnografie, cestopisy, vztahy česko-jihoamerické, české země 1848-1918, světové dějiny 1789-1918, dějiny vědy, umění, kultury a techniky, kulturní vztahy, and Československo 1918-1938
Language:: Portuguese
Rights:: unknown

80. Entre duas eleições /

Creator:: Cunhal, Álvaro,
Type:: text and spisy
Subject:: Dějiny států a území na Pyrenejském poloostrově, Cunhal, Álvaro,, politici portugalští, projevy politické, strany politické, strany politické komunistické, Portugalsko, světové dějiny od r. 1945 do současnosti, and politické dějiny, politici
Language:: Portuguese
Rights:: unknown

81. Europarl QTLeap WSD/NED corpus

Creator:: Agirre, Eneko, Branco, António, Popel, Martin, and Simov, Kiril
Publisher:: University of the Basque Country, UPV/EHU, Faculty of Science, University of Lisbon, FCUL, Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), and Bulgarian Academy of Sciences, IICT-BAS
Type:: text and corpus
Subject:: annotated corpus and multilingual
Language:: Basque, Bulgarian, Czech, English, Portuguese, and Spanish
Description:: This corpus is part of Deliverable 5.5 of the European Commission project QTLeap FP7-ICT-2013.4.1-610516 (http://qtleap.eu). The texts are sentences from the Europarl parallel corpus (Koehn, 2005). We selected the monolingual sentences from parallel corpora for the following pairs: Bulgarian-English, Czech-English, Portuguese-English and Spanish-English. The English corpus comprises the English side of the Spanish-English corpus. Basque is not in Europarl. In addition, it contains the Basque and English sides of the GNOME corpus. The texts have been automatically annotated with NLP tools, including Word Sense Disambiguation, Named Entity Disambiguation and Coreference resolution. Please check deliverable D5.6 in http://qtleap.eu/deliverables for more information.
Rights:: Creative Commons - Attribution 4.0 International (CC BY 4.0), http://creativecommons.org/licenses/by/4.0/, and PUB

82. Europarl: European Parliament Proceedings Parallel Corpus 1996-2003

Type:: corpus
Language:: Portuguese
Description:: Parallel corpus
Rights:: Not specified

83. European expansion 1494-1519 :

Type:: text and edice
Subject:: Geografie jako věda. Výzkum. Cestování, cesty objevné, rukopisy, pohled na druhé, cesty námořní, historická geografie, kartografie a topografie, světové dějiny 1492-1648, and dějiny vědy, umění, kultury a techniky, kulturní vztahy
Language:: English and Portuguese
Description:: Maps on the endpapers and frontispiece
Rights:: unknown

84. Europeos en la Araucanía. Los colonos del Budi a principios del siglo 20 /

Creator:: Chávez, Jaine Flores
Subject:: vztahy Evropané-Indiáni, imigrace, světové dějiny 1789-1918, světové dějiny od r. 1918 do současnosti, and migrace, vystěhovalectví, kolonizace
Language:: Portuguese
Rights:: unknown

85. Faleceu o professor Jaromír Tláskal /

Creator:: Jindrová, Jaroslava,
Type:: nekrology
Subject:: Filologie, Tláskal, Jaromír,, filologové, romanisté, bibliografie personální, historici (jubilea, nekrology apod.), and personální bibliografie
Language:: Portuguese
Rights:: unknown

86. Formação do PCB 1922-1928 :

Creator:: Pereira, Astrojildo,
Type:: text, prameny, and dokumenty
Subject:: Politické strany a hnutí, strany politické komunistické, Brazílie, světové dějiny 1918-1945, and politické strany a hnutí, volby
Language:: Portuguese
Rights:: unknown

87. František Čech-Vyšata 14. 02. 1881, Chlumany (República Tcheca) - 03. 10. 1942, Sobíňov (República Tcheca). Escritor e viajante /

Creator:: Tkadlečková, Věra,
Type:: text and studie
Subject:: Dějiny civilizace. Kulturní dějiny, Čech-Vyšata, František,, cestovatelé, spisovatelé, cestopisy, vztahy česko-jihoamerické, české země 1848-1918, Československo 1918-1945, světové dějiny 1789-1918, světové dějiny 1918-1945, and dějiny vědy, umění, kultury a techniky, kulturní vztahy
Language:: Portuguese
Rights:: unknown

88. FreeLing

Publisher:: Centro de Tecnologías y Aplicaciones del Lenguaje y del Habla (TALP)
Type:: toolService
Language:: Catalan, English, Galician, Italian, Portuguese, and Welsh
Description:: Open source language analysis tool suite: tokenizer, stemmer/lemmatizer, named entity recognizer, chunker/segmenter, morphosyntactic tagger, syntactic tagger, corpus processor, morphological tagger, semantic tagger, analyzer, Word Sense Disambiguator.
Rights:: Not specified