Language: French / Rights: Not specified - LINDAT/CLARIAH-CZ Catalog Search Results

1. Base de synonymes CRISCO

Type:: lexicalConceptualResource
Language:: French
Description:: 49.000, RDB
Rights:: Not specified

2. Basic vocabulary on the Human Genome

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: lexicalConceptualResource
Language:: Catalan, English, French, Galician, Italian, Portuguese, and Spanish
Description:: A vocabulary produced through cooperation among the groups of the REALITER network, collecting the basic terminology most often used in texts on genomics. It contains equivalents in English, Peninsular and Latin American Spanish, French, Italian, Galician, Portuguese, and Catalan.
Rights:: Not specified

3. Botanicus Digital Library

Type:: corpus
Subject:: German studies
Language:: Chinese, Czech, English, French, German, Latin, and Spanish
Description:: Digitized images of historical botanical works from the Missouri Botanical Garden Library; German-language texts make up only part of the collection.
Rights:: Not specified

4. Corpus arborée du français

Type:: corpus
Language:: French
Description:: 800.000 words, POS and syntax, proprietary XML
Rights:: Not specified

5. Corpus CLUVI

Publisher:: TALG Research Group (University of Vigo)
Type:: corpus
Language:: Basque, Catalan, English, French, Galician, German, Portuguese, and Spanish
Description:: Parallel corpus, 22 million words
Rights:: Not specified

6. CorpusExplorer

Creator:: Rüdiger, Jan Oliver
Publisher:: Jan Oliver Rüdiger
Type:: tool and toolService
Subject:: Corpus Linguistics, NLP, conll, tei, XML, nlp, Natural Language Processing, linguistics, Linguistics, Computational Linguistics, corpus processing, tagger, POS tagger, lemmatization, text cleaning, CommonCrawl, epub, JSON, Twitter, Pandoc, Wikipedia, digital data, DTA, DSpin, MySQL, ElasticSearch, TextGrid, text corpora, TigerXML, and WeblichtXML
Language:: German, English, French, Italian, Dutch, Spanish, Polish, Arabic, Chinese, and Portuguese
Description:: Software for corpus linguists and text/data mining enthusiasts. The CorpusExplorer combines over 45 interactive visualizations under a user-friendly interface. Routine tasks such as text acquisition, cleaning or tagging are completely automated. The simple interface supports the use in university teaching and leads users/students to fast and substantial results. The CorpusExplorer is open for many standards (XML, CSV, JSON, R, etc.) and also offers its own software development kit (SDK). Source code available at https://github.com/notesjor/corpusexplorer2.0
Rights:: Not specified

7. DicoValence

Type:: lexicalConceptualResource
Language:: French
Description:: 3700 entries, text
Rights:: Not specified

8. Dictionnaire de l'occitan médiéval (DOM)

Creator:: Kraus, Claudia, Stempel, Wolf-Dieter, Tausend, Monika, and Peter, Renate
Publisher:: Bavarian Academy of Sciences and Humanities and Bayerische Akademie der Wissenschaften
Type:: text, lexicon, and lexicalConceptualResource
Subject:: Emil Levy, Petit Levy, Lexique Roman, DOM, Occitan language, Medieval Occitan, Occitan, Old Occitan, Old Provençal, Romance languages, dictionary, etymology, Middle Ages, troubadours, lexicography, and Supplementwörterbuch
Language:: French and Old Provençal (to 1500)
Description:: In the Middle Ages, Old Occitan (formerly "Old Provençal"), the language of the troubadours, was a literary and cultural language, the influence of which extended far beyond the frontiers of Southern France. The only comprehensive portrayal of the Old Occitan vocabulary to have appeared up to now is the "Lexique roman" by François Raynouard (6 vols., 1836–1845). It was supplemented by Emil Levy’s "Provenzalisches Supplementwörterbuch" (8 vols., 1894–1924). An updated dictionary, taking account of progress in research over the last 100 years, has been the desideratum of literary scholars, linguists, and historians ever since. Under the direction of Wolf-Dieter Stempel, the publication of a new dictionary of Old Occitan, the "Dictionnaire de l'occitan médiéval (DOM)", began in 1996. This appeared in print until 2013, directed from 2012 on by Maria Selig. Since then it has been available as an alphabetically complete digital dictionary, the "DOM en ligne". This comprises the newly written articles of the DOM together with the articles from the dictionaries of Raynouard and Levy for those parts of the alphabet not yet covered by the new work and is enriched by entries for words absent till now from Old Occitan lexicography. Its content is available for free at https://dom-en-ligne.de/dom.php
Rights:: Not specified

9. Digitale Sammlungen der Universitäts- und Landesbibliothek Münster

Publisher:: Westfälische Wilhelms-Universität Münster
Type:: corpus
Subject:: German studies
Language:: French, German, and Latin
Description:: Digitized images of books and journals from the historical holdings of the ULB Münster, together with collections from the region of Westphalia.
Rights:: Not specified

10. DPC (Dutch Parallel Corpus)

Publisher:: Katholieke Universiteit Leuven Campus Kortrijk, Hogeschool Gent
Type:: corpus
Language:: Dutch, English, and French
Description:: Parallel corpus, with Dutch as first language, 10 M words (under construction). DPC is a STEVIN project.
Rights:: Not specified

12. Frantext

Publisher:: ATILF
Type:: corpus
Language:: French
Description:: mainly literature (17th to 20th century)
Rights:: Not specified

13. French emblems at Glasgow

Publisher:: University of Glasgow
Type:: corpus
Language:: French
Description:: French emblem books (27 in total) of the 16th century, together with Latin versions where appropriate. Transcribed and facsimile versions, and extensive search functionality.
Rights:: Not specified

14. French learner language oral corpora

Publisher:: University of Southampton and Newcastle University
Type:: corpus
Language:: French
Description:: Seven French L2 corpora. Digital sound files and related transcripts formatted using CHILDES software. The database currently contains over 4000 files (sound files, transcripts and morphosyntactically tagged transcripts).
Rights:: Not specified

15. French-Croatian Parallel Corpus

Type:: corpus
Language:: Croatian and French
Description:: written; domain-specific (fiction); diachronic (the French side); bilingual; parallel; ca 263,000 tokens (148 Kw French; 115 Kw Croatian); XML; S-alignment
Rights:: Not specified

16. JIRS

Publisher:: Grid and High Performance Computing Group, ITACA, Universidad Politécnica de Valencia and Universidad de Alicante
Type:: toolService
Language:: Arabic, English, French, Italian, Oromo, and Urdu
Description:: JIRS is a passage retrieval system specially suited to question answering. It could be adapted to other languages very easily. Task (written language): information retrieval. Applications: question answering. Environment: OS-independent. Access: GPLv3.
Rights:: Not specified

17. JRC-Acquis

Publisher:: Joint Research Centre of the EU
Type:: corpus
Language:: Bulgarian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Modern Greek (1453-), Hungarian, Italian, Latvian, Maltese, Norwegian, Polish, Portuguese, Romanian, Slovak, Slovenian, Spanish, and Swedish
Description:: The largest parallel corpus; it contains EU law (the Acquis Communautaire) in 22 languages.
Rights:: Not specified

18. KIAP - Cultural Identity in Academic Prose

Type:: corpus
Language:: English, French, and Norwegian
Description:: Comparable corpus, written, academic prose; 450 reviewed scientific papers; 3,2 million words; POS
Rights:: Not specified

19. Kicktionary

Type:: lexicalConceptualResource
Language:: English, French, and German
Description:: Electronic dictionary of football language, using FrameNet and WordNet approaches
Rights:: Not specified

20. L1 & L2 Acquisition Marzena Watorek French Project

Publisher:: Max Planck Institute for Psycholinguistics
Type:: corpus
Subject:: language acquisition corpus
Language:: French and Polish
Description:: Language Acquisition corpus
Rights:: Not specified

21. L1 Acquisition Gaby Cablitz

Publisher:: Max Planck Institute for Psycholinguistics
Type:: corpus
Language:: French
Description:: Language Acquisition corpus
Rights:: Not specified

22. L2 Acquisition Finiteness and Scope

Publisher:: Max Planck Institute for Psycholinguistics
Type:: corpus
Language:: Dutch, English, French, and German
Description:: Language Acquisition corpus
Rights:: Not specified

23. Lefff 2.0

Type:: lexicalConceptualResource
Language:: French
Description:: 100.000 entries, text
Rights:: Not specified

24. MEDIATIC

Publisher:: Katholieke Universiteit Leuven Campus Kortrijk, Université Lille3
Type:: corpus
Language:: Dutch and French
Description:: Databank with video-fragments (Dutch and French), transcribed and translated (LINGUATIC-project)
Rights:: Not specified

25. Moses Web Demo

Creator:: Bojar, Ondřej, Cífka, Ondřej, Pecina, Pavel, and Tamchyna, Aleš
Publisher:: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
Type:: toolService and tool
Subject:: machine translation, web service, and demo
Language:: Czech, English, Russian, Ukrainian, French, and German
Description:: An interactive web demo of selected ÚFAL MT systems. Project: FP7-ICT-2011-7-288487 MosesCore.
Rights:: Not specified

26. MPI ESF Corpus

Type:: corpus
Language:: Dutch, English, French, German, and Swedish
Description:: Corpus of the ESF Foreign Language Speakers project; almost perfect structure for IEI; fully described by metadata; many annotated audio recordings containing multimodal interaction.
Rights:: Not specified

27. Multilingualism Marianne Gullberg & Peter Indefrey

Publisher:: Max Planck Institute for Psycholinguistics
Type:: corpus
Language:: Dutch, German, English, and French
Description:: Language Acquisition corpus
Rights:: Not specified

28. MUSA Multilingual Multimodal Corpus

Type:: corpus
Language:: English, French, and Modern Greek (1453-)
Description:: Multilingual (EN, EL, FR); multimodal (Video, Text); parallel (EN, EL, FR subtitles); comparable (transcripts, subtitles); 120 hours
Rights:: Not specified

29. Namur Corpus

Publisher:: Katholieke Universiteit Leuven Campus Kortrijk
Type:: corpus
Language:: Dutch, English, and French
Description:: Trilingual parallel corpus, with Dutch as first language. 2M words, aligned at paragraph level. It includes fiction and non-fiction texts.
Rights:: Not specified

30. Neologismos económicos en las lenguas románicas a través de la prensa

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: lexicalConceptualResource
Subject:: terminology database
Language:: Catalan, French, Galician, Italian, Portuguese, Romanian, and Spanish
Description:: Multilingual terminological resource containing 3.875 entries from the Economics, Finance and Banking domains.
Rights:: Not specified

31. PALIC

Publisher:: Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra
Type:: toolService
Language:: Catalan, French, Portuguese, and Spanish
Description:: A package of tools for processing the Corpus Tècnic in Catalan and Spanish. It includes a preprocessor, a PoS tagger, and a linguistic disambiguator.
Rights:: Not specified

32. Project Gutenberg

Type:: corpus
Language:: Danish, Dutch, English, Finnish, French, German, Italian, Latin, Portuguese, Russian, Spanish, Swedish, and Telugu
Description:: Free electronic books available for download or online browsing; German-language literature makes up only part of the available e-books.
Rights:: Not specified

33. SpeechDat-Car databases

Type:: corpus
Language:: Danish, Dutch, English, Finnish, French, German, Modern Greek (1453-), Italian, and Spanish
Description:: 9 speech databases for training and testing multilingual speech recognition applications in the car environment. Contains parallel 4 channel in-car recordings and a GSM channel. Contains interesting phonetically rich material. All orthographically transcribed. Speaker information included for gender, age, accent. Including pronunciation lexicon.
Rights:: Not specified

34. Speecon databases

Type:: corpus
Language:: Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Polish, Portuguese, Russian, Spanish, Swedish, Turkish, Chinese, Hebrew, Japanese, Korean, and Thai
Description:: 28 speech databases containing broadband recordings from 550 adults and 50 children per language. Contains interesting phonetically rich material. All orthographically transcribed. Speaker information included for gender, age, accent. Including pronunciation lexicon.
Rights:: Not specified

35. TeLeMaCo

Publisher:: Universität des Saarlandes
Type:: toolService
Subject:: documentation
Language:: Catalan, Dutch, English, French, German, and Italian
Description:: A collection of pointers to teaching and learning materials on linguistics and linguistic tools, including quick starts, how-tos, technical documentation, short teaching modules (2h), and full courses. This resource is collaboratively built by its users.
Rights:: Not specified

36. Termoteca

Publisher:: TALG Research Group (University of Vigo)
Type:: lexicalConceptualResource
Language:: English, French, Galician, and Spanish
Description:: Galician terminology databank, 6,000 terms
Rights:: Not specified

37. TermSciences

Type:: lexicalConceptualResource
Language:: French
Description:: 500.000 terms (fr, en, de, es), RDB / XML
Rights:: Not specified

38. The National Certificates corpus

Publisher:: Centre for Applied Language Studies, University of Jyväskylä
Type:: corpus
Language:: English, Finnish, French, German, Italian, Russian, Spanish, and Swedish
Description:: The NC test results, background information, and speaking and writing performances in 9 foreign/second languages. A web-based database (HTML files).
Rights:: Not specified

39. TreeTagger

Publisher:: University of Stuttgart
Type:: toolService
Subject:: POS tagger
Language:: Bulgarian, Dutch, English, French, German, Modern Greek (1453-), Italian, Portuguese, Russian, Spanish, and Swahili (macrolanguage)
Description:: A part-of-speech tagger and lemmatizer for several languages.
Rights:: Not specified

40. Wortschatz

Publisher:: University of Leipzig
Type:: corpus
Language:: Afrikaans, Albanian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, German, Hungarian, Icelandic, Indonesian, Italian, Japanese, Korean, Latin, Latvian, Lithuanian, Malay (macrolanguage), Norwegian, Occitan (post 1500), Romanian, Russian, Slovak, Slovenian, Spanish, Sundanese, Swedish, Tagalog, Turkish, Vietnamese, and Welsh
Description:: Collected from newspaper texts, webcrawling, etc.: words (+frequency), cooccurrences (+graph), left/right neighbours, example sentences
Rights:: Not specified
