Wikicorpus
Title:
Wikicorpus
Contributor:
Boleda, Gemma
Publisher:
Centro de Tecnologías y Aplicaciones del Lenguaje y del Habla (TALP)
Identifier:
http://hdl.handle.net/11372/LRT-1105
Subject:
trilingual corpus
Type:
corpus
Description:
Trilingual corpus (Catalan, Spanish, English) that contains large portions of Wikipedia (based on a 2006 dump) and has been automatically enriched with linguistic information. In its present version, it contains over 750 million words.
Language:
Catalan, English, and Spanish
Rights:
Not specified
Coverage:
Spain
Source:
http://www.lsi.upc.edu/~nlp/wikicorpus/
Harvested from:
LINDAT/CLARIAH-CZ repository
Metadata only:
true
Date:
2014-07-30
The item or associated files might be "in copyright"; review the provided rights metadata (Not specified) and the original context.