Wikicorpus
- Authors
- Boleda, Gemma
- Identifier
- http://hdl.handle.net/11372/LRT-1105
- Project URL
- http://www.lsi.upc.edu/~nlp/wikicorpus/
- Date issued
- 2014-07-30
- Type
- corpus
- Description
- Trilingual corpus (Catalan, Spanish, English) that contains large portions of Wikipedia (based on a 2006 dump) and has been automatically enriched with linguistic information. In its present version, it contains over 750 million words.
- Keywords
- trilingual corpus