Search Results
Publisher:
Academy of Sciences
Type:
corpus
Language:
Hungarian
Description:
BSI is a large-scale survey that provides reliable data on, and analyses of, the varieties of Hungarian spoken in Budapest.
Rights:
Not specified
Publisher:
Academy of Sciences
Format:
application/xml
Type:
corpus
Language:
Hungarian
Description:
Containing 27 million running words, the Hungarian Historical Corpus provides a valuable basis for research on the history of Hungarian words between the second half of the 18th century and 2000.
Rights:
Not specified
Publisher:
Academy of Sciences
Format:
application/xml
Type:
corpus
Subject:
synchronic corpus
Language:
Hungarian
Description:
Written general synchronic reference corpus; 190 million tokens; POS-annotated XML
Rights:
Not specified
Publisher:
Budapest University of Technology and Economics Media Research (BME MOKK)
Type:
corpus
Subject:
Web corpus
Language:
Hungarian
Description:
Monolingual written general; 700 million tokens; Segmentation, disambiguation
Rights:
Not specified
Publisher:
Academy of Sciences and Budapest University of Technology and Economics Media Research (BME MOKK)
Type:
corpus
Subject:
parallel corpus
Language:
English and Hungarian
Description:
Bilingual written general; 2 million sentences
Rights:
CC
Publisher:
MTA-SZTE Research Group on Artificial Intelligence
Type:
corpus
Subject:
speech corpus
Language:
Hungarian
Description:
spoken, monolingual, manually segmented domain-specific corpus of numbers, 5857 recorded words
Rights:
Not specified
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Subject:
monolingual corpus, annotated corpus, and POS annotation
Language:
Hungarian
Description:
written, monolingual, general, manually POS annotated reference corpus; 1,247,546 tokens; MSD tagset, XML (TEIxLite) files
Rights:
Not specified
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Subject:
monolingual corpus, annotated corpus, and POS annotation
Language:
Hungarian
Description:
written, monolingual, general, manually POS annotated reference corpus; 1,459,288 tokens; MSD tagset, XML (TEI P4) files
Rights:
Not specified
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Language:
Hungarian
Description:
82,000 sentences with shallow syntactic annotation (NP-level).
Rights:
Not specified
Publisher:
Department of Informatics, Human Language Technology Group, University of Szeged
Format:
application/xml
Type:
corpus
Language:
Hungarian
Description:
82,000 sentences with full syntactic annotation.
Rights:
Not specified