Files in this item

This item is publicly available and licensed under
Creative Commons - Attribution 4.0 International (CC BY 4.0).
README.txt
Size: 1.07 KB
Format: Text file
Description: Documentation
MD5: 924c1618bb557042e41dddfc91c4a165

AlbNER Named Entity Recognition in Albanian
===========================================

AlbNER is a Named Entity Recognition corpus of
Wikipedia sentences in Albanian, consisting of
900 records. The sentence tokens are manually
labeled following the CoNLL-2003 shared task
annotation scheme described at
https://aclanthology.org/W03-0419.pdf, which uses
the tags I-ORG, B-ORG, I-PER, B-PER, I-LOC, B-LOC,
I-MISC, B-MISC and O. Of the 900 records, 500
should be used for model training (file train.txt),
100 for model development (file dev.txt) and the
remaining 300 for model testing (file test.txt).
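
As a rough illustration of how the tagged files might be read, the
sketch below parses CoNLL-style lines into sentences. It assumes one
token per line with the tag as the last whitespace-separated column
and a blank line between sentences; the README does not spell out the
column layout, so check this against the actual files. The function
name parse_conll and the sample lines are hypothetical and are not
taken from the corpus.

```python
from typing import Iterable, List, Tuple

Sentence = List[Tuple[str, str]]

# The nine tags listed in the README.
VALID_TAGS = {"O"} | {
    f"{prefix}-{kind}"
    for prefix in ("B", "I")
    for kind in ("ORG", "PER", "LOC", "MISC")
}


def parse_conll(lines: Iterable[str]) -> List[Sentence]:
    """Group (token, tag) pairs into sentences.

    Assumed layout (not confirmed by the README): token first, tag
    last, columns separated by whitespace, blank line between sentences.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        if not line:
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        token, tag = parts[0], parts[-1]
        if tag not in VALID_TAGS:
            raise ValueError(f"unexpected tag {tag!r} in line {line!r}")
        current.append((token, tag))
    if current:
        sentences.append(current)
    return sentences


# Invented sample lines, for illustration only.
sample = [
    "Tirana I-LOC",
    "është O",
    "kryeqytet O",
    "",
    "Erion I-PER",
    "Çano I-PER",
]
sentences = parse_conll(sample)
print(len(sentences), sentences[0][0])
```

To read a real split, the same function could be applied to an open
file, for example open("train.txt", encoding="utf-8"), assuming the
files are UTF-8 encoded.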


License
-------

The AlbNER corpus data are released under the CC BY 4.0 license
(https://creativecommons.org/licenses/by/4.0/).


Download
--------

The AlbNER corpus can be downloaded from:
http://hdl.handle.net/11234/1-5214


Publications
------------

If using AlbNER data, please cite the following paper:

Çano Erion. AlbNER: A Corpus for Named Entity Re . . .
                                            
AlbNER.zip
Size: 49.32 KB
Format: application/zip
Description: Data
MD5: b006ca2a7dc12e3b7132ec3f20be4f92
Contents:
  • AlbNER
    • test.txt (45 kB)
    • dev.txt (14 kB)
    • train.txt (76 kB)
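
The MD5 value listed for AlbNER.zip can be used to check that a
download is intact. The sketch below is one possible way to do that
with Python's standard hashlib module; the helper names md5_of_file
and matches_expected are hypothetical, and the local file path is
whatever location the archive was saved to.

```python
import hashlib

# MD5 listed on this page for AlbNER.zip.
EXPECTED_MD5 = "b006ca2a7dc12e3b7132ec3f20be4f92"


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 with the expected hex digest."""
    return md5_of_file(path) == expected.lower()
```

For example, matches_expected("AlbNER.zip") should return True for an
unmodified copy of the archive saved under that name. MD5 is suitable
here only as an integrity check against accidental corruption.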