Affiliation:
1. University Ss Cyril and Methodius, Faculty of Computer Science and Engineering, Skopje, Macedonia
2. University Ss Cyril and Methodius, Faculty of Computer Science and Engineering, Skopje, Macedonia
Abstract
This paper presents the creation of machine-learning-based systems for part-of-speech (PoS) tagging of the Macedonian language. Four well-known PoS taggers previously implemented for English and Slavic languages were trained: TnT, a cyclic dependency network, a guided learning framework for bidirectional sequence classification, and dynamic feature induction. Orwell's novel "1984" was manually tagged by the authors and split into a training set and a test set. After training, the models were compared with one another. In the end, a PoS tagger with an accuracy reaching 97.5% was obtained, making it well suited for the future grammatical tagging of the National Corpus of the Macedonian Language, which is currently in its initial stage. The PoS tagger that was created is published online and is free to use.
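The evaluation protocol summarized in the abstract (split a manually tagged corpus, train on one part, measure token-level accuracy on the other) can be illustrated with a minimal sketch. This is not one of the four systems named above: it uses a simple most-frequent-tag baseline as a stand-in, and the toy English sentences, the tag names, the 75/25 split, and all function names are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Learn the most frequent tag per word (a stand-in baseline tagger)."""
    counts = defaultdict(Counter)
    all_tags = Counter()
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
            all_tags[tag] += 1
    # Unknown words fall back to the most frequent tag in the training data.
    default_tag = all_tags.most_common(1)[0][0]
    lexicon = {w: c.most_common(1)[0][0] for w, c in counts.items()}
    return lexicon, default_tag


def tag(words, lexicon, default_tag):
    """Assign each word its learned tag, or the default tag if unseen."""
    return [lexicon.get(w.lower(), default_tag) for w in words]


def token_accuracy(tagged_sentences, lexicon, default_tag):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = total = 0
    for sentence in tagged_sentences:
        words = [w for w, _ in sentence]
        gold = [t for _, t in sentence]
        predicted = tag(words, lexicon, default_tag)
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    return correct / total if total else 0.0


# Hypothetical toy corpus; the paper's corpus is the manually tagged "1984".
corpus = [
    [("the", "DET"), ("clock", "NOUN"), ("struck", "VERB")],
    [("the", "DET"), ("man", "NOUN"), ("walked", "VERB")],
    [("a", "DET"), ("clock", "NOUN"), ("stopped", "VERB")],
    [("the", "DET"), ("clock", "NOUN"), ("walked", "VERB")],
]

split = int(len(corpus) * 0.75)  # assumed split ratio, not from the paper
train, test = corpus[:split], corpus[split:]

lexicon, default_tag = train_baseline(train)
accuracy = token_accuracy(test, lexicon, default_tag)
print(f"token accuracy: {accuracy:.3f}")
```

Comparing several taggers under this protocol amounts to calling the same accuracy function on the same held-out sentences for each trained model, so that the reported figures are directly comparable.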
Publisher
National Library of Serbia
Cited by
1 article.
1. Syllable and Morpheme Segmentation of Macedonian Language; 2023 46th MIPRO ICT and Electronics Convention (MIPRO); 2023-05-22