CLASSIFICATION OF SHORT POSSESSIVE CLITIC PRONOUN NYA IN MALAY TEXT TO SUPPORT ANAPHOR CANDIDATE DETERMINATION

Author:

Mohd Noor Noor Huzaimi@Karimah (1), Mohd Noah Shahrul Azman (2), Ab Aziz Mohd Juzaiddin (2)

Affiliation:

1. Faculty of Computing, Universiti Malaysia Pahang, Malaysia

2. Faculty of Information Science & Technology, Universiti Kebangsaan Malaysia, Malaysia

Abstract

Anaphor candidate determination is an important process in anaphora resolution (AR) systems. Of the several types of anaphor, the pronominal anaphor is the one that involves pronouns. In some cases, a pronoun is used without referring to any situation or entity in the text; such a pronoun is called pleonastic. In the Malay language, this usually occurs with the pronoun nya. Pleonastic pronouns occur in every text and cause a severe problem for AR systems. Determining the pleonastic nya is not the same as identifying the pleonastic ‘it’ in English: syntactic patterns cannot be used, because nya is attached to the end of a word. As an alternative, semantic classes are used to distinguish the pleonastic nya from the anaphoric nya. In this paper, automatic semantic tagging was used to determine the type of nya, which at the same time determines whether nya is an anaphor candidate. New algorithms and the MalayAR architecture were proposed. In terms of F-measure, the detection of the clitic nya as a separate word achieved 100%, the clitic nya as a pleonastic achieved 88%, the clitic nya referring to humans achieved 94%, and the clitic nya referring to non-humans achieved 63%. The results showed that the proposed algorithms were acceptable for classifying the clitic nya as pleonastic, human-referring, or non-human-referring.
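The two steps described in the abstract, separating the clitic nya from its host word and then deciding whether it is pleonastic or an anaphor candidate, can be illustrated with a minimal Python sketch. This is not the authors' algorithm or the MalayAR architecture. The function names, the two small word lists, and the idea of looking up the host word in a fixed list are all assumptions made for illustration; the paper itself relies on automatic semantic tags, which are not reproduced here.

```python
# Toy illustration only. The lexicons below are hypothetical and tiny;
# they are NOT taken from the paper.

# Root words that merely end in the letters "nya" and carry no clitic.
NON_CLITIC_WORDS = {"hanya", "tanya", "punya"}

# Host words assumed (for this sketch) to form adverbials such as
# "akhirnya" (finally) or "nampaknya" (apparently), where nya does not refer.
PLEONASTIC_HOSTS = {"akhir", "nampak", "sebenar", "biasa"}


def split_clitic(token):
    """Return (host, "nya") if the token carries the clitic, else (token, None)."""
    word = token.lower()
    if word.endswith("nya") and len(word) > 3 and word not in NON_CLITIC_WORDS:
        return word[:-3], "nya"
    return word, None


def classify_nya(token):
    """Label a token as 'no_clitic', 'pleonastic', or 'anaphor_candidate'."""
    host, clitic = split_clitic(token)
    if clitic is None:
        return "no_clitic"
    if host in PLEONASTIC_HOSTS:
        return "pleonastic"
    return "anaphor_candidate"


if __name__ == "__main__":
    for tok in ["bukunya", "akhirnya", "hanya"]:
        print(tok, classify_nya(tok))
```

A fixed list is a crude stand-in for semantic classes: a form such as "akhirnya" can be adverbial in one sentence and possessive in another, which is the kind of ambiguity a context-sensitive semantic tagger is meant to address. Splitting an anaphor candidate further into human and non-human referents requires information about the surrounding text and is outside this sketch.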

Publisher

UUM Press, Universiti Utara Malaysia

Subject

General Mathematics, General Computer Science

