An Improved Word Representation for Deep Learning Based NER in Indian Languages

Author:

A P Ajees, K Manju, Mary Idicula Sumam

Abstract

Named Entity Recognition (NER) is the process of identifying the elementary units in a text document and classifying them into predefined categories such as person, location and organization. NER plays an important role in many Natural Language Processing applications, including information retrieval, question answering and machine translation. Resolving the ambiguities of the lexical items in a text document is a challenging task, and NER in Indian languages is particularly complex because of their morphological richness and agglutinative nature. Although many solutions have been proposed, NER remains an unsolved problem. Traditional approaches to NER applied hand-crafted features to classical machine learning techniques such as the Hidden Markov Model (HMM), Support Vector Machine (SVM) and Conditional Random Field (CRF). The introduction of deep learning changed this picture, and state-of-the-art results are now achieved with deep learning architectures. In this paper, we address the problem of effective word representation for NER in Indian languages by capturing syntactic, semantic and morphological information. We propose a deep learning based entity extraction system for Indian languages that uses a novel combined word representation comprising character-level, word-level and affix-level embeddings. We used the ‘ARNEKT-IECSIL 2018’ shared data for training and testing. Our results show an improvement over existing pre-trained word representations.
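The abstract does not say how the three embedding levels are combined. As a rough illustration only, the sketch below assumes the simplest option: concatenating a word-level vector, a pooled character-level vector, and prefix and suffix vectors into one representation per token. Every name, dimension and the affix length of 3 are hypothetical. The hashed lookup stands in for trained embedding tables, and the averaging stands in for whatever character encoder the paper actually uses.

```python
import hashlib
import random
from typing import List

# Hypothetical sizes; the abstract does not state the real dimensions.
WORD_DIM, CHAR_DIM, AFFIX_DIM = 8, 4, 4
AFFIX_LEN = 3  # assumed number of characters taken as prefix/suffix


def lookup(key: str, dim: int) -> List[float]:
    """Deterministic stand-in for a row of a trained embedding table."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def char_level(word: str) -> List[float]:
    """Average the character vectors (a placeholder for a learned char encoder)."""
    vecs = [lookup("char:" + ch, CHAR_DIM) for ch in word]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def combined_representation(word: str) -> List[float]:
    """Concatenate word-, character- and affix-level vectors for one token."""
    if not word:
        raise ValueError("word must be non-empty")
    word_vec = lookup("word:" + word, WORD_DIM)
    char_vec = char_level(word)
    prefix_vec = lookup("prefix:" + word[:AFFIX_LEN], AFFIX_DIM)
    suffix_vec = lookup("suffix:" + word[-AFFIX_LEN:], AFFIX_DIM)
    return word_vec + char_vec + prefix_vec + suffix_vec


if __name__ == "__main__":
    vec = combined_representation("കേരളം")
    print(len(vec))  # WORD_DIM + CHAR_DIM + 2 * AFFIX_DIM
```

In this sketch, two words that share a suffix also share the suffix slice of their vectors. That is the kind of morphological signal an affix-level embedding is meant to give a tagger for agglutinative languages, where the word-level vector alone may be rare or unseen.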

Publisher

MDPI AG

Subject

Information Systems

Cited by 9 articles.

1. Named Entity Recognition in Bengali and Hindi Using MuRIL and Conditional Random Fields;SN Computer Science;2024-09-03

2. Named Entity Recognition for Indic Languages: A Comprehensive Survey;2024 1st International Conference on Trends in Engineering Systems and Technologies (ICTEST);2024-04-11

3. Hybrid Model for Named Entity Recognition;International Journal of Distributed Artificial Intelligence;2022-10-07

4. Named entity recognition using neural language model and CRF for Hindi language;Computer Speech & Language;2022-07

5. Urdu Named Entity Recognition System Using Deep Learning Approaches;The Computer Journal;2022-04-23
