Affiliation:
1. University of Engineering and Technology, Lahore, Punjab, Pakistan
Abstract
The National Happiness Index (NHI) is a national indicator of development that estimates the economic and social well-being of a nation's individuals. With the proliferation of the internet, people share a significant amount of data on social media websites. This data can be processed with different sentiment analysis techniques to calculate the NHI. In the literature, different approaches have been used to calculate NHI, including the lexicon-based approach and the machine learning approach. All of these existing approaches were proposed to calculate NHI from sentiments written in English. However, these methods fail for complex Roman Urdu tweets that contain more than two sub-opinions. This research has three primary objectives: (1) to investigate whether current sentiment analysis techniques are sufficient for the classification of complex Roman Urdu sentiments; (2) to propose a rule-based classifier for the classification of Roman Urdu sentiments comprising multiple sub-opinions; (3) to calculate NHI using Roman Urdu sentiments. For this purpose, we propose a discourse information extractor, a rule-based method (3-RBC), and a machine learning classifier. The experimental results show that 3-RBC is efficient for feature identification and that it outperforms the baseline classifiers with statistical significance. The 3-RBC increased accuracy by 7% and precision by 8%, which provides evidence that the proposed technique significantly improves the calculation of NHI.
Publisher
Association for Computing Machinery (ACM)
Cited by
8 articles.