Topic Sentiment Analysis for Twitter Data in Indian Languages Using Composite Kernel SVM and Deep Learning

Author:

Maity Shuverthi (1), Sarkar Kamal (1)

Affiliation:

1. Jadavpur University, Kolkata, West Bengal, India

Abstract

Sentiment analysis of public opinion on social networks such as Twitter or Facebook can provide valuable information with a wide range of applications. However, the efficiency and accuracy of automated methods for Twitter sentiment analysis are hindered by the special characteristics of Twitter data, which is generally noisy and high-dimensional and has complex syntactic and semantic structures. Sentiment analysis of Twitter data in Indian languages is more challenging still because the data is multilingual and code-mixed. In this article, we propose several composite kernel functions, each of which is used with a Support Vector Machine (SVM) to build a model for topic sentiment analysis of Twitter data in Indian languages. Each composite kernel function is constructed as a weighted sum of multiple single kernel functions that we define. In addition to the proposed composite kernel SVM method, we use several state-of-the-art deep learning classifiers for topic sentiment classification. Since no suitable Twitter dataset in Indian languages is available for our experiments, we developed our own datasets by collecting tweets related to five different Twitter trending topics in India. To demonstrate the robustness and generalization capability of the proposed models, we also evaluate them on the US airline Twitter dataset, a publicly available benchmark English dataset. The empirical study shows that the proposed composite kernel SVM method is effective for the sentiment classification task. On the Indian language datasets, the proposed composite kernel SVM method achieves the highest average accuracy of 74% and the highest average F-score of 0.73, whereas the deep learning-based method achieves an average accuracy of 71.31% and an average F-score of 0.70. On the US airline Twitter dataset, the proposed composite kernel SVM method achieves an average accuracy of 83% and an average F-score of 0.82, both higher than those of the deep learning-based method.
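The abstract describes each composite kernel as a weighted sum of several single kernels. The sketch below illustrates that construction only. The abstract does not specify the authors' single kernels, their weights, or their tweet features, so the standard linear, polynomial, and RBF kernels, the weight values, and the toy vectors are all stand-in assumptions.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(x: Vector, y: Vector) -> float:
    """Plain dot product."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(degree: int, coef0: float = 1.0) -> Kernel:
    """Return k(x, y) = (x . y + coef0) ** degree."""
    def k(x: Vector, y: Vector) -> float:
        return (linear_kernel(x, y) + coef0) ** degree
    return k


def rbf_kernel(gamma: float) -> Kernel:
    """Return k(x, y) = exp(-gamma * ||x - y||^2)."""
    def k(x: Vector, y: Vector) -> float:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-gamma * sq_dist)
    return k


def composite_kernel(kernels: Sequence[Kernel],
                     weights: Sequence[float]) -> Kernel:
    """Weighted sum of single kernels.

    Non-negative weights are required here because a non-negative
    weighted sum of positive semi-definite kernels is itself a valid
    (positive semi-definite) kernel.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    def k(x: Vector, y: Vector) -> float:
        return sum(w * kern(x, y) for w, kern in zip(weights, kernels))
    return k


def gram_matrix(kernel: Kernel, rows: Sequence[Vector]) -> List[List[float]]:
    """Pairwise kernel values for a list of feature vectors."""
    return [[kernel(a, b) for b in rows] for a in rows]


if __name__ == "__main__":
    # Hypothetical 3-dimensional feature vectors standing in for tweets.
    tweets = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0]]
    # Assumed weights; the paper's actual weighting is not in the abstract.
    k = composite_kernel(
        [linear_kernel, polynomial_kernel(2), rbf_kernel(0.5)],
        [0.5, 0.2, 0.3],
    )
    for row in gram_matrix(k, tweets):
        print([round(v, 4) for v in row])
```

A Gram matrix built this way can be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. In practice the weights would be tuned on held-out data rather than fixed by hand.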

Funder

Department of Science and Technology, Government of India, under the SERB scheme

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Cited by 1 article.

1. Exploring Multilingual Indian Twitter Sentiment Analysis: A Comparative Study; 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT); 2023-07-06
