Affiliation:
1. Faculty of Information Science and Technology, Multimedia University, Melaka 75450, Malaysia
2. Faculty of Management, Multimedia University, Cyberjaya 63100, Malaysia
Abstract
Sentiment analysis faces several key challenges. One major problem is determining the sentiment of complex sentences, paragraphs, and text documents. A paragraph with multiple parts may carry multiple sentiment values, so predicting a single overall sentiment for it does not give businesses and brands all the information they need. A paragraph with multiple sentences should therefore be separated into simple sentences, from which all the possible sentiments can be extracted effectively. Splitting a paragraph, however, requires that it be properly punctuated. Most social media texts are improperly punctuated, so separating the sentences can be challenging. This study proposes a punctuation-restoration algorithm based on the transformer model. We evaluated different Bidirectional Encoder Representations from Transformers (BERT) models for the transformer encoding, in addition to the neural network used for evaluation. Based on our evaluation, the RobertaLarge model with a bidirectional long short-term memory (LSTM) network provided the best accuracy, 97% and 90%, for restoring punctuation on the Amazon and Telekom data, respectively. Other evaluation criteria such as precision, recall, and F1-score were also used.
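One common way to frame punctuation restoration is as word-level sequence labeling: each word gets a label naming the punctuation mark that follows it, a model predicts those labels for unpunctuated text, and the text is then rebuilt and split into sentences. The sketch below illustrates only that framing and the later sentence split. The label set (`O`, `COMMA`, `PERIOD`, `QUESTION`) and the function names are assumptions made for illustration, not details taken from the paper, and the transformer and LSTM classifier is stubbed out by reusing the gold labels.

```python
import re

# Hypothetical label set, assumed for illustration only.
PUNCT_TO_LABEL = {",": "COMMA", ".": "PERIOD", "?": "QUESTION"}
LABEL_TO_PUNCT = {label: mark for mark, label in PUNCT_TO_LABEL.items()}


def to_training_pairs(text):
    """Turn punctuated text into (lowercased word, label) pairs."""
    pairs = []
    for token in text.split():
        # A trailing mark becomes the label; everything else is the word.
        mark = token[-1] if token[-1] in PUNCT_TO_LABEL else ""
        word = token[:-1] if mark else token
        if word:
            pairs.append((word.lower(), PUNCT_TO_LABEL.get(mark, "O")))
    return pairs


def restore(words, labels):
    """Rebuild punctuated text from words and their predicted labels."""
    out = []
    capitalize = True  # the first word starts a sentence
    for word, label in zip(words, labels):
        token = word[:1].upper() + word[1:] if capitalize else word
        mark = LABEL_TO_PUNCT.get(label, "")
        out.append(token + mark)
        # Capitalize the next word only after a sentence-final mark.
        capitalize = mark in {".", "?"}
    return " ".join(out)


def split_sentences(text):
    """Split restored text after each period or question mark."""
    return [s for s in re.split(r"(?<=[.?])\s+", text) if s]


# Usage: gold labels stand in for the output of a trained classifier.
review = "The delivery was fast. The battery died in a week, sadly."
pairs = to_training_pairs(review)
words = [word for word, _ in pairs]
labels = [label for _, label in pairs]
restored = restore(words, labels)
sentences = split_sentences(restored)
```

Once the paragraph is split this way, each resulting sentence can be passed to a sentiment classifier on its own, which is the motivation stated in the abstract.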
Funder
Telekom Malaysia Research and Development Grant
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
4 articles.