Authors:
V. Ashwinkumar, Arage Prajwal Pramod, R. Jeya, Sudhakaran Pradeep
Abstract
Multi-label text classification is a challenging task in natural language processing that requires predicting multiple labels for a given text. In this paper, we conduct an empirical study comparing the performance of two popular approaches to this task: One-vs-Rest and Voting classifiers. We evaluate these classifiers on a dataset of toxic comments and measure their performance using accuracy and Hamming loss. Our experimental results show that the One-vs-Rest classifier with XGB outperforms the Voting classifier and achieves an accuracy of 91.7%. These results can serve as a benchmark for future research in this area, and the insights gained can be used to improve the accuracy and robustness of multi-label text classification models. Our findings also suggest that the One-vs-Rest classifier with XGB is a promising approach for multi-label text classification tasks, one that can provide better results than other popular classifiers.
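The two evaluation metrics named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not say which definition of accuracy is used; exact-match (subset) accuracy is assumed here. The function names and the toy label matrices are hypothetical and are not taken from the paper or its dataset.

```python
from typing import Sequence

# Each row is one comment; each column is one binary label (1 = label applies).
LabelMatrix = Sequence[Sequence[int]]


def hamming_loss(y_true: LabelMatrix, y_pred: LabelMatrix) -> float:
    """Fraction of individual label slots that were predicted incorrectly."""
    total = 0
    wrong = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            total += 1
            wrong += int(t != p)
    return wrong / total


def subset_accuracy(y_true: LabelMatrix, y_pred: LabelMatrix) -> float:
    """Fraction of comments whose entire label vector was predicted exactly.

    Assumption: this exact-match definition may differ from the accuracy
    reported in the paper.
    """
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


# Hypothetical toy data: 4 comments, 3 labels each.
y_true = [[1, 0, 0], [0, 1, 1], [0, 0, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [1, 0, 1]]

print(hamming_loss(y_true, y_pred))     # 3 wrong slots out of 12 -> 0.25
print(subset_accuracy(y_true, y_pred))  # 2 exact rows out of 4 -> 0.5
```

Hamming loss gives credit for partially correct label sets, while subset accuracy counts a comment as correct only when every label matches, so the two metrics can rank classifiers differently.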