Design and Analysis of News Category Predictor

Published:2020-10-26 Issue:5 Volume:10 Page:6380-6385
ISSN:1792-8036
Container-title:Engineering, Technology & Applied Science Research
Short-container-title:Eng. Technol. Appl. Sci. Res.

Author:

Hussain A., Ali G., Akhtar F., Khand Z. H., Ali A.

Abstract

Recent technological advancements have significantly changed the way news is produced, consumed, and disseminated. They have enabled frequent, on-the-spot news reporting that can be accessed from smartphones anywhere and at any time. News categorization, or classification, can significantly help in the proper and timely dissemination of news. This study evaluates and compares the performance of news category predictors based on four supervised machine learning models. We chose a standard dataset of British Broadcasting Corporation (BBC) news consisting of five categories: business, sports, technology, politics, and entertainment. Four multi-class news category predictors were developed and trained on the same dataset: Naïve Bayes, Random Forest, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). Each category predictor's performance was evaluated by analyzing its confusion matrix and computing precision, recall, and overall accuracy on the test dataset. Finally, the performance of all category predictors was compared. The results show that all category predictors achieved satisfactory accuracy. The SVM model outperformed the other three supervised learning models, categorizing news articles with 98.3% accuracy, while the KNN model obtained the lowest accuracy. The KNN model's performance can, however, be enhanced by searching for the optimal number of neighbors (K).
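The evaluation step described in the abstract (reading precision, recall, and overall accuracy off a confusion matrix) can be illustrated with a minimal sketch. The function names `per_class_metrics` and `overall_accuracy` and the matrix `toy_cm` are hypothetical and introduced only for illustration; the counts are invented toy numbers, not results from the paper, and the row/column convention (rows are true categories, columns are predicted categories) is an assumption.

```python
from typing import Dict, List


def per_class_metrics(cm: List[List[int]], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Per-category precision and recall from a square confusion matrix.

    Assumed convention: cm[i][j] counts articles whose true category is
    labels[i] and whose predicted category is labels[j].
    """
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        true_positives = cm[k][k]
        predicted_as_k = sum(row[k] for row in cm)  # column sum
        actually_k = sum(cm[k])                     # row sum
        precision = true_positives / predicted_as_k if predicted_as_k else 0.0
        recall = true_positives / actually_k if actually_k else 0.0
        metrics[label] = {"precision": precision, "recall": recall}
    return metrics


def overall_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all articles that lie on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0


# Toy counts invented for illustration; NOT results from the paper.
labels = ["business", "sports", "technology", "politics", "entertainment"]
toy_cm = [
    [9, 0, 1, 0, 0],
    [0, 10, 0, 0, 0],
    [1, 0, 8, 0, 1],
    [0, 0, 0, 10, 0],
    [0, 0, 1, 0, 9],
]

metrics = per_class_metrics(toy_cm, labels)
accuracy = overall_accuracy(toy_cm)

for label in labels:
    m = metrics[label]
    print(f"{label:14s} precision={m['precision']:.2f} recall={m['recall']:.2f}")
print(f"overall accuracy={accuracy:.2f}")
```

Computing the same quantities for each of the four predictors on one shared test split would give a like-for-like comparison of the kind the abstract reports.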

Publisher

Engineering, Technology & Applied Science Research

References: 33 articles.

[1] A. A. Hakim, A. Erwin, K. I. Eng, M. Galinium, and W. Muliady, "Automated document classification for news article in Bahasa Indonesia based on term frequency inverse document frequency (TF-IDF) approach," in 2014 6th International Conference on Information Technology and Electrical Engineering (ICITEE), Yogyakarta, Indonesia, Oct. 2014.

[2] G. Mujtaba, L. Shuib, R. G. Raj, R. Rajandram, and K. Shaikh, "Prediction of cause of death from forensic autopsy reports using text classification techniques: A comparative study," Journal of Forensic and Legal Medicine, vol. 57, pp. 41-50, Jul. 2018.

[3] V. S. Padala, K. Gandhi, and D. V. Pushpalatha, "Machine learning: the new language for applications," IAES International Journal of Artificial Intelligence (IJ-AI), vol. 8, no. 4, pp. 411-421, Dec. 2019.

[4] F. Miao, P. Zhang, L. Jin, and H. Wu, "Chinese News Text Classification Based on Machine Learning Algorithm," in 2018 10th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), Hangzhou, China, Aug. 2018, vol. 02, pp. 48-51.

[5] S. M. H. Dadgar, M. S. Araghi, and M. M. Farahani, "A novel text mining approach based on TF-IDF and Support Vector Machine for news classification," in 2016 IEEE International Conference on Engineering and Technology (ICETECH), Coimbatore, India, Mar. 2016, pp. 112-116.

Cited by 4 articles.

1. Semantic Similarity of Common Verbal Expressions in Older Adults through a Pre-Trained Model;Big Data and Cognitive Computing;2023-12-29

2. A Comparative Analysis of SVM, LSTM and CNN-RNN Models for the BBC News Classification;Innovations in Smart Cities Applications Volume 6;2023

3. Classification of Articles from Mass Media by Categories and Relevance of the Subject Area;Modeling and Analysis of Information Systems;2022-09-25

4. Analysis of Text Feature Extractors using Deep Learning on Fake News;Engineering, Technology & Applied Science Research;2021-04-11