Affiliation:
1. Computer Engineering, Sinhgad Institute of Technology and Science, Lonavala, India. E-mail: vikasskadam@gmail.com
2. Electronics & Telecommunication Department, Sinhgad Institute of Technology and Science, Narhe, Pune, India
Abstract
Email remains an essential part of daily life and a primary means of communication on the internet. A persistent challenge is spam email, which consumes a large amount of storage space and bandwidth. A further challenge is a shortcoming of state-of-the-art spam filtering methods: the misclassification of genuine emails as spam (false positives). The literature provides various algorithms for email spam classification, which differ in the classification technique they use. This paper aims to develop a novel spam detection model for improved cybersecurity. The proposed model involves several phases, including dataset acquisition, feature extraction, optimal feature selection, and detection. First, a benchmark email dataset containing both text and image data is collected. Next, feature extraction is performed using two sets of features: text features and visual features. For the text features, Term Frequency-Inverse Document Frequency (TF-IDF) is extracted. For the visual features, the color correlogram and the Gray-Level Co-occurrence Matrix (GLCM) are computed. Because the extracted feature vector is long, optimal feature selection is then performed using a new meta-heuristic algorithm called the Fitness Oriented Levy Improvement-based Dragonfly Algorithm (FLI-DA). Once the optimal features are selected, detection is performed by a hybrid learning technique that combines two deep learning approaches, a Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN). To improve on the performance of existing deep learning approaches, the number of hidden neurons in the RNN and the CNN is optimized by the same FLI-DA. Finally, the optimized hybrid CNN-RNN technique classifies the data as spam or ham. The experimental results show that the proposed method can perform spam email classification based on improved deep learning.
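The feature extraction phase described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the TF-IDF weighting variant, the GLCM offset, or which GLCM statistics are used, so the formulas below (tf = count / document length, idf = log(N / df), a single horizontal offset, contrast and energy statistics) and all function names are assumptions chosen for illustration. The color correlogram, FLI-DA selection, and CNN-RNN stages are omitted.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (vocabulary, vectors) for a list of non-empty tokenized documents.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            [(counts[t] / len(doc)) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def glcm(image, levels, dx=1, dy=0):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    `image` is a 2D list of integer gray levels in [0, levels); it must be
    large enough that at least one pixel pair exists for the offset.
    """
    matrix = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in matrix]


def glcm_contrast_energy(p):
    """Two common GLCM statistics: contrast and energy."""
    n = len(p)
    contrast = sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    energy = sum(v * v for row in p for v in row)
    return contrast, energy


# Hypothetical toy inputs: two tokenized emails and one 3x3 two-level image.
emails = [["win", "free", "prize", "free"], ["meeting", "at", "noon"]]
image = [[0, 0, 1], [1, 1, 0], [0, 1, 1]]

vocab, text_vectors = tf_idf(emails)
visual_features = list(glcm_contrast_energy(glcm(image, levels=2)))

# Concatenate text and visual features into one vector for the first email.
feature_vector = text_vectors[0] + visual_features
```

The concatenated vector grows with the vocabulary size, which illustrates why the abstract follows extraction with an optimal feature selection step.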
Subject
Computer Networks and Communications; Hardware and Architecture; Safety, Risk, Reliability and Quality; Software
Cited by
8 articles.