Malicious Text Identification: Deep Learning from Public Comments and Emails

Published: 2020-06-10 Issue: 6 Volume: 11 Page: 312
ISSN: 2078-2489
Container-title: Information
Language: en
Short-container-title: Information

Author:

Baccouche Asma, Ahmed Sadaf, Sierra-Sosa Daniel, Elmaghraby Adel

Abstract

Identifying internet spam has been a challenging problem for decades. Several solutions have succeeded in detecting spam comments on social media or fraudulent emails. However, an adequate strategy for filtering messages is difficult to achieve, as these messages resemble real communications. From the Natural Language Processing (NLP) perspective, Deep Learning models are a good alternative for classifying text once it has been preprocessed. In particular, Long Short-Term Memory (LSTM) networks are among the models that perform well on binary and multi-label text classification problems. In this paper, we present an approach that merges two different data sources, one intended for Spam classification in social media posts and the other for Fraud classification in emails. We designed a multi-label LSTM model and trained it on the joint dataset, which includes text with common bigrams extracted from each independent dataset. The experimental results show that our proposed model is capable of identifying malicious text regardless of the source. The LSTM model trained on the merged dataset outperforms the models trained independently on each dataset.
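
The idea of joining two corpora through their shared bigrams can be illustrated with a minimal Python sketch. The abstract does not say how the common bigrams were selected, so everything below is an assumption made for illustration: the function names (`bigrams`, `top_bigrams`, `merge_on_common_bigrams`), the top-`k` frequency cutoff, the simple regex tokenizer, and the toy example texts are hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def bigrams(text):
    """Lowercase the text, split it into word tokens, and return adjacent pairs."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return list(zip(tokens, tokens[1:]))


def top_bigrams(texts, k):
    """Return the set of the k most frequent bigrams in a list of texts."""
    counts = Counter(bg for t in texts for bg in bigrams(t))
    return {bg for bg, _ in counts.most_common(k)}


def merge_on_common_bigrams(spam_texts, fraud_texts, k=100):
    """Keep only texts containing a bigram that is frequent in BOTH corpora.

    Assumption: "common" means the intersection of each corpus's top-k bigrams.
    Each kept text is paired with a label naming the corpus it came from.
    """
    common = top_bigrams(spam_texts, k) & top_bigrams(fraud_texts, k)
    merged = []
    for source, texts in (("spam", spam_texts), ("fraud", fraud_texts)):
        for t in texts:
            if common.intersection(bigrams(t)):
                merged.append((t, source))
    return common, merged


# Toy data, invented for this example only.
spam = ["click here to win a free prize", "subscribe to my channel"]
fraud = ["please click here to claim your funds", "dear friend I need your help"]

common, merged = merge_on_common_bigrams(spam, fraud, k=50)
print(sorted(common))
print(merged)
```

The resulting list of labeled texts is the kind of joint dataset that could then be tokenized and passed to a sequence classifier such as an LSTM; that training step is not shown here.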

Publisher

MDPI AG

Subject

Information Systems

Link

https://www.mdpi.com/2078-2489/11/6/312/pdf

Cited by 30 articles.

1. PSC-BERT: A spam identification and classification algorithm via prompt learning and spell check;Knowledge-Based Systems;2024-10

2. How Disinformation Affects Sales: Examining the Advertising Campaign of a Socially Responsible Brand;Journal of Business Research;2024-09

3. Effective text classification using BERT, MTM LSTM, and DT;Data & Knowledge Engineering;2024-05

4. Adaptive threshold optimisation for online feature selection using dynamic particle swarm optimisation in determining feature relevancy and redundancy;Applied Soft Computing;2024-05

5. Systematic Literature Review and Bibliometric Analysis on Addressing the Vanishing Gradient Issue in Deep Neural Networks for Text Data;Communications in Computer and Information Science;2024