Web architecture for URL-based phishing detection based on Random Forest, Classification Trees, and Support Vector Machine

Published:2022-05-09 Issue:69 Volume:25 Page:107-121
ISSN:1988-3064
Container-title:Inteligencia Artificial
Short-container-title:ia

Author:

Lamas Piñeiro Julio, Wong Portillo Lenis

Abstract

Phishing is a serious problem that has intensified considerably during the coronavirus pandemic, a period in which people rely on the Internet more than ever, even for daily payments. Tools have been developed to detect phishing, but many are computationally complex and not easy for an ordinary user to operate. In this work, we propose a web architecture that predicts whether a web address is a phishing address, based mainly on three machine learning techniques: Random Forest, Classification Trees, and Support Vector Machine. Three models are developed, one with each technique, along with two further models built on those three. All of them are applied to web addresses previously processed by a feature extraction module. Everything is deployed in an API consumed by a frontend, so that any user can use it and choose which type of model to predict with. The results show that the best-performing model when predicting both outcomes is the Classification Trees model, which obtains a precision and an accuracy of 80%.
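
The abstract describes a feature extraction module that processes each web address before the models see it, plus two models built from the three base models. It does not list the features or say how the base models are combined, so the following is only a minimal sketch under assumptions: the feature names in `extract_features` are hypothetical lexical features, and `majority_vote` is one possible way to combine three binary predictions, not necessarily the authors' method.

```python
import re
from urllib.parse import urlparse

# Matches a dotted-quad host such as "192.168.0.1" (no range check on octets).
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def extract_features(url: str) -> dict:
    """Turn a URL into simple lexical features.

    Assumption: these features are illustrative only; the abstract does not
    specify which features the paper's module extracts.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_at_symbol": "@" in url,
        "has_ip_host": bool(IPV4_RE.match(host)),
        "uses_https": parsed.scheme == "https",
    }


def majority_vote(predictions: list) -> bool:
    """Combine binary predictions (True = phishing) by strict majority.

    Assumption: voting is a stand-in for the paper's unspecified
    combined models.
    """
    return sum(predictions) * 2 > len(predictions)


if __name__ == "__main__":
    features = extract_features("http://192.168.0.1/login@secure")
    print(features)
    # Hypothetical outputs of three base classifiers for one URL.
    print(majority_vote([True, False, True]))
```

In a full pipeline, the feature dictionary would be converted to a numeric vector and passed to each trained classifier; the API would then return either a single model's prediction or the combined one, depending on the user's choice.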

Publisher

IBERAMIA: Sociedad Iberoamericana de Inteligencia Artificial

Subject

Artificial Intelligence, Software

Cited by 6 articles.

1. DDoS Attacks Detection based on Machine Learning Algorithms in IoT Environments;Inteligencia Artificial;2024-07-11

2. Combating Phishing in the Age of Fake News: A Novel Approach with Text-to-Text Transfer Transformer;Proceedings of the 1st Workshop on Security-Centric Strategies for Combating Information Disorder;2024-07

3. Detect malicious websites by building a neural network to capture global and local features of websites;Computers & Security;2024-02

4. Correlation n-ptychs of Multidimensional Datasets;Lecture Notes in Networks and Systems;2024

5. BERT-Based Approaches to Identifying Malicious URLs;Sensors;2023-10-16