Authors:
Menshchikov A A, Komarova A V, Gatchin Y A, Kalinkina M E, Tkalich V L, Pirozhnikova O I
Abstract
In this paper, we study how web-crawler behavior depends on the web resource being crawled. We provide a simulation model of a web crawler, which is needed both for web-robot detection techniques and for generating datasets for methods based on machine learning. We analyze the differences in behavior among humans, common crawlers, and malicious crawlers, and demonstrate that their models can be used for behavior analysis. We show that malicious crawlers behave similarly to common crawlers and that their behavior can be simulated to obtain the datasets and traffic patterns needed for further detection of, and protection against, unethical crawling. Our results and observations can serve as a basis for developing a comprehensive intrusion detection and prevention system.
Subject
General Physics and Astronomy
Cited by
2 articles.
1. Crawl Smart: A Domain-Specific Crawler;Lecture Notes in Electrical Engineering;2023-11-30
2. Design and Research of Distributed Web Crawler Based on Knowledge Graph;2022 International Conference on 3D Immersion, Interaction and Multi-sensory Experiences (ICDIIME);2022-06