Affiliation:
1. National Institute of Technology, Raipur, India
Abstract
Web robots are autonomous software agents that crawl websites in a mechanized way for both malicious and non-malicious purposes. With the popularity of Web 2.0 services, web robots are proliferating and growing in sophistication, and web servers are flooded with their access requests. Web access requests are recorded in the form of web server logs, which contain significant knowledge about the web access patterns of visitors. The presence of web robot access requests in log repositories distorts the actual access patterns of human visitors, and these human access patterns are potentially useful for enhancing services toward greater visitor satisfaction or for optimizing server resources. In this chapter, the correlative access patterns of human visitors and web robots are discussed using the web server access logs of a portal.
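As a minimal sketch of the kind of analysis the abstract describes, the snippet below parses one web server log line (assuming the common Combined Log Format) and flags a request as robot-generated by matching keywords in the user-agent string. The field layout, keyword list, and sample line are illustrative assumptions, not the chapter's actual detection method.

```python
import re

# Assumed Combined Log Format:
# host ident authuser [time] "request" status size "referrer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative keyword list; real robot detection is more involved.
ROBOT_KEYWORDS = ("bot", "crawler", "spider", "slurp")

def classify(line):
    """Return ('robot' or 'human', parsed fields), or (None, None) on a malformed line."""
    m = LOG_PATTERN.match(line)
    if not m:
        return None, None
    agent = m.group("agent").lower()
    label = "robot" if any(k in agent for k in ROBOT_KEYWORDS) else "human"
    return label, m.groupdict()

sample = ('66.249.66.1 - - [01/Jan/2014:00:00:01 +0000] '
          '"GET /robots.txt HTTP/1.1" 200 512 "-" '
          '"Mozilla/5.0 (compatible; Googlebot/2.1)"')
print(classify(sample)[0])  # → robot
```

Separating log lines this way is the first step toward comparing human and robot access patterns: once requests are labeled, the two streams can be aggregated and profiled independently.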