Affiliation:
1. Maulana Azad National Institute of Technology, Bhopal, India
2. JayPee University of Information Technology, Solan, India
Abstract
Web scraping is a technique for automatically extracting specific information from web applications instead of copying it manually. A web scraper searches for a certain class of information, extracts it, and aggregates it into a new database. More precisely, web scrapers transform unstructured web data into structured data stored in databases. Web scraping is also a persistent threat to web applications, since it can be used to steal sensitive data from a victim or from the applications themselves. The key objective of this article is to examine the extent to which web scraping threatens web application security. The article reviews the classification of web scraping, including content scraping, web scraping, price scraping, and database scraping, and presents the most widely used scraping tools, such as Web Content Extractor and Screen Scrapper. It then evaluates the vulnerabilities and threats that web scraping poses to web applications and outlines effective measures to counter them.
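To make the idea of turning unstructured web data into a structured database concrete, the following is a minimal Python sketch rather than anything taken from the article. It assumes a hypothetical HTML fragment (`SAMPLE_HTML`) with `name` and `price` spans, and the names `ProductParser`, `scrape_to_db`, and the `products` table are illustrative only. It uses just the standard library and parses an inline string, so no real site is contacted.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment (an assumption for illustration);
# a real scraper would first download the page over HTTP.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects (name, price) pairs from spans with class 'name' or 'price'."""

    def __init__(self):
        super().__init__()
        self.rows = []        # extracted (name, price) tuples
        self._field = None    # class of the span currently being read
        self._current = {}    # fields gathered for the current list item

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        # Text outside the spans of interest is ignored.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and len(self._current) == 2:
            self.rows.append(
                (self._current["name"], float(self._current["price"]))
            )
            self._current = {}


def scrape_to_db(html, conn):
    """Extract product rows from the markup and store them in a SQLite table."""
    parser = ProductParser()
    parser.feed(html)
    conn.execute("CREATE TABLE IF NOT EXISTS products (name TEXT, price REAL)")
    conn.executemany("INSERT INTO products VALUES (?, ?)", parser.rows)
    conn.commit()
    return parser.rows


conn = sqlite3.connect(":memory:")
rows = scrape_to_db(SAMPLE_HTML, conn)
```

The same three steps (locate a class of information, extract it, aggregate it into a database) apply to price or content scraping in general; only the selection rules in the parser change.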