Crawling Ajax-Based Web Applications through Dynamic Analysis of User Interface State Changes

Published:2012-03 Issue:1 Volume:6 Page:1-30
ISSN:1559-1131
Container-title:ACM Transactions on the Web
language:en
Short-container-title:ACM Trans. Web

Author:

Mesbah Ali¹, van Deursen Arie², Lenselink Stefan²

Affiliation:

1. University of British Columbia

2. Delft University of Technology

Abstract

Using JavaScript and dynamic DOM manipulation on the client side of Web applications is becoming a widespread approach for achieving rich interactivity and responsiveness in modern Web applications. At the same time, such techniques, collectively known as Ajax, shatter the concept of webpages with unique URLs, on which traditional Web crawlers are based. This article describes a novel technique for crawling Ajax-based applications through automatic dynamic analysis of user-interface-state changes in Web browsers. Our algorithm scans the DOM tree, spots candidate elements that are capable of changing the state, fires events on those candidate elements, and incrementally infers a state machine that models the various navigational paths and states within an Ajax application. This inferred model can be used in program comprehension and in analysis and testing of dynamic Web states, for instance, or for generating a static version of the application. In this article, we discuss our sequential and concurrent Ajax crawling algorithms. We present our open source tool called Crawljax, which implements the concepts and algorithms discussed in this article. Additionally, we report a number of empirical studies in which we apply our approach to a number of open-source and industrial Web applications and elaborate on the obtained results.
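
The crawling loop described in the abstract (scan the DOM, spot candidate elements, fire events, incrementally infer a state machine) can be illustrated with a minimal Python sketch. This is not Crawljax's implementation or API. It assumes a hypothetical in-memory toy application (`TOY_APP`) in place of a real browser, identifies states by hashing the DOM string, and uses illustrative helper names (`state_id`, `candidate_elements`, `fire_event`, `crawl`) that do not come from the article.

```python
from collections import deque
from hashlib import sha256

# Hypothetical toy application (an assumption for illustration only):
# each DOM string maps to the clickable elements it exposes and to the
# DOM that results from firing a click event on each of them.
TOY_APP = {
    "<div id='home'>": {"#news": "<div id='news'>", "#about": "<div id='about'>"},
    "<div id='news'>": {"#home": "<div id='home'>", "#more": "<div id='news-more'>"},
    "<div id='about'>": {"#home": "<div id='home'>"},
    "<div id='news-more'>": {"#back": "<div id='news'>"},
}


def state_id(dom):
    """Identify a UI state by a short hash of its DOM string."""
    return sha256(dom.encode("utf-8")).hexdigest()[:8]


def candidate_elements(dom):
    """Return the elements in this DOM that may change the state."""
    return sorted(TOY_APP.get(dom, {}))


def fire_event(dom, element):
    """Simulate firing an event on an element; return the resulting DOM."""
    return TOY_APP[dom][element]


def crawl(initial_dom):
    """Infer a state machine: states keyed by id, plus labelled edges."""
    states = {state_id(initial_dom): initial_dom}
    edges = []  # (source state id, fired element, target state id)
    queue = deque([initial_dom])
    while queue:
        dom = queue.popleft()
        for element in candidate_elements(dom):
            new_dom = fire_event(dom, element)
            if new_dom == dom:
                continue  # the event did not change the UI state
            new_id = state_id(new_dom)
            if new_id not in states:
                states[new_id] = new_dom  # newly discovered state
                queue.append(new_dom)
            edges.append((state_id(dom), element, new_id))
    return states, edges


if __name__ == "__main__":
    found_states, found_edges = crawl("<div id='home'>")
    for source, element, target in found_edges:
        print(f"{source} --click {element}--> {target}")
```

Because the toy application is a pure lookup table, the sketch can revisit any state directly. A crawler driving a real browser would also need a way to return to an earlier state before firing the next event, and would likely compare DOM trees more tolerantly than by exact string hash.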

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications

Link

https://dl.acm.org/doi/pdf/10.1145/2109205.2109208

Reference: 37 articles.

1. Crawling Web Pages with Support for Client-Side Dynamism

2. Client-side deep Web data extraction

3. Adding Usability to Web Engineering Models and Tools

4. An adaptive crawler for locating hidden-Web entry points

5. Automated security testing of web widget interactions

Cited by 173 articles.

1. Web application testing—Challenges and opportunities;Journal of Systems and Software;2025-01

2. Semantic Constraint Inference for Web Form Test Generation;Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis;2024-09-11

3. Effective, Platform-Independent GUI Testing via Image Embedding and Reinforcement Learning;ACM Transactions on Software Engineering and Methodology;2024-06-21

4. Dead or alive: Discovering server HTTP endpoints in both reachable and dead client-side code;Journal of Information Security and Applications;2024-05

5. Guess the State: Exploiting Determinism to Improve GUI Exploration Efficiency;IEEE Transactions on Software Engineering;2024-04