HILC

Published:2019-04-25 Issue:2-3 Volume:9 Page:1-27
ISSN:2160-6455
Container-title:ACM Transactions on Interactive Intelligent Systems
language:en
Short-container-title:ACM Trans. Interact. Intell. Syst.

Author:

Intharah Thanapong¹, Turmukhambetov Daniyar², Brostow Gabriel J.²

Affiliation:

1. University College London and Khon Kaen University, Thailand

2. University College London, United Kingdom

Abstract

Creating automation scripts for tasks involving Graphical User Interface (GUI) interactions is hard. It is challenging because not all software applications allow access to a program’s internal state, nor do they all have accessibility APIs. Although much of the internal state is exposed to the user through the GUI, it is hard to programmatically operate the GUI’s widgets. To that end, we developed a system prototype that learns by demonstration, called HILC (Help, It Looks Confusing). Users, both programmers and non-programmers, train HILC to synthesize a task script by demonstrating the task. A demonstration produces the needed screenshots and their corresponding mouse-keyboard signals. After the demonstration, the user answers follow-up questions. We propose a user-in-the-loop framework that learns to generate scripts of actions performed on visible elements of graphical applications. Although pure programming by demonstration is still unrealistic due to a computer’s limited understanding of user intentions, we use quantitative and qualitative experiments to show that non-programming users are both willing to answer and effective at answering the follow-up queries posed by our system, to help with confusing parts of the demonstrations. Our models of events and appearances are surprisingly simple but are combined effectively to cope with varying amounts of supervision. The best available baseline, Sikuli Slides, struggled to assist users in the majority of the tests in our user study experiments. The prototype with our proposed approach successfully helped users accomplish simple linear tasks, complicated tasks (monitoring, looping, and mixed), and tasks that span multiple applications. Even when both systems could ultimately perform a task, ours was trained and refined by the user in less time.

Funder

Ministry of Science and Technology of Thailand Scholarship and EPSRC

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Human-Computer Interaction

Link

https://dl.acm.org/doi/pdf/10.1145/3234508

Reference 36 articles.

1. Waken

Cited by 8 articles.

1. MIWA: Mixed-Initiative Web Automation for Better User Control and Confidence;Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology;2023-10-29

2. Task Automation Intelligent Agents: A Review;Future Internet;2023-05-29

3. Screen2Vec: Semantic Embedding of GUI Screens and GUI Components;Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems;2021-05-06

4. Demonstration + Natural Language: Multimodal Interfaces for GUI-Based Interactive Task Learning Agents;Human–Computer Interaction Series;2021

5. Privacy-Preserving Script Sharing in GUI-based Programming-by-Demonstration Systems;Proceedings of the ACM on Human-Computer Interaction;2020-05-28