Demo2Test: Transfer Testing of Agent in Competitive Environment with Failure Demonstrations

Author:

Chen Jianming (1), Wang Yawen (1), Wang Junjie (1), Xie Xiaofei (2), Wang Dandan (1), Wang Qing (1), Xu Fanjiang (1)

Affiliation:

1. Institute of Software, Chinese Academy of Sciences, China

2. Singapore Management University, Singapore

Abstract

Competitive games between agents arise in many critical applications, such as military unmanned aerial vehicles. Testing these agents is urgent, because their failures can cause significant losses. Existing studies mainly construct a testing agent that competes with the target agent to induce its failures. These approaches usually focus on a single task, so testing multiple tasks takes much more time. However, if the previously tested tasks (source tasks) and the task to be tested (target task) share similar agents or task objectives, transferable knowledge from the source tasks can potentially increase testing effectiveness in the target task. We propose Demo2Test for transfer testing of agents in competitive environments, i.e., leveraging demonstrations of failure scenarios from the source task to boost testing effectiveness in the target task. Demo2Test trains a testing agent with these demonstrations and incorporates action perturbation at key states to balance the number of revealed failures against their diversity. We conduct experiments in simulated competitive robotics environments in MuJoCo. The results indicate that Demo2Test outperforms the best-performing baseline, with improvements of 22.38% to 87.98% in the number of discovered failure scenarios and 12.69% to 60.98% in their diversity.
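
The idea of perturbing actions only at key states can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how key states are identified or what noise model is used, so the `is_key_state` predicate, the Gaussian noise, the scalar states and actions, and every name below are assumptions chosen purely for illustration.

```python
import random
from typing import Callable, List, Sequence


def perturb_at_key_states(
    states: Sequence[float],
    policy: Callable[[float], float],
    is_key_state: Callable[[float], bool],
    noise_scale: float,
    rng: random.Random,
) -> List[float]:
    """Return one action per state, adding noise only at key states."""
    actions = []
    for state in states:
        action = policy(state)
        if is_key_state(state):
            # Assumed perturbation: zero-mean Gaussian noise on the action.
            action += rng.gauss(0.0, noise_scale)
        actions.append(action)
    return actions


# Toy usage with hypothetical scalar states and a hypothetical linear policy.
rng = random.Random(0)
states = [0.1, 0.9, 0.5, 0.95]


def policy(state: float) -> float:
    return 2.0 * state


def is_key(state: float) -> bool:
    # Hypothetical criterion: treat states above 0.8 as key states.
    return state > 0.8


actions = perturb_at_key_states(states, policy, is_key, 0.1, rng)
```

In this toy setup the testing agent follows its learned policy everywhere except at the two states flagged as key, where the added noise lets it explore alternative behaviors. Restricting the perturbation to a few states is one plausible way to diversify the failures found without discarding what the policy learned from demonstrations.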

Publisher

Association for Computing Machinery (ACM)

