Transfer Learning in Multi-Armed Bandits: A Causal Approach

Author:

Zhang Junzhe1,Bareinboim Elias1

Affiliation:

1. Purdue University, West Lafayette, IN

Abstract

Reinforcement learning (RL) agents have been deployed in complex environments where interactions are costly, and learning is usually slow. One prominent task in these settings is to reuse interactions performed by other agents to accelerate the learning process. Causal inference provides a family of methods to infer the effects of actions from a combination of data and qualitative assumptions about the underlying environment. Despite its success of transferring invariant knowledge across domains in the empirical sciences, causal inference has not been fully realized in the context of transfer learning in interactive domains. In this paper, we use causal inference as a basis to support a principled and more robust transfer of knowledge in RL settings. In particular, we tackle the problem of transferring knowledge across bandit agents in settings where causal effects cannot be identified by do-calculus [Pearl, 2000] and standard learning techniques. Our new identification strategy combines two steps -- first, deriving bounds over the arm’s distribution based on structural knowledge; second, incorporating these bounds in a dynamic allocation procedure so as to guide the search towards more promising actions. We formally prove that our strategy dominates previously known algorithms and achieves orders of magnitude faster convergence rates than these algorithms. Finally, we perform simulations and empirically demonstrate that our strategy is consistently more efficient than the current (non-causal) state-of-the-art methods

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 19 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Scientific Inference with Interpretable Machine Learning: Analyzing Models to Learn About Real-World Phenomena;Minds and Machines;2024-07-15

2. A systematic literature review of solutions for cold start problem;International Journal of System Assurance Engineering and Management;2024-05-14

3. ACAV: A Framework for Automatic Causality Analysis in Autonomous Vehicle Accident Recordings;Proceedings of the IEEE/ACM 46th International Conference on Software Engineering;2024-04-12

4. Subjective Causality;Revue économique;2023-11-20

5. Darwin: Flexible Learning-based CDN Caching;Proceedings of the ACM SIGCOMM 2023 Conference;2023-09

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3