A practical guide to multi-objective reinforcement learning and planning

Authors

Conor F. Hayes, Roxana Rădulescu, Eugenio Bargiacchi, Johan Källström, Matthew Macfarlane, Mathieu Reymond, Timothy Verstraeten, Luisa M. Zintgraf, Richard Dazeley, Fredrik Heintz, Enda Howley, Athirai A. Irissappane, Patrick Mannion, Ann Nowé, Gabriel Ramos, Marcello Restelli, Peter Vamplew, Diederik M. Roijers

Abstract

Real-world sequential decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods and who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems.
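The abstract's point about linear combinations has a concrete geometric basis: a policy whose vector return lies on a concave region of the Pareto front maximizes no non-negative linear weighting of the objectives, so linear scalarization can never select it. The Python sketch below illustrates this with four hypothetical policies; the returns matrix and the helper names linear_scalarize and pareto_front are illustrative assumptions, not taken from the paper.

import numpy as np

# Hypothetical vector returns for four policies over two objectives.
# Values are illustrative only.
returns = np.array([
    [1.0, 0.0],    # A: best on objective 1
    [0.0, 1.0],    # B: best on objective 2
    [0.45, 0.45],  # C: balanced, on a concave part of the front
    [0.3, 0.2],    # D: dominated by C
])

def linear_scalarize(v, w):
    """Collapse vector returns to scalars via weights w (w >= 0, sum 1)."""
    return v @ w

def pareto_front(points):
    """Return indices of points not dominated by any other point."""
    keep = []
    for i, p in enumerate(points):
        dominated = any(
            np.all(q >= p) and np.any(q > p)
            for j, q in enumerate(points) if j != i
        )
        if not dominated:
            keep.append(i)
    return keep

# Sweep linear weights: policy C never wins for any weighting, even
# though it is Pareto-optimal -- linear scalarization cannot reach
# solutions on concave regions of the Pareto front.
winners = {
    int(np.argmax(linear_scalarize(returns, np.array([w, 1 - w]))))
    for w in np.linspace(0, 1, 101)
}
print("chosen by some linear weighting:", sorted(winners))  # [0, 1]
print("Pareto-optimal policies:", pareto_front(returns))    # [0, 1, 2]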

Funder

Vlaamse regering (Flemish Government)

National University of Ireland, Galway

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence


Cited by 101 articles.
