Reinforcement Learning with a Disentangled Universal Value Function for Item Recommendation

Published: 2021-05-18 Issue: 5 Volume: 35 Page: 4427-4435
ISSN: 2374-3468
Container-title: Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title: AAAI

Author:

Wang Kai, Zou Zhene, Deng Qilin, Tao Jianrong, Wu Runze, Fan Changjie, Chen Liang, Cui Peng

Abstract

In recent years, there has been great interest, along with many challenges, in applying reinforcement learning (RL) to recommendation systems (RS). In this paper, we summarize three key practical challenges of large-scale RL-based recommender systems: massive state and action spaces, a high-variance environment, and the unspecific reward setting in recommendation. All these problems remain largely unexplored in the existing literature and make the application of RL challenging. We develop a model-based reinforcement learning framework called GoalRec. Inspired by the ideas of world models (model-based), value function estimation (model-free), and goal-based RL, we propose a novel disentangled universal value function designed for item recommendation. It can generalize to the various goals that the recommender may have, and it disentangles the stochastic environmental dynamics from the high-variance reward signals. As a part of the value function, and free from the sparse and high-variance reward signals, a high-capacity reward-independent world model is trained to simulate complex environmental dynamics under a certain goal. Based on the predicted environmental dynamics, the disentangled universal value function is related to the user's future trajectory instead of a monolithic state and a scalar reward. We demonstrate the superiority of GoalRec over previous approaches with respect to the above three practical challenges in a series of simulations and a real application.
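
The abstract does not give the functional form of the value function. As a rough illustration of the general idea only, the sketch below assumes a goal-conditioned value computed as a goal-weighted sum of predicted future measurements, with a toy linear map standing in for the reward-independent world model. The names `predict_trajectory`, `goal_value`, and `recommend`, the linear model, and the dot-product form are all hypothetical and are not taken from the paper.

```python
import numpy as np

def predict_trajectory(W, state, action):
    """Hypothetical reward-independent world model (assumption): a linear map
    from concatenated (state, action) features to predicted future measurements."""
    return W @ np.concatenate([state, action])

def goal_value(W, state, action, goal):
    """Score an action as the goal-weighted sum of its predicted measurements.
    The dot-product form is an illustrative assumption, not the paper's method."""
    return float(goal @ predict_trajectory(W, state, action))

def recommend(W, state, candidate_actions, goal):
    """Return the index of the candidate with the highest goal-conditioned score."""
    scores = [goal_value(W, state, a, goal) for a in candidate_actions]
    return int(np.argmax(scores))
```

In this toy setup the goal vector can change at decision time without retraining the world model, since the model predicts measurements rather than a scalar reward.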

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 11 articles.

1. HAIChart: Human and AI Paired Visualization System; Proceedings of the VLDB Endowment; 2024-07

2. CDCM: ChatGPT-Aided Diversity-Aware Causal Model for Interactive Recommendation; IEEE Transactions on Multimedia; 2024

3. Knowledge-Enhanced Causal Reinforcement Learning Model for Interactive Recommendation; IEEE Transactions on Multimedia; 2024

4. Goal-Oriented Multi-Modal Interactive Recommendation with Verbal and Non-Verbal Relevance Feedback; Proceedings of the 17th ACM Conference on Recommender Systems; 2023-09-14

5. Interpretability for reliable, efficient, and self-cognitive DNNs: From theories to applications; Neurocomputing; 2023-08