Intrinsically motivated reinforcement learning based recommendation with counterfactual data augmentation
Published: 2023-07-15
Volume: 26
Issue: 5
Pages: 3253-3274
ISSN: 1386-145X
Container title: World Wide Web
Short container title: World Wide Web
Language: en
Authors: Chen Xiaocong, Wang Siyu, Qi Lianyong, Li Yong, Yao Lina
Abstract
Deep reinforcement learning (DRL) has shown promising results in modeling dynamic user preferences in recommender systems (RS) in recent literature. However, training a DRL agent in a sparse RS environment poses a significant challenge: the agent must balance exploring informative user-item interaction trajectories against exploiting existing trajectories for policy learning, the well-known exploration-exploitation trade-off. This trade-off substantially affects recommendation performance when the environment is sparse. In DRL-based RS, balancing exploration and exploitation is even more challenging, as the agent needs to explore informative trajectories deeply and exploit them efficiently. To address this issue, we propose a novel intrinsically motivated reinforcement learning (IMRL) method that enhances the agent's capability to explore informative interaction trajectories in sparse environments. We further enrich these trajectories via an adaptive counterfactual augmentation strategy with a customised threshold to improve their efficiency in exploitation. Extensive experiments on six offline datasets and three online simulation platforms show that IMRL outperforms existing state-of-the-art methods in recommendation performance in the sparse RS environment.
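The abstract does not specify how the intrinsic motivation is computed, but a common way to encourage exploration of rarely visited user-item interactions is to add a count-based bonus to the environment reward. The sketch below is an illustrative assumption, not the paper's method: the class name, the state key, and the `beta / sqrt(N(s))` bonus form are all hypothetical.

```python
import math
from collections import defaultdict


class IntrinsicRewardShaper:
    """Count-based exploration bonus (illustrative sketch only).

    Shapes the reward as r_total = r_ext + beta / sqrt(N(s)), so that
    rarely visited states receive a larger bonus and the agent is pushed
    toward informative, under-explored interaction trajectories. The
    paper's actual IMRL formulation is not given in the abstract.
    """

    def __init__(self, beta: float = 0.1):
        self.beta = beta
        self.visit_counts = defaultdict(int)  # N(s) per state key

    def shape(self, state_key: str, extrinsic_reward: float) -> float:
        # Increment the visit count for this state, then add a bonus
        # that decays as the state becomes familiar.
        self.visit_counts[state_key] += 1
        bonus = self.beta / math.sqrt(self.visit_counts[state_key])
        return extrinsic_reward + bonus
```

Under this sketch, the first visit to a user-item state gets the full `beta` bonus, and repeated visits get progressively smaller bonuses, which mirrors the exploration-exploitation balance the abstract describes.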
Funder: University of New South Wales
Publisher: Springer Science and Business Media LLC
Subjects: Computer Networks and Communications, Hardware and Architecture, Software
Cited by: 2 articles.