"Deep reinforcement learning for search, recommendation, and online advertising: a survey" by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator-Reference-Cited by-同舟云学术

"Deep reinforcement learning for search, recommendation, and online advertising: a survey" by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator

Published: 2019-07-29
Issue: Spring
Volume:
Page: 1-15
ISSN: 1931-1745
Container-title: ACM SIGWEB Newsletter
Language: en
Short-container-title: SIGWEB Newsl.

Author:

Zhao Xiangyu¹, Xia Long², Tang Jiliang¹, Yin Dawei²

Affiliation:

1. Michigan State University

2. JD.com

Abstract

Search, recommendation, and online advertising are the three most important information-providing mechanisms on the web. These information-seeking techniques, which satisfy users' information needs by suggesting personalized objects (information or services) to users at the appropriate time and place, play a crucial role in mitigating the information overload problem. With recent great advances in deep reinforcement learning (DRL), there has been increasing interest in developing DRL-based information-seeking techniques. These DRL-based techniques have two key advantages: (1) they are able to continuously update information-seeking strategies according to users' real-time feedback, and (2) they can maximize the expected cumulative long-term reward from users, where the reward has different definitions according to the information-seeking application, such as click-through rate, revenue, user satisfaction, and engagement. In this paper, we give an overview of deep reinforcement learning for search, recommendation, and online advertising from methodologies to applications, review representative algorithms, and discuss some appealing research directions.

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3320496.3320500

References: 100 articles.

1. Optimal control of Markov processes with incomplete state information

2. The Nonstochastic Multiarmed Bandit Problem

3. Bellman R. 2013. Dynamic programming. Courier Corporation.

Cited by 34 articles.

1. Debiased Recommendation with Noisy Feedback;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

2. M³Rec: A Context-Aware Offline Meta-Level Model-Based Reinforcement Learning Approach for Cold-Start Recommendation;ACM Transactions on Information Systems;2024-08-19

3. Research on heterogeneous multi-UAV collaborative decision-making method based on improved PPO;Applied Intelligence;2024-07-29

4. Multi-Objective Contextual Bandits in Recommendation Systems for Smart Tourism;2024-05-29

5. UISA: User Information Separating Architecture for Commodity Recommendation Policy with Deep Reinforcement Learning;ACM Transactions on Recommender Systems;2024-04-06