Top-K Off-Policy Correction for a REINFORCE Recommender System

Published: 2019-01-30
Container-title: Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining

Author:

Chen Minmin¹, Beutel Alex¹, Covington Paul¹, Jain Sagar¹, Belletti Francois¹, Chi Ed H.¹

Affiliation:

1. Google, Mountain View, CA, USA

Publisher

ACM

Link

https://dl.acm.org/doi/pdf/10.1145/3289600.3290999

Reference: 50 articles.

1. Joshua Achiam, David Held, Aviv Tamar, and Pieter Abbeel. 2017. Constrained policy optimization. arXiv preprint arXiv:1705.10528 (2017).

2. Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning, Vol. 47, 2--3 (2002), 235--256. 10.1023/A:1013689704352

3. Yoshua Bengio, Jean-Sébastien Senécal, et al. 2003. Quick Training of Probabilistic Neural Nets by Importance Sampling. In AISTATS. 1--9.

Cited by 229 articles.

1. UET4Rec: U-net encapsulated transformer for sequential recommender;Expert Systems with Applications;2024-12

2. Optimizing Novelty of Top-k Recommendations using Large Language Models and Reinforcement Learning;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

3. Modeling User Retention through Generative Flow Networks;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

4. Future Impact Decomposition in Request-level Recommendations;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

5. On (Normalised) Discounted Cumulative Gain as an Off-Policy Evaluation Metric for Top-n Recommendation;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24