1. Conservative Q-Learning for Offline Reinforcement Learning;kumar;Advances in Neural Information Processing Systems,2020
2. Scalable Methods for Computing State Similarity in Deterministic Markov Decision Processes
3. Off-Policy Deep Reinforcement Learning without Exploration;fujimoto;International Conference on Machine Learning (ICML),2019
4. Metrics for finite Markov decision processes;ferns;Uncertainty in Artificial Intelligence (UAI),2004