1. A closer look at invalid action masking in policy gradient algorithms;Huang,2020
2. Stochastic variational inference;Hoffman;Journal of Machine Learning Research,2013
3. Stable Baselines3;Raffin,2019
4. OpenAI Gym;Brockman,2016
5. Pyro: Deep universal probabilistic programming;Bingham;Journal of Machine Learning Research,2019