1. Bellemare (2013). The Arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research.
2. Bertsekas, D. P., & Ioffe, S. (1996). Temporal differences based policy iteration and applications in neuro-dynamic programming. Technical Report LIDS-P-2349, MIT.
3. Burgiel (1997). How to lose at Tetris. Mathematical Gazette.
4. Deng (2009). ImageNet: A large-scale hierarchical image database.
5. Elfwing (2015). Expected energy-based restricted Boltzmann machine for classification. Neural Networks.