1. D. Amodei, C. Olah, J. Steinhardt, P. Christiano, J. Schulman, D. Mané, Concrete problems in AI safety. arXiv:1606.06565 [cs.AI] (2016).
2. T. Arnold, D. Kasenberg, M. Scheutz, Value Alignment or Misalignment—What Will Keep Systems Accountable? AAAI Workshops (2017).
3. A. Y. Ng, D. Harada, S. J. Russell, Policy invariance under reward transformations: Theory and application to reward shaping, in Proceedings of the Sixteenth International Conference on Machine Learning (ICML 1999), pp. 278–287.
4. P. F. Christiano, B. Shlegeris, D. Amodei, Supervising strong learners by amplifying weak experts. arXiv:1810.08575 [cs.LG] (2018).
5. D. Hadfield-Menell, S. J. Russell, P. Abbeel, A. Dragan, Cooperative inverse reinforcement learning, in Proceedings of Advances in Neural Information Processing Systems (NeurIPS 2016), pp. 3909–3917.