1. Random search for hyper‐parameter optimization;Bergstra J.;Journal of Machine Learning Research,2012
2. Boltyanskii V. G. Gamkrelidze R. V. &Pontryagin. (1960). Theory of optimal processes i: Maximum principle.News of Akad. Nauk SSSR. Mathematics Series 24 3‐42.
3. Optimization Methods for Large-Scale Machine Learning
4. Bryson A. E.(1961). A gradient method for optimizing multi‐stage allocation processes. InProceedings of Harvard University symposium on digital computers and their applications(Vol. 72).