1. Agostinelli, F., Hoffman, M., Sadowski, P., Baldi, P., Learning activation functions to improve deep neural networks. arXiv:1412.6830.
2. Bengio, Y., Practical recommendations for gradient-based training of deep architectures. 2012.
3. Canziani, A., Paszke, A., Culurciello, E., An analysis of deep neural network models for practical applications. arXiv:1605.07678.
4. Choromanska, A., Henaff, M., Mathieu, M., Ben Arous, G., LeCun, Y., The loss surfaces of multilayer networks. arXiv:1412.0233.
5. Clevert, D.-A., Unterthiner, T., Hochreiter, S., Fast and accurate deep network learning by Exponential Linear Units (ELUs). arXiv:1511.07289.