1. Sigmoid-weighted linear units for neural network function approximation in reinforcement learning
2. Squeeze-and-excitation networks;hu;Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,2018
3. Xception: Deep learning with depthwise separable convolutions;chollet;Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,2017
4. Batch normalization: Accelerating deep network training by reducing internal covariate shift;ioffe;International Conference on Machine Learning,2015
5. Attention is all you need;vaswani;Advances in neural information processing systems,2017