1. Pre-Trained Image Processing Transformer
2. Dropout as a Bayesian approximation: Representing model uncertainty in deep learning; Gal; International Conference on Machine Learning, 2016
3. Image super-resolution using very deep residual channel attention networks; Zhang; Proceedings of the European Conference on Computer Vision (ECCV), 2018
4. Visualizing the loss landscape of neural nets; Li; Advances in Neural Information Processing Systems, 2018
5. Attention is all you need; Vaswani; Advances in Neural Information Processing Systems, 2017