1. Do Better ImageNet Models Transfer Better?
2. The evolved transformer;so,0
3. Exploring self-attention for image recognition;zhao;Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition,2021
4. Attention is all you need;vaswani;Advances in Neural Information Processing Systems,2017