1. An image is worth 16×16 words: Transformers for image recognition at scale;dosovitskiy;International Conference on Learning Representations (ICLR),0
2. Data augmentation as feature manipulation: a story of desert cows and grass cows;shen;International Conference on Machine Learning (ICML),0
3. Deep Domain Generalization With Structured Low-Rank Constraint
4. Generalizing across domains via cross-gradient training;shankar;ArXiv Preprint,2018
5. Computing nonvacuous generalization bounds for deep (stochastic) neural networks with many more parameters than training data;karolina dziugaite;Uncertainty in Artificial Intelligence (UAI),2017