1. Ishida, T. et al. "Do We Need Zero Training Loss After Achieving Zero Training Error?" Proc. 37th Int. Conf. Mach. Learn. (ICML), 2020.
2. He, K. et al. "Deep Residual Learning for Image Recognition." Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 2016.
3. Lu, J. et al. "ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks." Adv. Neural Inf. Process. Syst. (NeurIPS), 2019.
4. Bahdanau, D. et al. "Neural Machine Translation by Jointly Learning to Align and Translate." Proc. 3rd Int. Conf. Learn. Represent. (ICLR), 2015.