1. Imagenet classification with deep convolutional neural networks;Krizhevsky,2012
2. P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, Y. Lecun, Overfeat: integrated recognition, localization and detection using convolutional networks, arXiv:1312.6229 (2013).
3. Fully convolutional networks for semantic segmentation;Long,2015
4. Beyond bilinear: generalized multimodal factorized high-order pooling for visual question answering;Yu;IEEE Trans. Neural Netw. Learn. Syst.,2018
5. Multimodal transformer with multi-view visual representation for image captioning;Yu;IEEE Trans. Circuit. Syst. Video Technol.,2019