1. Learning multi-view deep features for small object retrieval in surveillance scenarios;Guo,2015
2. Convolutional two-stream network fusion for video action recognition;Feichtenhofer,2016
3. Deep multi-view representation learning for video anomaly detection using spatiotemporal autoencoders;Deepak;Circuits, Systems, and Signal Processing,2020
4. Multimodal learning with deep Boltzmann machines;Srivastava;Journal of Machine Learning Research (JMLR),2014
5. Deep captioning with multimodal recurrent neural networks (m-RNN);Mao,2015