1. Agrawal A, Lu J, Antol S, Mitchell M, Zitnick L, Batra D, Parikh D (2015) VQA: visual question answering. In: Proceedings of the IEEE international conference on computer vision, pp 2425–2433
2. Li P, Sun X, Yu H, Tian Y, Yao F (2022) Entity-oriented multi-modal alignment and fusion network for fake news detection. IEEE Trans Multimedia 24:3455–3468
3. Jin Z, Cao J, Guo H, Luo J, Zhang Y (2017) Multimodal fusion with recurrent neural networks for rumor detection on microblogs. In: Proceedings of the 25th ACM international conference on multimedia, pp 795–816
4. Khattar D, Goud J, Gupta M, Varma V (2019) MVAE: multimodal variational autoencoder for fake news detection. In: The world wide web conference, pp 2915–2921
5. Singhal S, Kabra A, Sharma M, Shah RR, Chakraborty T, Kumaraguru P (2020) SpotFake+: a multimodal framework for fake news detection via transfer learning (student abstract). Proc AAAI Conf Artif Intell 34(10):13915–13916