Authors
Sharma Shivam, Agarwal Siddhant, Suresh Tharun, Nakov Preslav, Akhtar Md. Shad, Chakraborty Tanmoy
Abstract
Memes are powerful means for effective communication on social media. Their effortless amalgamation of viral visuals and compelling messages can have far-reaching implications with proper marketing. Previous research on memes has primarily focused on characterizing their affective spectrum and detecting whether the meme's message insinuates any intended harm, such as hate, offense, racism, etc. However, memes often use abstraction, which can be elusive. Here, we introduce a novel task - EXCLAIM, generating explanations for visual semantic role labeling in memes. To this end, we curate ExHVV, a novel dataset that offers natural language explanations of connotative roles for three types of entities - heroes, villains, and victims, encompassing 4,680 entities present in 3K memes. We also benchmark ExHVV with several strong unimodal and multimodal baselines. Moreover, we posit LUMEN, a novel multimodal, multi-task learning framework that endeavors to address EXCLAIM optimally by jointly learning to predict the correct semantic roles and correspondingly to generate suitable natural language explanations. LUMEN distinctly outperforms the best baseline across 18 standard natural language generation evaluation metrics. Our systematic evaluation and analyses demonstrate that characteristic multimodal cues required for adjudicating semantic roles are also helpful for generating suitable explanations.
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
4 articles.
1. Understanding (Dark) Humour with Internet Meme Analysis;Companion Proceedings of the ACM Web Conference 2024;2024-05-13
2. Online Fake News Opinion Spread and Belief Change: A Systematic Review;Human Behavior and Emerging Technologies;2024-04-30
3. Los memes como símbolos del discurso de odio;VISUAL REVIEW. International Visual Culture Review / Revista Internacional de Cultura Visual;2024-04-10
4. A Multi-Modal Framework for Fake News Analysis for Detection to Deconstruction;2024 2nd International Conference on Disruptive Technologies (ICDT);2024-03-15