Formalizing Multimedia Recommendation through Multimodal Deep Learning

Authors:

Malitesta Daniele (1), Cornacchia Giandomenico (2), Pomo Claudio (3), Merra Felice Antonio (4), Di Noia Tommaso (5), Di Sciascio Eugenio (5)

Affiliations:

1. CentraleSupélec, Université Paris-Saclay, Gif-sur-Yvette, France

2. IBM Research Europe - Ireland, Dublin, Ireland

3. Politecnico di Bari, Bari, Italy

4. Amazon Science, Berlin, Germany

5. Dipartimento di Ingegneria Elettrica e dell'Informazione, Politecnico di Bari, Bari, Italy

Abstract

Recommender systems (RSs) provide customers with a personalized navigation experience within the vast catalogs of products and services offered on popular online platforms. Despite the substantial success of traditional RSs, recommendation remains a highly challenging task, especially in specific scenarios and domains. For example, human affinity for items described through multimedia content (e.g., images, audio, and text), such as fashion products, movies, and music, is multi-faceted and primarily driven by their diverse characteristics. Therefore, by leveraging all available signals in such scenarios, multimodality enables us to tap into richer information sources and construct more refined user/item profiles for recommendations. Despite the growing number of multimodal techniques proposed for multimedia recommendation, the existing literature lacks a shared and universal schema for modeling and solving the recommendation problem through the lens of multimodality. Given the recent advances in multimodal deep learning for other tasks and scenarios where precise theoretical and applicative procedures exist, we consider it imperative to formalize a general multimodal schema for multimedia recommendation. In this work, we first provide a comprehensive literature review of multimodal approaches for multimedia recommendation from the last eight years. Second, we outline the theoretical foundations of a multimodal pipeline for multimedia recommendation by identifying and formally organizing recurring solutions/patterns; at the same time, we demonstrate its rationale by conceptually applying it to selected state-of-the-art approaches in multimedia recommendation. Third, we conduct a benchmarking analysis of recent algorithms for multimedia recommendation within Elliot, a rigorous framework for evaluating recommender systems, where we re-implement such multimedia recommendation approaches. Finally, we highlight the significant unresolved challenges in multimodal deep learning for multimedia recommendation and suggest possible avenues for addressing them. The primary aim of this work is to provide guidelines for designing and implementing the next generation of multimodal approaches in multimedia recommendation.

Publisher

Association for Computing Machinery (ACM)

Cited by 1 article:

1. Promoting Green Fashion Consumption in Recommender Systems;Adjunct Proceedings of the 32nd ACM Conference on User Modeling, Adaptation and Personalization;2024-06-27
