MultArtRec: A Multimodal Neural Topic Modeling for Integrating Image and Text Features in Artwork Recommendation
Published: 2024-01-10
Issue: 2
Volume: 13
Page: 302
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics
Author:
Wang Jiayun 1, Maeda Akira 2, Kawagoe Kyoji 2
Affiliation:
1. Graduate School of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan
2. College of Information Science and Engineering, Ritsumeikan University, Shiga 525-8577, Japan
Abstract
Recommender systems help users obtain the content they need from massive amounts of information, and artwork recommendation is a topic that has attracted growing attention. However, existing artwork recommender systems rarely consider user preferences and multimodal information at the same time, even though utilizing all of this information has the potential to produce better personalized recommendations. To better apply recommender systems to the artwork-recommendation scenario, we propose a new neural topic modeling (NTM)-based multimodal artwork recommender system (MultArtRec) that takes all of this information into account simultaneously and extracts effective features representing user preferences from multimodal content. In addition, to improve MultArtRec's performance on monomodal feature extraction, we add a novel topic loss term to the conventional NTM loss. The first two experiments in this study compare the performance of different models given different monomodal inputs. The results show that, compared to the second-best model, MultArtRec improves performance by up to 174.8% with image modality inputs and by up to 10.7% with text modality inputs. The third experiment compares the performance of MultArtRec with monomodal versus multimodal inputs; with multimodal inputs, performance improves by up to 15.9%. The last experiment preliminarily tests the versatility of MultArtRec in a fashion recommendation scenario that considers clothing image content and user preferences. The results show that MultArtRec outperforms the other methods across all metrics.
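The abstract states that a novel topic loss term is added to the conventional NTM loss but does not spell out the formulation here. The following is a minimal PyTorch sketch of what such an objective could look like: a VAE-style NTM loss (bag-of-words reconstruction plus KL divergence) extended with a topic-diversity penalty standing in for the paper's topic loss. The function name ntm_loss, the diversity form of the regularizer, and the weight lam are illustrative assumptions, not the authors' implementation.

# Illustrative sketch only: a VAE-style neural topic model (NTM) loss with an
# added "topic loss" regularizer. The exact MultArtRec architecture and loss
# are defined in the paper; every name below and the form of the topic term
# are assumptions made for illustration.
import torch
import torch.nn.functional as F

def ntm_loss(x_bow, x_recon_logits, mu, logvar, beta_topics, lam=0.1):
    """Conventional NTM (VAE) loss plus a hypothetical topic regularizer.

    x_bow:          (batch, vocab) bag-of-words input
    x_recon_logits: (batch, vocab) decoder output (pre-softmax)
    mu, logvar:     (batch, n_topics) variational posterior parameters
    beta_topics:    (n_topics, vocab) topic-word matrix
    lam:            weight of the added topic term (assumed hyperparameter)
    """
    # Reconstruction: negative expected log-likelihood of the bag of words.
    recon = -(x_bow * F.log_softmax(x_recon_logits, dim=-1)).sum(dim=-1).mean()

    # KL divergence between the diagonal-Gaussian posterior and N(0, I).
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1).mean()

    # Hypothetical topic loss: penalize pairwise cosine similarity between
    # topic-word vectors so topics stay distinct. (The paper's actual topic
    # loss term may differ.)
    b = F.normalize(beta_topics, dim=-1)
    sim = b @ b.t()
    off_diag = sim - torch.eye(sim.size(0), device=sim.device)
    topic_loss = off_diag.abs().mean()

    return recon + kl + lam * topic_loss

Under this sketch, lam trades off topic distinctness against reconstruction quality; the same scalar loss would apply whether the encoder input is a monomodal feature or a fused image-text feature, which is consistent with the abstract's monomodal and multimodal experiments.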
Funder:
JSPS KAKENHI Grant