Author:
Christodoulou Anna-Maria, Lartillot Olivier, Jensenius Alexander Refsum
Abstract
The term “multimodal music dataset” is often used to describe music-related datasets that represent music as a multimedia art form and multimodal experience. However, the term “multimodality” is often used differently in disciplines such as musicology, music psychology, and music technology. This paper proposes a definition of multimodality that works across different music disciplines. Many challenges are related to constructing, evaluating, and using multimodal music datasets. We provide a task-based categorization of multimodal datasets and suggest guidelines for their development. Diverse data pre-processing methods are illuminated, highlighting their contributions to transparent and reproducible music analysis. Additionally, evaluation metrics, methods, and benchmarks tailored for multimodal music processing tasks are scrutinized, empowering researchers to make informed decisions and facilitating cross-study comparisons.
Publisher
Springer Science and Business Media LLC