Affiliation:
1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, P. R. China
Abstract
Music content has recently been identified as useful information for improving the performance of music recommendation. Existing studies usually feed low-level audio features, such as Mel-frequency cepstral coefficients, into deep learning models for music recommendation. However, such features do not characterize music audio well, since it often contains multiple sound sources. In this paper, we propose to model and fuse chord, melody, and rhythm features to characterize the music meaningfully and thereby improve music recommendation. Specifically, we use two user-based attention mechanisms to differentiate the importance of different parts of the audio features and the chord features. In addition, a Long Short-Term Memory layer is used to capture sequence characteristics. These features are fused by a multilayer perceptron and then used to make recommendations. We conducted experiments with a subset of the last.fm-1b dataset. The experimental results show that our proposal outperforms the best baseline by [Formula: see text] on HR@10.
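As an illustration of the user-based attention idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a dot-product score between a user vector and each feature segment, followed by a softmax; the function name user_attention_pool, the scoring function, and the toy vectors are all hypothetical, since the abstract does not specify them.

```python
import math


def user_attention_pool(user_vec, segment_vecs):
    """Pool segment vectors using weights conditioned on a user vector.

    Assumption: the score of a segment is its dot product with the user
    vector. The abstract does not state the actual scoring function.
    """
    # One relevance score per segment.
    scores = [sum(u * s for u, s in zip(user_vec, seg)) for seg in segment_vecs]

    # Softmax over scores; subtracting the max keeps exp() numerically stable.
    max_score = max(scores)
    exps = [math.exp(sc - max_score) for sc in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the segments, computed dimension by dimension.
    dim = len(segment_vecs[0])
    pooled = [
        sum(w * seg[i] for w, seg in zip(weights, segment_vecs))
        for i in range(dim)
    ]
    return weights, pooled


# Toy example: three 2-dimensional segments and one user vector.
user = [1.0, 0.0]
segments = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights, pooled = user_attention_pool(user, segments)
```

In this toy case the first segment aligns most closely with the user vector, so it receives the largest weight. In a full model the pooled vector would be one of the inputs to the fusion step.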
Funder
National Natural Science Foundation of China
Importation and Development of High-Caliber Talents Project of Beijing Municipal Institution
Beijing Municipal Education Commission
Natural Science Foundation of Beijing Municipality
Publisher
World Scientific Pub Co Pte Ltd
Subject
Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software
Cited by
2 articles.