MRBERT: Pre-Training of Melody and Rhythm for Automatic Music Generation
Published: 2023-02-04
Volume: 11, Issue: 4, Page: 798
ISSN: 2227-7390
Container-title: Mathematics
Short-container-title: Mathematics
Language: en
Authors:
Li Shuyu (1), Sung Yunsick (2, ORCID)
Affiliations:
1. Department of Multimedia Engineering, Graduate School, Dongguk University–Seoul, Seoul 04620, Republic of Korea
2. Department of Multimedia Engineering, Dongguk University–Seoul, Seoul 04620, Republic of Korea
Abstract
Deep learning technology has been extensively studied for its potential in music, notably in creative music generation research. Traditional music generation approaches based on recurrent neural networks cannot adequately capture long-distance dependencies. Such approaches are also typically designed for a specific task, such as melody or chord generation, and cannot generate diverse music simultaneously. In natural language processing, pre-training is used to accomplish various tasks and to overcome the limitation of long-distance dependencies; however, pre-training is not yet widely used in automatic music generation. Because of the differences between the attributes of language and music, traditional pre-trained models used in language modeling cannot be directly applied to the music field. This paper proposes a pre-trained model, MRBERT, for multitask-based music generation that learns melody and rhythm representations. After fine-tuning, the pre-trained model can be applied to music generation applications, such as web-based music composers, that include the functions of melody and rhythm generation, modification, completion, and chord matching. The results of ablation experiments performed on the proposed model revealed that, under the HITS@k evaluation metrics, the pre-trained MRBERT considerably improved the performance of the generation tasks by 0.09–13.10% and 0.02–7.37% compared to RNNs and the original BERT, respectively.
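The abstract evaluates generation quality with HITS@k. As a point of reference, a minimal sketch of HITS@k as it is commonly defined (the fraction of test cases whose ground-truth token appears among the model's top-k predictions) is shown below; the function name and the toy note data are illustrative, not taken from the paper.

```python
# Hedged sketch of the HITS@k metric: the share of cases where the
# ground-truth item appears in the model's top-k ranked predictions.
# Names and toy data are illustrative assumptions, not from the paper.

def hits_at_k(ranked_predictions, targets, k):
    """ranked_predictions: one list per test case, sorted best-first
    by model score. targets: the ground-truth item for each case."""
    hits = sum(1 for preds, t in zip(ranked_predictions, targets)
               if t in preds[:k])
    return hits / len(targets)

# Toy example: three next-note predictions; the true note is in the
# top 2 for two of the three cases, so HITS@2 = 2/3.
preds = [["C4", "E4", "G4"], ["D4", "F4", "A4"], ["G4", "B4", "D5"]]
truth = ["E4", "A4", "G4"]
print(hits_at_k(preds, truth, 2))
```

A higher HITS@k means the correct continuation is ranked near the top more often, which is why improvements are reported as percentage gains under this metric.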
Funders: Ministry of Education of the Republic of Korea; National Research Foundation of Korea
Subjects: General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)
Cited by: 2 articles