Optical music recognition for homophonic scores with neural networks and synthetic music generation

Authors

Alfaro-Contreras María, Iñesta José M., Calvo-Zaragoza Jorge

Abstract

The recognition of patterns that have a time dependency is common in areas like speech recognition or natural language processing. The equivalent situation in image analysis arises in tasks like text or video recognition. Recently, Convolutional Recurrent Neural Networks (CRNN) have been broadly applied to solve these tasks in an end-to-end fashion with successful performance. However, their application to Optical Music Recognition (OMR) is not so straightforward, because different elements can share the same horizontal position, disrupting the linear flow of the timeline. In this paper, we study the ability of the state-of-the-art CRNN approach to learn codes that represent this disruption in homophonic scores. In our experiments, we study the lower bounds in the recognition task of real scores when the models are trained with synthetic data. Two relevant conclusions are drawn: (1) our serialized ways of encoding the music content are appropriate for CRNN-based OMR; (2) the learning process is possible with synthetic data, but there exists a glass ceiling when recognizing real sheet music.

Funder

Universidad de Alicante

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences, Media Technology, Information Systems
