Effect of Spectrogram Parameters and Noise Types on The Performance of Spectro-temporal Peaks Based Audio Search Method
Author:
KÖSEOĞLU Murat1, UYANIK Hakan2
Affiliation:
1. İnönü Üniversitesi 2. MUNZUR UNIVERSITY, FACULTY OF ENGINEERING
Abstract
Audio search algorithms are used to find a matching file in large databases, particularly in multimedia applications and smart appliances. Similar algorithms, based on different audio fingerprint extraction methods, have been developed and applied in various fields. These algorithms are expected to perform detection reliably and robustly in the shortest possible time. In this study, an audio fingerprint algorithm based on the spectral peaks method, with a few minor modifications, was developed to accurately detect the matching audio file in a target database. The algorithm was demonstrated, and the effect of spectrogram parameters (window size, overlap, and number of FFT points) on the reliability and robustness of the program was then investigated under three different noise sources. The database was relatively small and contained five genres of music. The study aims to contribute to new audio file detection studies based on the spectral peaks method. Variation in the spectrogram parameters was observed to significantly affect the number of matchings (NM), reliability, and robustness. Under high noise conditions, the optimal spectrogram parameters (window size; overlap; number of FFT points) were determined to be 512; 50%; 512, respectively. However, no significant effect of music genre on NM was observed.
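To make the role of the three spectrogram parameters concrete, the following is a minimal sketch of spectral peak extraction, assuming NumPy is available. It is not the authors' implementation: the function names `spectrogram` and `spectral_peaks`, the Hann window, and the `neighborhood` and `min_rel_db` parameters are hypothetical choices made for illustration. The defaults reflect the 512; 50%; 512 setting reported above, read as window size, overlap, and number of FFT points.

```python
import numpy as np


def spectrogram(x, window_size=512, overlap=0.5, nfft=512):
    """Magnitude spectrogram from a Hann-windowed short-time FFT.

    Returns an array of shape (nfft // 2 + 1, n_frames).
    """
    hop = int(window_size * (1.0 - overlap))  # 50% overlap -> hop of 256 samples
    window = np.hanning(window_size)          # window choice is an assumption
    n_frames = 1 + (len(x) - window_size) // hop
    frames = np.stack(
        [x[i * hop : i * hop + window_size] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, n=nfft, axis=1)).T


def spectral_peaks(S, neighborhood=5, min_rel_db=-40.0):
    """Return (freq_bin, frame) pairs that are local maxima of the log spectrogram.

    A point counts as a peak if it is the largest value within +/- `neighborhood`
    bins and frames, and lies no more than |min_rel_db| dB below the global maximum.
    Both thresholds are illustrative, not taken from the paper.
    """
    log_s = 20.0 * np.log10(S + 1e-12)
    floor = log_s.max() + min_rel_db
    n_bins, n_frames = log_s.shape
    peaks = []
    for f in range(n_bins):
        for t in range(n_frames):
            value = log_s[f, t]
            if value < floor:
                continue
            patch = log_s[
                max(0, f - neighborhood) : f + neighborhood + 1,
                max(0, t - neighborhood) : t + neighborhood + 1,
            ]
            if value >= patch.max():
                peaks.append((f, t))
    return peaks


# Usage: a 1 kHz tone sampled at 8 kHz falls exactly on bin 1000 / 8000 * 512 = 64.
fs = 8000
time = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * time)
S = spectrogram(tone)
peaks = spectral_peaks(S)
```

In a sketch like this, a larger window size sharpens frequency resolution at the cost of time resolution, and a larger overlap yields more frames and therefore more candidate peaks, which is one plausible reason the number of matchings would depend on these settings.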
Publisher
Gazi University Journal of Science
Subject
Multidisciplinary, General Engineering
Cited by
5 articles.
1. Genre Classification of Movie Trailers using Spectrogram Analysis and Machine Learning; 2024 IEEE International Black Sea Conference on Communications and Networking (BlackSeaCom); 2024-06-24
2. Audio Event Recognition Involving Animals and Bird Species Using Machine Learning; 2023 3rd International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON); 2023-12-29
3. Hyperparameter Tuning On Machine Learning Transformers For Mood Classification In Indonesian Music; 2023 International Conference on Informatics, Multimedia, Cyber and Informations System (ICIMCIS); 2023-11-07
4. Handcrafted Feature From Classification Mood Music Indonesia With Machine Learning BERT and Transformer; 2023 International Conference on Informatics, Multimedia, Cyber and Informations System (ICIMCIS); 2023-11-07
5. Automated Hypertension Detection Using ConvMixer and Spectrogram Techniques with Ballistocardiograph Signals; Diagnostics; 2023-01-04