Abstract
Biometric technology offers stronger security than traditional identification and authentication techniques, which are less efficient and less secure. Despite these advantages, existing biometric systems such as Automatic Speaker Verification (ASV) systems remain vulnerable to presentation attacks. A presentation attack is a spoofing attack launched to subvert an ASV system and gain access to it. Although numerous Presentation Attack Detection (PAD) systems have been reported in the literature, no systematic survey describes the current state of research and application. This paper presents a systematic analysis of state-of-the-art voice PAD systems to promote further advancement in this area. Its objectives are twofold: (i) to understand the nature of recent work on PAD systems, and (ii) to identify areas that require additional research. From the survey, a taxonomy of voice PAD and a trend analysis of recent work on PAD systems were built and are presented. The survey considered recent and relevant articles published between 2015 and 2021, including articles from the Interspeech and ICASSP conferences, most of them indexed by Scopus. A total of 172 articles were surveyed. The findings expose limitations of recent work, including PAD that depends on the spoof type. Based on these findings, future directions for work on voice PAD are established for interested researchers.
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications,Hardware and Architecture,Media Technology,Software
Cited by
20 articles.