Affiliation:
1. School of Electronics and Information Engineering, Korea Aerospace University, 76, Hanggongdaehak-ro, Deogyang-gu, Goyang-si 10540, Gyeonggi-do, Republic of Korea
Abstract
This paper proposes a resource-efficient keyword spotting (KWS) system based on a convolutional neural network (CNN). The end-to-end KWS process is performed solely by 1D-CNN inference: features are first extracted by a few convolutional blocks, and the keywords are then classified by a few fully connected blocks. The 1D-CNN model is binarized to reduce resource usage, and its inference is executed by a dedicated engine. This engine is designed to skip redundant operations, enabling high inference speed despite its low complexity. The proposed system, which integrates the essential components for performing the KWS process, is implemented using 6895 ALUTs in an Intel Cyclone V FPGA. The system processes a frame with a latency of 22 ms and achieves a spotting accuracy of 91.80% on the Google Speech Commands dataset version 2 at a signal-to-noise ratio of 10 dB.
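The abstract does not say how the dedicated engine evaluates the binarized convolutions. As a minimal sketch of why binarization reduces resource usage, the Python below uses the XNOR-and-popcount formulation that is common for binarized networks; it is an assumption here, not the authors' confirmed design. All function names are hypothetical, values are restricted to +1 and -1, and the engine's skipping of redundant operations is not modeled.

```python
def pack_bits(values):
    """Pack a sequence of +1/-1 values into an int (bit i is 1 for +1, 0 for -1)."""
    word = 0
    for i, v in enumerate(values):
        if v > 0:
            word |= 1 << i
    return word


def binary_dot(a_bits, w_bits, n):
    """Dot product of two length-n {-1, +1} vectors using XNOR and popcount.

    Each agreeing bit contributes +1 and each disagreeing bit -1,
    so the result is 2 * (number of agreeing bits) - n.
    """
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask  # XNOR restricted to the low n bits
    return 2 * bin(agree).count("1") - n


def binary_conv1d(signal, kernel):
    """Valid 1D correlation of a +/-1 signal with a +/-1 kernel, sign-binarized."""
    k = len(kernel)
    w_bits = pack_bits(kernel)
    out = []
    for start in range(len(signal) - k + 1):
        a_bits = pack_bits(signal[start:start + k])
        acc = binary_dot(a_bits, w_bits, k)
        out.append(1 if acc >= 0 else -1)  # assumed sign activation
    return out


def reference_conv1d(signal, kernel):
    """Same computation with ordinary multiply-accumulate, for comparison."""
    k = len(kernel)
    out = []
    for start in range(len(signal) - k + 1):
        acc = sum(s * w for s, w in zip(signal[start:start + k], kernel))
        out.append(1 if acc >= 0 else -1)
    return out


if __name__ == "__main__":
    signal = [1, -1, -1, 1, 1, -1, 1, 1]
    kernel = [1, -1, 1]
    print(binary_conv1d(signal, kernel) == reference_conv1d(signal, kernel))
```

In this formulation each multiply-accumulate becomes a bitwise operation followed by a bit count, which is one reason binarized models tend to map onto small amounts of FPGA logic.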
Funder
ABOV Semiconductor
Korean Government
IC Design Education Center
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering