Affiliation:
1. Shanghai Jiao Tong University, China
2. National Chiao Tung University, Taiwan
3. National Taiwan University, Taiwan
Abstract
Using silent speech to issue commands has received growing attention, as users can reuse existing command sets from voice-based interfaces without attracting other people's attention. Such interaction preserves privacy and is more socially acceptable. However, current solutions for recognizing silent speech mainly rely on camera-based data or on sensors attached to the throat. Camera-based solutions consume 5.82 times more power or raise potential privacy issues; attaching sensors to the throat is not practical for commercial off-the-shelf (COTS) devices because additional sensors are required. In this paper, we propose a sensing technique that needs only the microphone and speaker on COTS devices, which both consumes little power and raises fewer privacy concerns. By deconstructing the received acoustic signals, a 2D motion profile can be generated. We propose a classifier based on convolutional neural networks (CNN) to identify the corresponding silent command from the 2D motion profiles. The proposed classifier can adapt to individual users and remains robust under varying environmental factors. Our evaluation shows that the system achieves 92.5% accuracy in classifying 20 commands.
Funder
Ministry of Science and Technology of Taiwan
National Chiao Tung University
Startup Fund for Youngman Research at SJTU
Joint Key Project of the NSFC
National Key R&D Program of China
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture, Human-Computer Interaction
Cited by
27 articles.
1. EarSSR: Silent Speech Recognition via Earphones;IEEE Transactions on Mobile Computing;2024-08
2. OpenAuth: Human Body-Based User Authentication Using mmWave Signals in Open-World Scenarios;2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS);2024-07-23
3. RFSpy: Eavesdropping on Online Conversations with Out-of-Vocabulary Words by Sensing Metal Coil Vibration of Headsets Leveraging RFID;Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services;2024-06-03
4. Lipwatch: Enabling Silent Speech Recognition on Smartwatches using Acoustic Sensing;Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies;2024-05-13
5. Sensing to Hear through Memory;Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies;2024-05-13