Estimation of habit-related information from male voice data using machine learning-based methods

Published:2023-06-01 Issue:3 Volume:28 Page:520-529
ISSN:1433-5298
Container-title:Artificial Life and Robotics
language:en
Short-container-title:Artif Life Robotics

Author:

Yokoo Takaya, Hatano Ryo, Nishiyama Hiroyuki

Abstract

According to a survey on the cause of death among Japanese people, lifestyle-related diseases (such as malignant neoplasms, cardiovascular diseases, and pneumonia) account for 55.8% of all deaths. Three habits, namely, drinking, smoking, and sleeping, are considered the most important factors associated with lifestyle-related diseases, but it is difficult to measure these habits autonomously and regularly. Here, we propose a machine learning-based approach for detecting these lifestyle habits using voice data. We used classifiers and probabilistic linear discriminant analysis based on acoustic features, such as mel-frequency cepstrum coefficients (MFCCs) and jitter, extracted from a speech dataset we developed, and an X-vector from a pre-trained ECAPA-TDNN model. For training models, we used several classifiers implemented in MATLAB 2021b, such as support vector machines, K-nearest neighbors (KNN), and ensemble methods with some feature-projection options. Our results show that a cubic KNN method using acoustic features performs well on the sleep habit classification, while X-vector-based models perform well on smoking and drinking habit classifications. These results suggest that X-vectors may help estimate factors directly affecting the vocal cords and vocal tracts of the users (e.g., due to smoking and drinking), while acoustic features may help classify chronotypes, which might be informative with respect to the individuals’ vocal cord and vocal tract ultrastructure.
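
As a rough illustration of the "cubic KNN" classifier named in the abstract, the sketch below assumes that "cubic" means a K-nearest-neighbors vote under the Minkowski distance with exponent 3, which is how MATLAB's Classification Learner preset of that name is usually described. It is not the authors' code. The feature values, the `morning`/`evening` labels, and the function names `minkowski_distance` and `knn_predict` are hypothetical, chosen only to make the example self-contained.

```python
from collections import Counter


def minkowski_distance(a, b, p=3):
    """Minkowski distance between two vectors; p=3 is the 'cubic' metric."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train_X, train_y, query, k=3, p=3):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: minkowski_distance(pair[0], query, p),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Synthetic 2-D feature vectors (hypothetical values, NOT real MFCC or jitter data).
train_X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 1.1), (0.9, 1.2), (1.2, 0.8)]
# Hypothetical chronotype labels, used only for illustration.
train_y = ["morning", "morning", "morning", "evening", "evening", "evening"]

prediction = knn_predict(train_X, train_y, (0.15, 0.2), k=3)
print(prediction)
```

In practice the input vectors would be per-utterance acoustic features rather than toy points, and the neighbor count `k` would be tuned by cross-validation; the value used here is arbitrary.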

Funder

Tokyo University of Science

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence; General Biochemistry, Genetics and Molecular Biology

Link

https://link.springer.com/content/pdf/10.1007/s10015-023-00870-2.pdf

Reference (23 articles)

1. Alcohol Health and Medical Association: Alcohol blood levels and drunkenness. http://www.arukenkyo.or.jp/health/base/index.html Accessed 12 Nov 2021

2. Chung JS, Nagrani A, Zisserman A (2018) VoxCeleb2: Deep speaker recognition. arXiv preprint arXiv:1806.05622

3. Desplanques B, Thienpondt J, Demuynck K (2020) ECAPA-TDNN: Emphasized channel attention, propagation and aggregation in TDNN based speaker verification. arXiv preprint arXiv:2005.07143

4. Doukhan D, Carrive J, Vallet F, Larcher A, Meignier S (2018) An open-source speaker gender detection framework for monitoring gender equality. In: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5214–5218. https://doi.org/10.1109/ICASSP.2018.8461471

5. Faurholt-Jepsen M, Rohani DA, Busk J, Vinberg M, Bardram JE, Kessing LV (2021) Voice analyses using smartphone-based data in patients with bipolar disorder, unaffected relatives and healthy control individuals, and during different affective states. Int J Bipolar Disord 9(1):1–13