DCNN for Pig Vocalization and Non-Vocalization Classification: Evaluate Model Robustness with New Data

Published: 2024-07-09
Journal: Animals
Volume: 14, Issue: 14, Page: 2029
ISSN: 2076-2615
Language: en

Author:

Pann Vandet¹, Kwon Kyeong-seok¹, Kim Byeonghyeon¹, Jang Dong-Hwa¹, Kim Jong-Bok¹

Affiliation:

1. Animal Environment Division, National Institute of Animal Science, Rural Development Administration, Wanju 55365, Republic of Korea

Abstract

Pig vocalization is an important indicator for monitoring pig conditions, so detecting and recognizing pig vocalizations with deep learning plays a crucial role in the management and welfare of modern pig farming. However, collecting pig sound data for training deep learning models takes time and effort. Acknowledging this challenge, this study introduces a deep convolutional neural network (DCNN) architecture for classifying pig vocalization and non-vocalization sounds using datasets recorded on real pig farms. Several audio feature extraction methods were evaluated individually to compare their performance: Mel-frequency cepstral coefficients (MFCC), Mel-spectrogram, Chroma, and Tonnetz. The study also proposes a novel feature extraction method, Mixed-MMCT, which integrates MFCC, Mel-spectrogram, Chroma, and Tonnetz features to improve classification accuracy. These methods were used to extract features from the pig sound datasets as input to the deep learning network. For the experiments, three datasets were collected from three actual pig farms: Nias, Gimje, and Jeongeup. Each dataset consists of 4000 three-second WAV files (2000 pig vocalization and 2000 pig non-vocalization). Audio data augmentation techniques, namely pitch-shifting, time-shifting, time-stretching, and background-noising, were applied to the training set to improve model performance and generalization. Model performance was assessed on each dataset using k-fold cross-validation (k = 5). Mixed-MMCT achieved the highest accuracy on Nias, Gimje, and Jeongeup, at 99.50%, 99.56%, and 99.67%, respectively. Robustness experiments were then performed to evaluate the model by training on the datasets from two farms and testing on the dataset from the third. The average accuracy, precision, recall, and F1-score of Mixed-MMCT reached 95.67%, 96.25%, 95.68%, and 95.96%, respectively. These results indicate that the proposed Mixed-MMCT feature extraction method outperforms the other methods for pig vocalization and non-vocalization classification in real pig farming.
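
The robustness protocol described in the abstract (train on two farms, test on the remaining one) and the four reported metrics can be illustrated with a minimal sketch. The function names `leave_one_farm_out` and `binary_metrics` are hypothetical and not taken from the paper, label 1 is assumed to mean vocalization, and the record does not say how the authors averaged precision, recall, and F1, so this only shows the standard definitions for the positive class.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Farm names as given in the abstract.
FARMS = ["Nias", "Gimje", "Jeongeup"]


def leave_one_farm_out(farms: Sequence[str]) -> Iterator[Tuple[List[str], str]]:
    """Yield (training farms, held-out test farm), once per farm."""
    for test_farm in farms:
        train_farms = [farm for farm in farms if farm != test_farm]
        yield train_farms, test_farm


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1, treating label 1 as the positive class.

    Assumption: 1 = vocalization, 0 = non-vocalization.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    for train_farms, test_farm in leave_one_farm_out(FARMS):
        print(f"train on {train_farms}, test on {test_farm}")
    # Toy labels for illustration only; these are not data from the study.
    print(binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]))
```

In an actual experiment the model would be retrained for each split and the per-split metrics averaged, which is presumably how the reported averages were obtained, though the record does not state this.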

Funder

Rural Development Administration, Republic of Korea

Publisher

MDPI AG

Link

https://www.mdpi.com/2076-2615/14/14/2029/pdf

References (52 articles)

1. Liao (2023). Domestic pig sound classification based on TransformerCNN. Appl. Intell.

2. Popescu (2020). Pork market crisis in Romania: Pig livestock, pork production, consumption, import, export, trade balance and price. Sci. Pap. Ser. Manag. Econ. Eng. Agric. Rural Dev.

3. Liang, Y., Cheng, Y., Xu, Y., Hua, G., Zheng, Z., Li, H., and Han, L. (2022). Consumer preferences for animal welfare in China: Optimization of pork production-marketing chains. Animals, 12.

4. Hou, Y., Li, Q., Wang, Z., Liu, T., He, Y., Li, H., Ren, Z., Guo, X., Yang, G., and Liu, Y. (2024). Study on a Pig Vocalization Classification Method Based on Multi-Feature Fusion. Sensors, 24.

5. Dohlman, E., Hansen, J., and Boussios, D. (2022). USDA Agricultural Projections to 2031, United States Department of Agriculture.