Influence of Preprocessing Methods of Automated Milking Systems Data on the Prediction of Mastitis with Machine Learning Models

Published: 2024-06-25

Author:

B.O. Kashongwe¹, T. Kabelitz¹, T. Amon¹, C. Ammon¹, B. Amon¹, M. Doherr²

Affiliation:

1. Leibniz Institut für Agrartechnik und Bioökonomie, e.V. (ATB)

2. Free University Berlin

Abstract

Missing data and class imbalance hinder accurate prediction of rare events such as mastitis (udder inflammation). Various methods can address these problems; however, little is known about their individual and combined effects on the performance of machine learning (ML) models fitted to automated milking system (AMS) data for mastitis prediction. We applied imputation and resampling to improve the performance metrics of five classifiers (logistic regression, stochastic gradient descent, multilayer perceptron, decision tree and random forest). Three imputation methods, simple imputer (SI), multiple imputer (MICE) and linear interpolation (LI), were compared to complete case analysis. Three resampling procedures were compared: synthetic minority oversampling technique (SMOTE), Support Vector Machine SMOTE (SVMSMOTE) and SMOTE with Edited Nearest Neighbours. We evaluated the techniques by calculating precision, recall and F1 score, and compared models based on the kappa score. Both imputation and resampling techniques improved model performance. Complete case analysis suited the Stochastic Gradient Descent (SGD) classifier better than resampling or imputation (kappa = 0.280). Logistic Regression (LR) performed better with SVMSMOTE and no imputation (kappa = 0.218). Random Forest (RF), Decision Tree (DT) and Multilayer Perceptron (MLP) performed better than SGD and LR and handled class imbalance and missing values well without preprocessing. We conclude that careful selection of the technique for handling class imbalance and missing values, before the data are passed to an ML model, is crucial to attaining the best model performance.
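The abstract compares models by kappa score rather than by raw accuracy. The sketch below is an illustration only, not the authors' code: it computes Cohen's kappa with the Python standard library to show why a chance-corrected metric is useful when the positive class is rare. The function name `cohen_kappa`, the helper `accuracy`, and the toy labels (100 milkings, 5 mastitis cases) are hypothetical and do not come from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: probability that labels and predictions coincide
    # if they were independent with the same marginal frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return 0.0  # convention assumed here for the degenerate one-class case
    return (observed - expected) / (1.0 - expected)


# Hypothetical, imbalanced toy data: 95 healthy (0) and 5 mastitis (1) records.
y_true = [0] * 95 + [1] * 5

# A classifier that never predicts mastitis.
always_healthy = [0] * 100

# A classifier with 2 false alarms that detects 3 of the 5 cases.
some_detection = [0] * 93 + [1] * 2 + [1] * 3 + [0] * 2

acc_naive = accuracy(y_true, always_healthy)
kappa_naive = cohen_kappa(y_true, always_healthy)
acc_detect = accuracy(y_true, some_detection)
kappa_detect = cohen_kappa(y_true, some_detection)

print(f"always healthy: accuracy={acc_naive:.2f}, kappa={kappa_naive:.3f}")
print(f"some detection: accuracy={acc_detect:.2f}, kappa={kappa_detect:.3f}")
```

On this toy data the two classifiers have nearly identical accuracy, but kappa is zero for the one that never flags a case, which is the behaviour that makes kappa a reasonable basis for comparing models on imbalanced mastitis data.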

Publisher

Research Square Platform LLC
