Author:
Mwanga Emmanuel P., Siria Doreen J., Mshani Issa H., Mwinyi Sophia H., Abbasi Said, Jimenez Mario Gonzalez, Wynne Klaas, Baldini Francesco, Babayan Simon A., Okumu Fredros O.
Abstract
Background
Accurately determining the age and survival probabilities of adult mosquitoes is crucial for understanding parasite transmission, evaluating the effectiveness of control interventions and assessing disease risk in communities. This study aimed to demonstrate rapid identification of epidemiologically relevant age categories of Anopheles funestus, a major Afro-tropical malaria vector, by combining infrared spectroscopy with machine learning, as an alternative to the cumbersome practice of dissecting mosquito ovaries to estimate age from parity status.
Methods
Anopheles funestus larvae were collected in rural south-eastern Tanzania and reared in an insectary. Emerging adult females were sorted by age (1–16 days old) and preserved using silica gel. Polymerase chain reaction (PCR) on DNA extracted from mosquito legs was used to confirm that specimens were An. funestus and to exclude other mosquitoes. Mid-infrared spectra were obtained by scanning the heads and thoraces of the mosquitoes using an attenuated total reflection–Fourier transform infrared (ATR–FT-IR) spectrometer. The spectra (N = 2084) were divided into two epidemiologically relevant age groups: 1–9 days (young, non-infectious) and 10–16 days (old, potentially infectious). The dimensionality of the spectra was reduced using principal component analysis, and a set of machine learning and multi-layer perceptron (MLP) models was then trained on the spectra to predict the mosquito age categories.
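The two-class labelling described above can be illustrated with a short sketch. It only encodes the age cut-offs stated in the Methods (1–9 days young, 10–16 days old); the function name, constants and sample ages are hypothetical and are not taken from the study's code.

```python
from collections import Counter

# Age cut-offs as stated in the Methods; the names are illustrative assumptions.
YOUNG_MAX_DAYS = 9    # 1-9 days: young, non-infectious
OLD_MAX_DAYS = 16     # 10-16 days: old, potentially infectious


def age_category(age_days: int) -> str:
    """Map a chronological age in days to the binary class label."""
    if not 1 <= age_days <= OLD_MAX_DAYS:
        raise ValueError(f"age outside the 1-{OLD_MAX_DAYS} day range: {age_days}")
    return "young" if age_days <= YOUNG_MAX_DAYS else "old"


# Hypothetical ages (in days) for six scanned mosquitoes.
ages = [1, 5, 9, 10, 13, 16]
labels = [age_category(a) for a in ages]
counts = Counter(labels)

print(labels)
print(dict(counts))
```

Labels produced this way would serve as the prediction target for any classifier trained on the spectra.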
Results
The best-performing model, XGBoost, achieved an overall accuracy of 87%, with class-specific accuracies of 89% for young and 84% for old An. funestus. When only the spectral features with the greatest influence on model performance were used to train a new model, the overall accuracy increased slightly to 89%. The MLP model, trained on these important spectral features, achieved higher classification accuracies of 95% and 94% for young and old An. funestus, respectively. After dimensionality reduction, the MLP achieved 93% accuracy for both age categories.
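To make the distinction between overall and class-specific accuracy concrete, here is a minimal sketch. It assumes that class-specific accuracy means the fraction of true members of each class that were predicted correctly; the abstract does not state the exact definition used. The labels below are toy values, not study data, and the helper names are hypothetical.

```python
def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    totals, correct = {}, {}
    for truth, pred in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == pred:
            correct[truth] = correct.get(truth, 0) + 1
    return {label: correct.get(label, 0) / totals[label] for label in totals}


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples predicted correctly, regardless of class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example: 6 young and 4 old mosquitoes, with one error in each class.
y_true = ["young"] * 6 + ["old"] * 4
y_pred = ["young"] * 5 + ["old"] + ["old"] * 3 + ["young"]

by_class = per_class_accuracy(y_true, y_pred)
overall = overall_accuracy(y_true, y_pred)

print(by_class)
print(overall)
```

Because the overall figure weights each class by its sample count, it generally falls between the two class-specific values when the classes differ in size.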
Conclusions
This study shows how machine learning can quickly classify epidemiologically relevant age groups of An. funestus based on their mid-infrared spectra. The approach has previously been applied to An. gambiae, An. arabiensis and An. coluzzii; this demonstration on An. funestus underscores the potential of this low-cost, reagent-free technique for widespread use on all the major Afro-tropical malaria vectors. Future research should demonstrate how such machine-derived age classifications in field-collected mosquitoes correlate with malaria in human populations.
Funder
Medical Research Council
Wellcome Trust
Academy of Medical Sciences Springboard Award
Bill and Melinda Gates Foundation
Royal Society
Howard Hughes Medical Institute
Publisher
Springer Science and Business Media LLC