Abstract
The electrical penetration graph (EPG) is a well-known technique that provides insights into the feeding behavior of insects with piercing-sucking mouthparts, mostly hemipterans. Since its inception in the 1960s, EPG has become indispensable in studying plant-insect interactions, revealing critical information about host plant selection, plant resistance, virus transmission, and responses to environmental factors. By integrating the plant and insect into an electrical circuit, EPG allows researchers to identify specific feeding behaviors based on distinct waveform patterns associated with activities within plant tissues. However, the traditional manual analysis of EPG waveform data is time-consuming and labor-intensive, limiting research throughput.
This study presents a novel machine-learning approach to automate the segmentation and classification of EPG signals. We rigorously evaluated six diverse machine learning models, including neural networks, tree-based models, and logistic regressions, using an extensive dataset from aphid feeding experiments. Our results demonstrate that a Residual Network (ResNet) architecture achieved the highest overall waveform classification accuracy of 96.8% and highest segmentation overlap rate of 84.4%, highlighting the potential of machine learning for accurate and efficient EPG analysis. This automated approach promises to accelerate research in this field significantly and has the potential to be generalized to other insect species and experimental settings. Our findings underscore the value of applying advanced computational techniques to complex biological datasets, paving the way for a more comprehensive understanding of insect-plant interactions and their broader ecological implications.
The source code for all experiments conducted within this study is publicly available at https://github.com/HySonLab/ML4Insects.
Author summary
Insect pests of the order Hemiptera pose a significant threat to global agriculture, causing substantial crop losses through direct feeding and serving as vectors for many economically important plant viruses. Understanding plant-insect interactions is crucial for mitigating these impacts. The electrical penetration graph (EPG) is a valuable tool that provides detailed insights into these interactions. However, the analysis of EPG data is a time-consuming, labor-intensive process that can also be prone to operator errors. State-of-the-art machine learning (ML) algorithms can be trained to perform this task accurately and consistently, automating the identification and classification of specific EPG waveform patterns associated with distinct insect feeding behaviors. Our machine learning models, trained on extensive aphid feeding data, demonstrated high accuracy in classifying these waveforms, with the Residual Network (ResNet) architecture achieving the best performance. The automated approach saves time and resources, eliminates operator error, and enables the identification of novel feeding patterns, providing a deeper understanding of the mechanisms underlying plant-aphid interactions. Moreover, our evaluation of a large, diverse dataset of four aphid species on four host plants indicates the potential for generalizing these models to different experimental settings. By applying advanced computational techniques to EPG data, we are pioneering the intelligent surveillance of aphid feeding habits. This approach promises to significantly improve our understanding of the factors that affect aphid feeding.
Publisher
Cold Spring Harbor Laboratory