Abstract
In recent years, supervised machine learning models trained on videos of animals with pose estimation data and behavior labels have been used for automated behavior classification. Applications include, for example, automated detection of neurological diseases in animal models. However, these supervised learning models have two problems. First, they require a large amount of labeled data, but labeling behaviors frame by frame is a laborious manual process that does not scale easily. Second, they rely on handcrafted features derived from pose estimation data, which are usually designed empirically. In this paper, we propose to overcome these two problems using contrastive learning for self-supervised feature engineering on pose estimation data. Our approach allows the use of unlabeled videos to learn feature representations and reduces the need for handcrafting higher-level features from pose positions. We show that this approach to feature representation can achieve better classification performance than handcrafted features alone, and that the performance improvement is due to contrastive learning on unlabeled data rather than the neural network architecture.
Author Summary
Animal models are widely used in medicine to study diseases. For example, the study of social interactions between animals such as mice is used to investigate changes in social behavior in neurological diseases. Manually annotating animal behaviors from videos is slow and tedious. To solve this problem, machine learning approaches that automate video annotation have become more popular. Many recent machine learning approaches build on advances in pose estimation technology, which enables accurate localization of key points on the animals. However, manual frame-by-frame labeling of behaviors for the training set remains a bottleneck that does not scale. Existing methods also rely on handcrafted feature engineering from pose estimation data. In this study, we propose ContrastivePose, an approach that uses contrastive learning to learn feature representations from unlabeled data. We demonstrate improved supervised classification performance using the features learnt by our method compared with handcrafted features. This approach can be helpful for work seeking to build supervised behavior classification models where behavior-labeled videos are scarce.
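To make the idea of contrastive learning on pose data concrete, the following is a minimal sketch and not the authors' implementation. The text above does not state which contrastive objective or augmentations are used, so this example assumes the NT-Xent loss popularized by SimCLR and a hypothetical Gaussian-jitter augmentation of keypoint coordinates. The names `augment` and `nt_xent_loss` and the parameters `jitter` and `temperature` are illustrative. The encoder network is omitted: the loss is applied directly to the vectors it is given, which in practice would be the encoder's outputs.

```python
import math
import random


def augment(pose, rng, jitter=0.01):
    """Return a noisy copy of a flat list of keypoint coordinates.

    Gaussian jitter is an assumed, illustrative augmentation.
    """
    return [c + rng.gauss(0.0, jitter) for c in pose]


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss over N positive pairs (z_a[i], z_b[i]).

    Each vector is pulled toward its paired view and pushed away from
    the other 2N - 2 vectors in the batch.
    """
    z = [l2_normalize(v) for v in list(z_a) + list(z_b)]
    n = len(z_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the paired view
        sims = [
            sum(p * q for p, q in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
        ]
        # Numerically stable log-sum-exp over all samples except i itself.
        others = [s for k, s in enumerate(sims) if k != i]
        m = max(others)
        log_denom = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denom - sims[pos]
    return total / (2 * n)


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy "poses", each a flat list of (x, y) keypoint coordinates.
    poses = [[0.1, 0.2, 0.4, 0.5], [0.9, 0.8, 0.3, 0.1]]
    view_a = [augment(p, rng) for p in poses]
    view_b = [augment(p, rng) for p in poses]
    print(nt_xent_loss(view_a, view_b))
```

Under these assumptions, the loss is low when two augmented views of the same pose map to similar vectors and high when they are confused with views of other poses, which is what lets unlabeled frames drive representation learning.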
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.