Affiliation:
1. Department of Statistics, Seoul National University, Seoul, Republic of Korea
2. DataShape Team, Inria Saclay, Palaiseau, France
3. Laboratoire de Mathématiques d'Orsay, Université Paris‐Saclay, Orsay, France
Abstract
This paper addresses the problem of identifying modes or density bumps in multivariate angular or circular data, which have diverse applications in fields like medicine, biology and physics. We focus on the use of topological data analysis and persistent homology for this task. Specifically, we extend the methods for uncertainty quantification in the context of a torus sample space, where circular data lie. To achieve this, we employ two types of density estimators, namely, the von Mises kernel density estimator and the von Mises mixture model, to compute persistent homology, and propose a scale‐space view for searching significant bumps in the density. The results of bump hunting are summarised and visualised through a scale‐space diagram. Our approach using the mixture model for persistent homology offers advantages over conventional methods, allowing for dendrogram visualisation of components and identification of mode locations. For testing whether a detected mode is really there, we propose several inference tools based on bootstrap resampling and concentration inequalities, establishing their theoretical applicability. Experimental results on SARS‐CoV‐2 spike glycoprotein torsion angle data demonstrate the effectiveness of our proposed methods in practice.
Funder
National Research Foundation of Korea
Subject
Statistics, Probability and Uncertainty; Statistics and Probability