Convergence of graph Laplacian with kNN self-tuned kernels

Published: 2021-09-27
ISSN: 2049-8772
Container-title: Information and Inference: A Journal of the IMA
Language: en

Author:

Cheng Xiuyuan¹, Wu Hau-Tieng²

Affiliation:

1. Department of Mathematics, Duke University, Durham, NC 27708, USA

2. Department of Mathematics and Department of Statistical Science, Duke University

Abstract

Kernelized Gram matrix $W$ constructed from data points $\{x_i\}_{i=1}^N$ as $W_{ij} = k_0\left( \frac{\| x_i - x_j \|^2}{\sigma^2} \right)$ is widely used in graph-based geometric data analysis and unsupervised learning. An important question is how to choose the kernel bandwidth $\sigma$, and a common practice called self-tuned kernel adaptively sets a $\sigma_i$ at each point $x_i$ by the $k$-nearest neighbor (kNN) distance. When the $x_i$ are sampled from a $d$-dimensional manifold embedded in a possibly high-dimensional space, unlike with fixed-bandwidth kernels, theoretical results of graph Laplacian convergence with self-tuned kernels have been incomplete. This paper proves the convergence of graph Laplacian operator $L_N$ to manifold (weighted-)Laplacian for a new family of kNN self-tuned kernels $W^{(\alpha)}_{ij} = \frac{ k_0\left( \frac{\| x_i - x_j \|^2}{\epsilon \hat{\rho}(x_i) \hat{\rho}(x_j)} \right) }{ \hat{\rho}(x_i)^\alpha \hat{\rho}(x_j)^\alpha }$, where $\hat{\rho}$ is the estimated bandwidth function by kNN and the limiting operator is also parametrized by $\alpha$. When $\alpha = 1$, the limiting operator is the weighted manifold Laplacian $\varDelta_p$. Specifically, we prove the point-wise convergence of $L_N f$ and convergence of the graph Dirichlet form with rates. Our analysis is based on first establishing a $C^0$ consistency for $\hat{\rho}$ which bounds the relative estimation error $|\hat{\rho} - \bar{\rho}|/\bar{\rho}$ uniformly with high probability, where $\bar{\rho} = p^{-1/d}$ and $p$ is the data density function. Our theoretical results reveal the advantage of the self-tuned kernel over the fixed-bandwidth kernel via smaller variance error in low-density regions. In the algorithm, no prior knowledge of $d$ or data density is needed. The theoretical results are supported by numerical experiments on simulated data and hand-written digit image data.
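The kernel family in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a Gaussian profile $k_0(r) = e^{-r}$, takes $\hat{\rho}(x_i)$ to be the raw distance to the $k$-th nearest neighbor (any normalizing constant is treated as absorbed into $\epsilon$), and uses a plain random-walk normalization $D^{-1}W - I$ without the scaling factors the convergence analysis would require. The function names `knn_bandwidth`, `self_tuned_kernel`, and `random_walk_laplacian` are hypothetical.

```python
import numpy as np

def knn_bandwidth(X, k):
    """Return (rho, D2): kNN distance per point and squared pairwise distances.

    Assumption: rho_i is the raw distance to the k-th nearest neighbor
    (self excluded), with no further normalization.
    """
    sq = np.sum(X**2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(D2, 0.0)
    # In each sorted row, index 0 is the point itself (distance 0),
    # so index k holds the squared distance to the k-th neighbor.
    rho = np.sqrt(np.sort(D2, axis=1)[:, k])
    return rho, D2

def self_tuned_kernel(X, k, eps, alpha=1.0):
    """W_ij = k0(|x_i - x_j|^2 / (eps rho_i rho_j)) / (rho_i^alpha rho_j^alpha).

    Assumption: k0(r) = exp(-r), chosen here only for illustration.
    """
    rho, D2 = knn_bandwidth(X, k)
    outer = np.outer(rho, rho)
    W = np.exp(-D2 / (eps * outer)) / outer**alpha
    return W, rho

def random_walk_laplacian(W):
    """Unscaled random-walk graph Laplacian D^{-1} W - I."""
    d = W.sum(axis=1)
    return W / d[:, None] - np.eye(W.shape[0])

# Example: points sampled non-uniformly on the unit circle (d = 1).
rng = np.random.default_rng(0)
theta = 2.0 * np.pi * rng.beta(2.0, 2.0, size=200)
X = np.column_stack([np.cos(theta), np.sin(theta)])

W, rho = self_tuned_kernel(X, k=10, eps=1.0, alpha=1.0)
L = random_walk_laplacian(W)
```

Note that the sketch takes only the data, `k`, `eps`, and `alpha` as inputs, which is consistent with the abstract's remark that neither $d$ nor the density is needed by the algorithm. The dense $N \times N$ distance matrix is used for brevity and would likely be replaced by a sparse kNN search for large $N$.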

Funder

National Science Foundation

Alfred P. Sloan Foundation

Publisher

Oxford University Press (OUP)

Subject

Applied Mathematics, Computational Theory and Mathematics, Numerical Analysis, Statistics and Probability, Analysis

Link

http://academic.oup.com/imaiai/advance-article-pdf/doi/10.1093/imaiai/iaab019/40442800/iaab019.pdf

References: 58 articles.

1. The isomap algorithm and topological stability;Balasubramanian;Science,2002

2. Laplacian eigenmaps for dimensionality reduction and data representation;Belkin;Neural Comput.,2003

3. Convergence of Laplacian eigenmaps;Belkin;Advances in Neural Information Processing Systems,2007

4. Measure-based diffusion grid construction and high-dimensional data discretization;Bermanis;Appl. Comput. Harmon. Anal.,2016

5. Variable bandwidth diffusion kernels;Berry;Appl. Comput. Harmon. Anal.,2016

Cited by 6 articles.

1. Kernel two-sample tests for manifold data;Bernoulli;2024-11-01

2. The impact of dietary preference on household food waste: evidence from China;Frontiers in Nutrition;2024-07-09

3. Signed Graph Laplacian for Semi-Supervised Anomaly Detection;2024 International Conference on Artificial Intelligence in Information and Communication (ICAIIC);2024-02-19

4. Spatiotemporal analysis using Riemannian composition of diffusion operators;Applied and Computational Harmonic Analysis;2024-01

5. Eigen-convergence of Gaussian kernelized graph Laplacian by manifold heat interpolation;Applied and Computational Harmonic Analysis;2022-11