Affiliation:
1. Department of Mathematics, University of California, Santa Barbara, Santa Barbara, CA 93106, USA
Abstract
A new method for hierarchical clustering of data points is presented. It combines treelets, a particular multiresolution decomposition of data, with a mapping on a reproducing kernel Hilbert space. The proposed approach, called kernel treelets (KT), uses this mapping to go from a hierarchical clustering over attributes (the natural output of treelets) to a hierarchical clustering over data points. KT replaces the correlation coefficient matrix used in treelets with a symmetric positive semi-definite matrix that is efficiently constructed from a symmetric positive semi-definite kernel function. Unlike most clustering methods, which require numeric data sets, KT can be applied to more general data and yields a multiresolution sequence of orthonormal bases on the data directly in feature space. The effectiveness and potential of KT in cluster analysis are illustrated with examples.
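The idea summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the Gaussian (RBF) kernel, the normalized absolute similarity used to pick each pair, and the helper names `rbf_kernel_matrix`, `treelet_merges`, and `cut_hierarchy` are all hypothetical choices made for this example. The sketch builds a kernel matrix over the data points, repeatedly applies a Jacobi rotation to the most similar pair of active coordinates (as treelets do with a correlation matrix), and reads a hierarchy over data points from the merge order.

```python
import numpy as np

def rbf_kernel_matrix(X, gamma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel; the kernel choice is an assumption."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-gamma * d2)

def treelet_merges(K):
    """Treelet-style Jacobi rotations on a symmetric PSD matrix K.

    Returns the merge order as (kept, absorbed) index pairs.
    """
    A = np.array(K, dtype=float)
    active = list(range(A.shape[0]))
    merges = []
    while len(active) > 1:
        idx = np.array(active)
        sub = A[np.ix_(idx, idx)]
        # Correlation-like similarity between active coordinates (assumed form).
        scale = np.sqrt(np.clip(np.diag(sub), 1e-12, None))
        sim = np.abs(sub) / np.outer(scale, scale)
        np.fill_diagonal(sim, -np.inf)
        a, b = np.unravel_index(np.argmax(sim), sim.shape)
        i, j = int(idx[a]), int(idx[b])
        # Rotation angle that zeroes A[i, j]; with arctan2 the larger
        # variance always ends up on coordinate i (the "sum" variable).
        theta = 0.5 * np.arctan2(2.0 * A[i, j], A[i, i] - A[j, j])
        c, s = np.cos(theta), np.sin(theta)
        ri, rj = A[i, :].copy(), A[j, :].copy()
        A[i, :], A[j, :] = c * ri + s * rj, -s * ri + c * rj  # rows: J^T A
        ci, cj = A[:, i].copy(), A[:, j].copy()
        A[:, i], A[:, j] = c * ci + s * cj, -s * ci + c * cj  # columns: (J^T A) J
        active.remove(j)  # j becomes a "difference" variable and is retired
        merges.append((i, j))
    return merges

def cut_hierarchy(n, merges, n_clusters):
    """Apply the first n - n_clusters merges to obtain flat cluster labels."""
    labels = np.arange(n)
    for kept, absorbed in merges[: n - n_clusters]:
        labels[labels == labels[absorbed]] = labels[kept]
    return labels

# Two well-separated groups of three points each (toy data for illustration).
X = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5],
              [10.0, 10.0], [10.5, 10.0], [10.0, 10.5]])
labels = cut_hierarchy(len(X), treelet_merges(rbf_kernel_matrix(X)), 2)
print(labels)
```

Because the procedure only touches the kernel matrix, any symmetric positive semi-definite kernel could be substituted for `rbf_kernel_matrix`, including one defined on non-numeric objects, which is the property the abstract highlights.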
Funder
National Science Foundation
Publisher
World Scientific Pub Co Pte Lt