Abstract
We investigate the use of Minimax distances to extract, in a nonparametric way, the features that capture the unknown underlying patterns and structures in the data. We develop a general-purpose and computationally efficient framework for employing Minimax distances with many machine learning methods that operate on numerical data. We study both computing the pairwise Minimax distances for all pairs of objects and computing the Minimax distances of all the objects to/from a fixed (test) object. We first efficiently compute the pairwise Minimax distances between the objects, using the equivalence of Minimax distances over a graph and over a minimum spanning tree constructed on that graph. Then, we embed the pairwise Minimax distances into a new vector space, such that the squared Euclidean distances in the new space equal the pairwise Minimax distances in the original space. We also study the case of having multiple pairwise Minimax matrices instead of a single one. For this case, we propose an embedding that first sums the centered matrices and then performs an eigenvalue decomposition to obtain the relevant features. Next, we study computing Minimax distances from a fixed (test) object, which can be used, for instance, in K-nearest neighbor search. As in the all-pair case, we develop an efficient and general-purpose algorithm that is applicable with any base distance measure. Moreover, we investigate in detail the edges selected by the Minimax distances and thereby explore the ability of Minimax distances to detect outlier objects. Finally, for each setting, we perform several experiments to demonstrate the effectiveness of our framework.
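The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`pairwise_minimax`, `minimax_embedding`, `minimax_from_test_object`) are hypothetical, the all-pair routine is a plain Kruskal-style merge over sorted edges, and the test-object routine is a naive quadratic rule rather than the efficient algorithm the abstract refers to. It assumes a symmetric base-distance matrix `D` with a zero diagonal.

```python
import numpy as np

def pairwise_minimax(D):
    """All-pair Minimax distances from a symmetric base-distance matrix D.

    Edges are visited in increasing weight (Kruskal order). An edge that joins
    two components is the largest edge on the spanning-tree path between every
    cross-component pair, so its weight is their Minimax distance.
    """
    n = D.shape[0]
    M = np.zeros((n, n))
    comp = list(range(n))                   # object -> component id
    members = {i: [i] for i in range(n)}    # component id -> its objects
    iu, ju = np.triu_indices(n, k=1)
    for e in np.argsort(D[iu, ju], kind="stable"):
        a, b = comp[iu[e]], comp[ju[e]]
        if a == b:
            continue                        # edge would close a cycle
        A, B = members[a], members[b]
        w = D[iu[e], ju[e]]
        M[np.ix_(A, B)] = w
        M[np.ix_(B, A)] = w
        for x in B:                         # merge component b into a
            comp[x] = a
        members[a] = A + B
        del members[b]
    return M

def minimax_embedding(M, dim=None):
    """Vectors whose squared Euclidean distances reproduce M.

    M is treated as a squared-distance matrix: double-center it, then
    eigendecompose (the same steps as classical multidimensional scaling).
    """
    n = M.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    W = -0.5 * J @ M @ J
    vals, vecs = np.linalg.eigh(W)
    order = np.argsort(vals)[::-1]          # largest eigenvalues first
    vals = np.clip(vals[order], 0.0, None)  # remove round-off negatives
    vecs = vecs[:, order]
    if dim is not None:
        vals, vecs = vals[:dim], vecs[:, :dim]
    return vecs * np.sqrt(vals)

def minimax_from_test_object(d, M):
    """Naive Minimax distances from a new object with base distances d.

    A path from the test object enters the data at some object j, so the
    result for object i is min over j of max(d[j], M[j, i]).
    """
    return np.maximum(d[:, None], M).min(axis=0)

# Toy data: four points on a line, with one far from the others.
x = np.array([0.0, 1.0, 2.0, 10.0])
D = np.abs(x[:, None] - x[None, :])
M = pairwise_minimax(D)
X = minimax_embedding(M)
```

In this toy example the far point ends up at Minimax distance 8 from every other point, since the largest gap on any path to it is the edge between the points at 2 and 10. For several Minimax matrices, the same sketch would sum the centered matrices `W` before the eigendecomposition, as the abstract describes.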
Funder
Knut och Alice Wallenbergs Stiftelse
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence,Software
Cited by
8 articles.