Author:
Bianchi Enrico,Penna Paolo
Abstract
This work studies clustering algorithms which operates with ordinal or comparison-based queries (operations), a situation that arises in many active-learning applications where “dissimilarities” between data points are evaluated by humans. Typically, exact answers are costly (or difficult to obtain in large amounts) while possibly erroneous answers have low cost. Motivated by these considerations, we study algorithms with non-trivial trade-offs between the number of exact (high-cost) operations and noisy (low-cost) operations with provable performance guarantees. Specifically, we study a class of polynomial-time graph-based clustering algorithms (termed Single-Linkage) which are widely used in practice and that guarantee exact solutions for stable instances in several clustering problems (these problems are NP-hard in the worst case). We provide several variants of these algorithms using ordinal operations and, in particular, non-trivial trade-offs between the number of high-cost and low-cost operations that are used. Our algorithms still guarantee exact solutions for stable instances of k-medoids clustering, and they use a rather small number of high-cost operations, without increasing the low-cost operations too much.
Subject
Computational Mathematics,Computational Theory and Mathematics,Numerical Analysis,Theoretical Computer Science
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献