Affiliation:
1. Carnegie Mellon University, USA
2. University of Maryland, Baltimore County, USA
Abstract
In this paper, the authors present an empirical evaluation of similarity coefficients for binary valued data. Similarity coefficients provide a means to measure the similarity or distance between two binary valued objects in a dataset such that the attributes qualifying each object have a 0-1 value. This is useful in several domains, such as similarity of feature vectors in sensor networks, document search, router network mining, and web mining. The authors survey 35 similarity coefficients used in various domains and present conclusions about the efficacy of the similarity computed in (1) labeled data to quantify the accuracy of the similarity coefficients, (2) varying density of the data to evaluate the effect of sparsity of the values, and (3) varying number of attributes to see the effect of high dimensionality in the data on the similarity computed.
Subject
Hardware and Architecture,Software
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献