A Locality Sensitive Hashing Technique for Categorical Data

Published:2012-12 Issue: Volume:241-244 Page:3159-3164
ISSN:1662-7482
Container-title:Applied Mechanics and Materials
language:
Short-container-title:AMM

Author:

Lee Kyung Mi¹, Lee Keon Myung¹

Affiliation:

1. Chungbuk National University

Abstract

Measured data may contain various types of attributes, such as continuous, categorical, and set-valued attributes. Several locality-sensitive hashing techniques, which make it possible to find similar pairs of data quickly and approximately, have been developed for data with either numeric or set-valued attributes. This paper introduces a new locality-sensitive hashing technique applicable to data with categorical attributes.

Publisher

Trans Tech Publications, Ltd.

Link

https://www.scientific.net/AMM.241-244.3159.pdf

References: 14 articles.

1. A. Rajaraman and J. D. Ullman: Mining of Massive Datasets, Cambridge University Press (2012).

2. S. Boriah, V. Chandola, and V. Kumar: Similarity Measures for Categorical Data: A Comparative Evaluation, Proc. of the 8th SIAM Int. Conf. on Data Mining (2008) 243–254.

3. U. Manber: Finding similar files in a large file system, Proc. USENIX Conference (1994) 1–10.

4. A. Z. Broder: On the resemblance and containment of documents, Proc. Compression and Complexity of Sequences (1997) 21–29.

5. A. Z. Broder, M. Charikar, A. M. Frieze, and M. Mitzenmacher: Min-wise independent permutations, ACM Symposium on Theory of Computing (1998) 327–336.

Cited by 3 articles.

1. Density-based Algorithms for Big Data Clustering Using MapReduce Framework;ACM Computing Surveys;2020-10-15

2. Parallel Hierarchical Subspace Clustering of Categorical Data;IEEE Transactions on Computers;2019-04-01

3. Fast Fuzzy Search for Mixed Data Using Locality Sensitive Hashing;Applied Mechanics and Materials;2013-11