Affiliation:
1. Chungbuk National University and ITRC
Abstract
The drastic increase in data volume creates a strong demand for efficient techniques that retrieve data similar to a query. It is sometimes useful to specify the data of interest with fuzzy constraints. When data objects contain both numerical and categorical attributes, it is usually difficult to define a commonly accepted distance measure between them. Without an efficient indexing structure, searching for specific data objects is costly because a linear scan must be conducted over the whole data set. This paper proposes a method that combines locality-sensitive hashing with fuzzy constrained queries to search big data for objects of interest. The method builds a locality-sensitive hashing-based index using only the continuous attributes, collects a small number of candidate data objects against which the query is examined, and then evaluates their degree of satisfaction of the fuzzy constrained query to determine which data objects satisfy it.
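The three steps named in the abstract (index the continuous attributes, collect candidates, evaluate fuzzy satisfaction) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the hash family, the membership functions, or how constraints are combined. The sketch assumes Gaussian random-projection hashing, triangular membership functions for numeric attributes, a lookup table for the categorical attribute, the minimum as the conjunction operator, and a satisfaction threshold of 0.5. All names (`LshIndex`, `triangular`, `satisfaction`, `fuzzy_search`) and the sample data are hypothetical.

```python
import math
import random
from collections import defaultdict


class LshIndex:
    """Random-projection LSH built over the continuous attributes only (assumed scheme)."""

    def __init__(self, dim, num_tables=8, num_hashes=2, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        # Each table holds num_hashes (projection vector, offset) pairs.
        self.params = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
             for _ in range(num_hashes)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]

    def _key(self, table_params, point):
        # Bucket key: quantized projections of the point, one per hash function.
        return tuple(
            math.floor((sum(a * x for a, x in zip(vec, point)) + b) / self.width)
            for vec, b in table_params
        )

    def add(self, obj_id, point):
        for table_params, table in zip(self.params, self.tables):
            table[self._key(table_params, point)].append(obj_id)

    def candidates(self, point):
        # Union of the buckets the query point falls into, across all tables.
        found = set()
        for table_params, table in zip(self.params, self.tables):
            found.update(table.get(self._key(table_params, point), ()))
        return found


def triangular(low, peak, high):
    """Triangular membership function (an assumed shape for numeric constraints)."""
    def mu(x):
        if x <= low or x >= high:
            return 0.0
        if x <= peak:
            return (x - low) / (peak - low)
        return (high - x) / (high - peak)
    return mu


def satisfaction(obj, constraints):
    # Conjunction of all fuzzy constraints, using min as the t-norm (assumption).
    return min(mu(obj[attr]) for attr, mu in constraints.items())


def fuzzy_search(index, objects, query_point, constraints, threshold=0.5):
    # Only the LSH candidates are evaluated against the fuzzy query.
    results = {}
    for obj_id in index.candidates(query_point):
        degree = satisfaction(objects[obj_id], constraints)
        if degree >= threshold:
            results[obj_id] = degree
    return results


# Hypothetical data with two continuous attributes and one categorical attribute.
CONTINUOUS = ("price", "size")
objects = {
    "a": {"price": 100.0, "size": 50.0, "color": "red"},
    "b": {"price": 102.0, "size": 51.0, "color": "blue"},
    "c": {"price": 900.0, "size": 10.0, "color": "red"},
}
index = LshIndex(dim=len(CONTINUOUS))
for obj_id, obj in objects.items():
    index.add(obj_id, [obj[name] for name in CONTINUOUS])

color_mu = {"red": 1.0, "orange": 0.6}  # fuzzy set "reddish" over color values
constraints = {
    "price": triangular(80.0, 100.0, 120.0),  # "price about 100"
    "size": triangular(40.0, 50.0, 60.0),     # "size about 50"
    "color": lambda c: color_mu.get(c, 0.0),
}
# The query point is taken as the peaks of the numeric constraints (assumption).
results = fuzzy_search(index, objects, [100.0, 50.0], constraints)
```

In this sketch the categorical attribute never enters the hash. It is consulted only when the satisfaction degree of a candidate is computed, which mirrors the abstract's point that the index is built from the continuous attributes alone. Because candidate collection is probabilistic, an object that satisfies the query but lands in a different bucket in every table would be missed. Raising `num_tables` is the usual way to reduce that risk, at the cost of evaluating more candidates.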
Publisher
Trans Tech Publications, Ltd.