Exploiting hierarchical domain structure to compute similarity

Published:2003-01 Issue:1 Volume:21 Page:64-93
ISSN:1046-8188
Container-title:ACM Transactions on Information Systems
Language:en
Short-container-title:ACM Trans. Inf. Syst.

Author:

Ganesan Prasanna¹, Garcia-Molina Hector¹, Widom Jennifer¹

Affiliation:

1. Stanford University, Stanford, CA

Abstract

The notion of similarity between objects finds use in many contexts, for example, in search engines, collaborative filtering, and clustering. Objects being compared often are modeled as sets, with their similarity traditionally determined based on set intersection. Intersection-based measures do not accurately capture similarity in certain domains, such as when the data is sparse or when there are known relationships between items within sets. We propose new measures that exploit a hierarchical domain structure in order to produce more intuitive similarity scores. We extend our similarity measures to provide appropriate results in the presence of multisets (also handled unsatisfactorily by traditional measures), for example, to correctly compute the similarity between customers who buy several instances of the same product (say milk), or who buy several products in the same category (say dairy products). We also provide an experimental comparison of our measures against traditional similarity measures, and report on a user study that evaluated how well our measures match human intuition.
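As a rough illustration of the limitation described in the abstract, the sketch below compares a plain intersection-based score (Jaccard) with a simple hierarchy-aware score on a toy product taxonomy. It is not the set of measures proposed in the paper: the `PARENT` hierarchy, the function names, and the scoring rule (depth of the lowest common ancestor divided by the depth of the deeper item) are all assumptions chosen for illustration, and multiset counts are ignored.

```python
# Hypothetical toy hierarchy (child -> parent); not taken from the paper.
PARENT = {
    "milk": "dairy",
    "cheese": "dairy",
    "dairy": "food",
    "bread": "bakery",
    "bakery": "food",
    "food": None,
}


def jaccard(a, b):
    """Intersection-based similarity: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def ancestors(item):
    """Return the path from item up to the root, starting with item itself."""
    path = []
    while item is not None:
        path.append(item)
        item = PARENT[item]
    return path


def depth(item):
    """Number of edges between item and the root (the root has depth 0)."""
    return len(ancestors(item)) - 1


def item_similarity(x, y):
    """Illustrative score: depth of the lowest common ancestor of x and y,
    divided by the depth of the deeper of the two items."""
    ancestors_of_y = set(ancestors(y))
    lca = next(node for node in ancestors(x) if node in ancestors_of_y)
    deepest = max(depth(x), depth(y))
    return 1.0 if deepest == 0 else depth(lca) / deepest


def hierarchical_similarity(a, b):
    """Average best-match item similarity, taken in both directions."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    a_to_b = sum(max(item_similarity(x, y) for y in b) for x in a) / len(a)
    b_to_a = sum(max(item_similarity(x, y) for x in a) for y in b) / len(b)
    return (a_to_b + b_to_a) / 2
```

With this toy data, a customer who bought only milk and one who bought only cheese share no items, so `jaccard` gives them a score of zero, while `hierarchical_similarity` gives them partial credit because both items fall under the same `dairy` category.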

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications; General Business, Management and Accounting; Information Systems

Link

https://dl.acm.org/doi/pdf/10.1145/635484.635487

Reference: 42 articles.

1. Approximate query processing using wavelets;Chakrabarti K.;Proceedings of VLDB,2000

Cited by 182 articles.

1. XML CLUSTERING FRAMEWORK BASED ON DOCUMENT CONTENT AND STRUCTURE IN A HETEROGENEOUS DIGITAL LIBRARY;Malaysian Journal of Computer Science;2023-04-30

2. PLRec: An Efficient Approach Towards E-Learning Recommendation Using LSTM-CNN Technique;Proceedings of the International Conference on Cognitive and Intelligent Computing;2023

3. Research on Digital Curriculum Collaborative Filtering Technology based on Feature Group;2022 International Symposium on Advances in Informatics, Electronics and Education (ISAIEE);2022-12

4. Multistage Cloud-Service Matching and Optimization Based on Hierarchical Decomposition of Design Tasks;Machines;2022-09-06

5. Scaling High-Quality Pairwise Link-Based Similarity Retrieval on Billion-Edge Graphs;ACM Transactions on Information Systems;2022-01-11