An Overview of Similarity Measures for Clustering XML Documents

Published: 2007 Pages: 56-78
Container-title: Web Data Management Practices

Author:

Guerrini Giovanna¹, Mesiti Marco², Sanz Ismael³

Affiliation:

1. Università degli Studi di Genova, Italy

2. Università degli Studi di Milano, Italy

3. Universitat Jaume I, Spain

Abstract

The large number and heterogeneity of XML documents on the Web call for clustering techniques that group similar documents together. Documents can be grouped according to their content, their structure, and the links within and among them. For instance, grouping documents with similar structures has applications in information extraction, heterogeneous data integration, personalized content delivery, access control definition, web site structural analysis, and the comparison of RNA secondary structures. Many approaches have been proposed for evaluating structural and content similarity between tree-based and vector-based representations of XML documents, and link-based similarity approaches developed for Web data clustering have been adapted to XML documents. This chapter discusses and compares the most relevant similarity measures and their use in XML document clustering.

Publisher

IGI Global

Cited by 9 articles.

1. A review on semantic similarity measures for ontology;Journal of Intelligent & Fuzzy Systems;2019-04-10

2. Approximating the Schema of a Set of Documents by Means of Resemblance;Journal on Data Semantics;2018-06

3. Automatic Clustering of Research Articles Using Domain Ontology and Fuzzy Logic;Lecture Notes in Computer Science;2015

4. XML clustering: a review of structural approaches;The Knowledge Engineering Review;2014-10-29

5. Approximate XML Query Processing;Advanced Query Processing;2013