Approximate Matching Between XML Documents and Schemas with Applications in XML Classification and Clustering

Author:

Xing Guangming (1)

Affiliation:

1. Western Kentucky University, USA

Abstract

Classifying or clustering XML documents by their structural information is important for many document management tasks. In this chapter, we present a suite of algorithms for computing the cost of approximate matching between XML documents and schemas. Building on the resulting distances between XML documents and schemas, we then present a framework for classifying and clustering XML documents by structure. The backbone of the framework is a feature representation that describes each document by a vector of these distances. Experimental studies on various XML data sets suggest that the approach is an efficient and effective solution for structural classification and clustering of XML documents.
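The distance-vector representation mentioned in the abstract can be illustrated with a minimal sketch. This is not the chapter's matching algorithm: the abstract does not specify it, so the sketch swaps in a toy cost, assuming each schema is reduced to a set of permitted parent-child element pairs and the distance is the number of distinct document edges the schema does not permit. The names `toy_distance`, `distance_vector`, `classify`, `SCHEMAS`, and `DOC` are all hypothetical.

```python
import xml.etree.ElementTree as ET


def parent_child_pairs(xml_text):
    """Return the set of (parent_tag, child_tag) edges in the document tree."""
    root = ET.fromstring(xml_text)
    pairs = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in node:
            pairs.add((node.tag, child.tag))
            stack.append(child)
    return pairs


def toy_distance(xml_text, allowed_pairs):
    """Toy stand-in cost (an assumption, not the chapter's measure):
    count distinct document edges that the schema does not permit."""
    return len(parent_child_pairs(xml_text) - allowed_pairs)


def distance_vector(xml_text, schemas):
    """Feature vector: one distance per schema, in the dict's insertion order."""
    return [toy_distance(xml_text, allowed) for allowed in schemas.values()]


def classify(xml_text, schemas):
    """Assign the document to the schema with the smallest distance."""
    names = list(schemas)
    vec = distance_vector(xml_text, schemas)
    return names[vec.index(min(vec))]


# Hypothetical schemas, simplified to sets of permitted parent-child pairs.
SCHEMAS = {
    "book": {("book", "title"), ("book", "author"),
             ("book", "chapter"), ("chapter", "title")},
    "article": {("article", "title"), ("article", "author"),
                ("article", "abstract")},
}

# Hypothetical document: nearly a valid "book", with one unexpected element.
DOC = ("<book><title>T</title><author>A</author>"
       "<chapter><title>C</title><note>n</note></chapter></book>")

if __name__ == "__main__":
    print(distance_vector(DOC, SCHEMAS))
    print(classify(DOC, SCHEMAS))
```

The same vectors could be passed to any standard classifier or clustering routine; picking the nearest schema is only the simplest use of them.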

Publisher

IGI Global

