Top-K data source selection for keyword queries over multiple XML data sources

Published:2012-03-05 Issue:2 Volume:38 Page:156-175
ISSN:0165-5515
Container-title:Journal of Information Science
language:en
Short-container-title:Journal of Information Science

Author:

Nguyen Khanh¹,Cao Jinli¹

Affiliation:

1. La Trobe University, Australia

Abstract

With the proliferation of XML data, searching XML data using keyword queries has attracted much attention. However, most current approaches focus on keyword-based searches over a single XML document. Searching a system that integrates hundreds or even thousands of data sources by sequentially querying every single source is extremely costly, and thus may be impractical. In this article we propose a novel approach for selecting the top-K data sources according to their relevance to a given query, avoiding the high cost of searching numerous, potentially irrelevant data sources. Our approach summarizes the data sources as succinct synopses for the rapid filtering of non-promising sources. We maintain both structural and value distribution information for each data source, and propose a novel ranking function to effectively measure the relevance of a data source to the given query. We conducted experiments with real datasets, and the results show that our approach achieves high performance on all evaluation metrics (recall, precision and Spearman’s rank correlation coefficient) under different experimental parameters.
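The abstract does not specify the synopsis structure or the ranking function, so the following is only a minimal sketch of the surrounding workflow: picking the top-K sources from per-source relevance scores and computing the three metrics named above. The score values, source ids, and function names are hypothetical placeholders, not the authors' method; Spearman's coefficient uses the standard no-ties formula.

```python
from typing import Dict, List, Set, Tuple


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k source ids with the highest scores (ties broken by id)."""
    return sorted(scores, key=lambda s: (-scores[s], s))[:k]


def precision_recall(selected: List[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision and recall of the selected sources against a relevant set."""
    hits = sum(1 for s in selected if s in relevant)
    precision = hits / len(selected) if selected else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def spearman_rho(rank_a: List[str], rank_b: List[str]) -> float:
    """Spearman's rank correlation for two tie-free rankings of the same items:
    rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(rank_a)
    if n < 2 or set(rank_a) != set(rank_b) or len(set(rank_a)) != n:
        raise ValueError("rankings must order the same >= 2 distinct items")
    pos_b = {s: i for i, s in enumerate(rank_b)}
    d2 = sum((i - pos_b[s]) ** 2 for i, s in enumerate(rank_a))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical relevance scores, standing in for the output of a ranking
# function evaluated on each source's synopsis (assumed values).
estimated = {"s1": 0.9, "s2": 0.4, "s3": 0.7, "s4": 0.1}
selected = top_k(estimated, k=2)

# Hypothetical ground truth: the truly relevant sources and the ideal order.
relevant = {"s1", "s2"}
ideal_order = ["s1", "s2", "s3", "s4"]

precision, recall = precision_recall(selected, relevant)
rho = spearman_rho(top_k(estimated, k=4), ideal_order)
```

With these assumed inputs, only the selected sources would be queried in full; the metrics compare that selection and ordering against a ground truth obtained by searching every source.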

Publisher

SAGE Publications

Subject

Library and Information Sciences,Information Systems

Link

http://journals.sagepub.com/doi/pdf/10.1177/0165551511435875

References: 32 articles.

1. Keyword proximity search in XML trees

Cited by 4 articles.

1. Data Source Importance Evaluation for Highway Networks: A Complex Network-Based Approach;Promet - Traffic&Transportation;2024-08-27

2. Efficient keyword search over graph-structured data based on minimal covered r-cliques;Frontiers of Information Technology & Electronic Engineering;2020-03

3. Towards improving XML search by using structure clustering technique;Journal of Information Science;2014-12-12

4. A query transformation framework for automated structured query construction in structured retrieval environment;Journal of Information Science;2014-01-24