Abstract
Federated search or distributed information retrieval routes the user's search query to multiple component collections and presents a merged result list in ranked order by comparing the relevance score of each returned result. However, the heterogeneity of the component collections makes it challenging for the central broker to compare these relevance scores while fusing the results into a single ranked list. To address this issue, most existing approaches merge the returned results by converting the document ranks to their ranking scores or downloading the documents and computing their relevance score. However, these approaches are not efficient enough, because the former methods suffer from limited efficacy of result merging due to the negligible number of overlapping documents and the latter are resource intensive. The current paper addresses this problem by proposing a new method that extracts features of both documents and component collections from the available information provided by the collections at query time. Each document and its collection features are exploited together to establish the document relevance score. The ant colony optimization is used for information retrieval to create a merged result list. The experimental results with the TREC 2013 FedWeb dataset demonstrate that the proposed method significantly outperforms the baseline approaches.
Publisher
Engineering, Technology & Applied Science Research