Abstract
Traditional relevance feedback techniques can help improve retrieval performance. They usually use the most frequent terms in the relevant documents to enrich the user's initial query. We re-examine this method and find that many expansion terms identified by traditional approaches are in fact unrelated to the query and harmful to retrieval. This paper introduces a Support Vector Machine (SVM) based method to improve the retrieval results. First, a classifier is trained on the feedback documents. Then, this classifier is used to classify the remaining documents, and the documents classified as relevant are moved ahead of those classified as irrelevant. Because this approach avoids modifying the initial query, it represents a new direction for relevance feedback techniques. Our experiments on TREC data demonstrate that retrieval effectiveness can be improved by more than 24.37% when the proposed approach is used.
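The two steps described above (train on the feedback documents, then reorder the rest) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the bag-of-words features, the linear kernel, the plain subgradient training loop, the hyperparameters, the choice to leave judged documents in their original positions, and all function names are hypothetical.

```python
from collections import Counter

def featurize(text):
    """Bag-of-words term counts, L2-normalised (an assumed feature choice)."""
    counts = Counter(text.lower().split())
    norm = sum(v * v for v in counts.values()) ** 0.5 or 1.0
    return {term: v / norm for term, v in counts.items()}

def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(term, 0.0) * v for term, v in x.items())

def train_linear_svm(examples, lam=0.01, eta=0.1, epochs=100):
    """Subgradient descent on the L2-regularised hinge loss.

    examples: list of (feature_dict, label) with label in {+1, -1}.
    lam, eta and epochs are assumed values, not taken from the paper.
    """
    w, b = {}, 0.0
    for _ in range(epochs):
        for x, y in examples:
            violated = y * (dot(w, x) + b) < 1   # inside the margin?
            for term in w:                        # regularisation shrink
                w[term] *= 1 - eta * lam
            if violated:                          # hinge-loss subgradient step
                for term, v in x.items():
                    w[term] = w.get(term, 0.0) + eta * y * v
                b += eta * y
    return w, b

def rerank(ranked_docs, feedback):
    """Reorder an initial ranking using relevance feedback.

    ranked_docs: document texts in initial rank order.
    feedback: {position: +1 (relevant) or -1 (irrelevant)} for judged documents.
    Returns positions in the new order: judged documents first (unchanged),
    then unjudged documents classified relevant, then those classified
    irrelevant, each group keeping its original relative order.
    """
    vectors = [featurize(doc) for doc in ranked_docs]
    w, b = train_linear_svm([(vectors[i], y) for i, y in sorted(feedback.items())])
    rest = [i for i in range(len(ranked_docs)) if i not in feedback]
    relevant = [i for i in rest if dot(w, vectors[i]) + b > 0]
    irrelevant = [i for i in rest if i not in relevant]
    return sorted(feedback) + relevant + irrelevant
```

Note that the query itself never appears in this sketch: only the order of the initial result list changes, which is the property the abstract emphasises. A kernel SVM or TF-IDF weighting could be substituted without changing the overall flow.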
Publisher
Trans Tech Publications, Ltd.