Affiliation:
1. Shanghai Jiao Tong University, Shanghai, China
Abstract
Learning to rank has become increasingly important for many information retrieval applications. To reduce the labeling cost of preparing training data, many active sampling algorithms have been proposed. In this article, we propose a novel active learning-for-ranking strategy called ranking-based sensitivity sampling (RSS), tailored for the Gradient Boosting Decision Tree (GBDT), a machine-learned ranking method widely used by major commercial search engines. We leverage the property of GBDT that samples close to the decision boundary tend to be sensitive to perturbations, and design the active learning strategy accordingly. We further analyze the proposed strategy theoretically by exploring the connection between the sensitivity used for sample selection and model regularization, providing a potential theoretical guarantee on generalization capability. Because ranking performance metrics weight top-ranked items more heavily, item rank is incorporated into the selection function. In addition, we generalize the proposed technique to several other base learners to show its potential applicability to a wide variety of applications. Extensive experiments on both a benchmark dataset and a real-world dataset demonstrate that the proposed active learning strategy is highly effective at selecting the most informative examples.
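The core idea described above can be sketched in a few lines of code. The sketch below is illustrative only and is not the paper's implementation: it uses a toy two-stump scorer standing in for a trained GBDT ensemble, estimates each candidate's sensitivity as the average absolute score change under small random input perturbations (points near a tree-split threshold flip branches, so their scores move the most), and weights that sensitivity by the item's current rank with a hypothetical `rank_decay` factor to favor top-ranked items. All function names and parameters here are assumptions made for illustration.

```python
import random

def toy_gbdt_score(x):
    """Stand-in for a trained GBDT: the sum of two decision stumps.
    Real GBDT scores are likewise piecewise constant, which is why
    perturbation sensitivity concentrates near split thresholds."""
    s = 1.0 if x[0] > 0.3 else -1.0
    s += 0.5 if x[1] > -0.2 else -0.5
    return s

def sensitivity(score_fn, x, eps=0.05, n_perturb=16, rng=None):
    """Average |score change| of x under small Gaussian perturbations.
    Points far from every split threshold score near zero; points close
    to a threshold flip a branch and score high."""
    rng = rng or random.Random(0)
    base = score_fn(x)
    total = 0.0
    for _ in range(n_perturb):
        xp = [xi + rng.gauss(0.0, eps) for xi in x]
        total += abs(score_fn(xp) - base)
    return total / n_perturb

def select_queries(score_fn, pool, k, rank_decay=0.9):
    """Pick the k pool items with the highest rank-weighted sensitivity.
    Items the current model ranks higher get a larger weight
    (rank_decay ** rank), mirroring the emphasis on top-ranked items."""
    scores = [score_fn(x) for x in pool]
    order = sorted(range(len(pool)), key=lambda i: -scores[i])
    rank_of = {i: r for r, i in enumerate(order)}  # rank 0 = top-ranked
    weighted = [(sensitivity(score_fn, pool[i]) * rank_decay ** rank_of[i], i)
                for i in range(len(pool))]
    weighted.sort(reverse=True)
    return [i for _, i in weighted[:k]]

# Toy demo: select the 5 most informative items from a random pool.
rng = random.Random(1)
pool = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(50)]
picked = select_queries(toy_gbdt_score, pool, k=5)
```

In an actual active-learning loop, the selected items would be sent for labeling, added to the training set, and the GBDT retrained before the next selection round.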
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications
Cited by 2 articles.
1. Clustering-Based Transductive Semi-Supervised Learning for Learning-to-Rank;International Journal of Pattern Recognition and Artificial Intelligence;2019-11
2. Dynamic Information Retrieval Modeling;Synthesis Lectures on Information Concepts, Retrieval, and Services;2016-06-15