Authors:
Arian Askari, Amin Abolghasemi, Gabriella Pasi, Wessel Kraaij, Suzan Verberne
Abstract
In this paper we propose a novel approach for combining first-stage lexical retrieval models and Transformer-based re-rankers: we inject the relevance score of the lexical model as a token into the input of the cross-encoder re-ranker. Prior work has shown that interpolation between the relevance scores of lexical retrievers and Bidirectional Encoder Representations from Transformers (BERT) based re-rankers may not consistently result in higher effectiveness. Our idea is motivated by the finding that BERT models can capture numeric information. We compare several representations of the Best Match 25 (BM25) and Dense Passage Retrieval (DPR) scores and inject them as text into the input of four different cross-encoders. Since knowledge distillation, i.e., teacher-student training, has proved highly effective for cross-encoder re-rankers, we additionally analyze the effect of injecting the relevance score into the student model while training it with three larger teacher models. Evaluation on the MS MARCO Passage collection and the TREC DL collections shows that the proposed method significantly improves over all cross-encoder re-rankers as well as the common interpolation methods. We show that the improvement is consistent for all query types. We also find an improvement in exact-matching capabilities over both the first-stage rankers and the cross-encoders. Our findings indicate that cross-encoder re-rankers can be efficiently improved, without additional computational burden or extra steps in the pipeline, by adding the output of the first-stage ranker to the model input. This effect is robust across models and query types.
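To make the score-injection idea concrete, the sketch below prepends a rounded BM25 score, rendered as plain text, to the query before the query-passage pair is fed to a cross-encoder. This is a minimal sketch under stated assumptions, not the paper's exact implementation: the checkpoint `cross-encoder/ms-marco-MiniLM-L-6-v2`, the two-decimal rounding, and the bare-number format are illustrative choices (the paper compares several score representations and models).

```python
# Hedged sketch of injecting a first-stage (BM25) score as text into a
# cross-encoder input. Checkpoint and score formatting are assumptions,
# not the paper's verbatim setup.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "cross-encoder/ms-marco-MiniLM-L-6-v2"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def rerank_score(query: str, passage: str, bm25_score: float) -> float:
    # Inject the lexical relevance score as a textual token on the query
    # side; the cross-encoder then sees both the score and the text.
    query_with_score = f"{bm25_score:.2f} {query}"
    inputs = tokenizer(query_with_score, passage,
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 1) for this checkpoint
    return logits.squeeze().item()

# Usage: re-rank a BM25 candidate with its first-stage score injected.
print(rerank_score("what is bm25",
                   "BM25 is a bag-of-words ranking function used in search.",
                   21.37))
```

In a full pipeline, the same injected representation would also be used during fine-tuning (and, per the paper, during knowledge distillation from larger teachers), so that the model learns to exploit the score token rather than merely tolerate it.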
Funder
EU Horizon 2020 ITN/ETN on Domain Specific Systems for Information Extraction and Retrieval
Publisher
Springer Science and Business Media LLC
Cited by
1 article.