Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents?

Authors:

Tetsuya Sakai¹, Sijie Tao¹, Zhaohao Zeng¹

Affiliation:

1. Waseda University, Okubo, Shinjuku, Tokyo, Japan

Abstract

In the context of depth-k pooling for constructing web search test collections, we compare two approaches to ordering pooled documents for relevance assessors: the prioritisation strategy (PRI) used widely at NTCIR, and the simple randomisation strategy (RND). In order to address research questions regarding PRI and RND, we have constructed and released the WWW3E8 dataset, which contains eight independent relevance labels for 32,375 topic-document pairs, i.e., a total of 259,000 labels. Four of the eight relevance labels were obtained from PRI-based pools; the other four were obtained from RND-based pools. Using WWW3E8, we compare PRI and RND in terms of inter-assessor agreement, system ranking agreement, and robustness to new systems that did not contribute to the pools. We also utilise an assessor activity log we obtained as a byproduct of WWW3E8 to compare the two strategies in terms of assessment efficiency. Our main findings are: (a) the presentation order has no substantial impact on assessment efficiency; (b) while the presentation order substantially affects which documents are judged (highly) relevant, the difference between the inter-assessor agreement under the PRI condition and that under the RND condition is of no practical significance; (c) different system rankings under the PRI condition are substantially more similar to one another than those under the RND condition; and (d) PRI-based relevance assessment files (qrels) are substantially and statistically significantly more robust to new systems than RND-based ones. Finding (d) suggests that PRI helps the assessors identify relevant documents that affect the evaluation of many existing systems, including those that did not contribute to the pools. Hence, if researchers need to evaluate their current IR systems using legacy IR test collections, we recommend the use of those constructed using the PRI approach unless they have a good reason to believe that their systems retrieve relevant documents that are vastly different from the pooled documents. While this robustness of PRI may also mean that the PRI-based pools are biased against future systems that retrieve highly novel relevant documents, one should note that there is no evidence that RND is any better in this respect.
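
To make the two pooled-document orderings concrete, the sketch below builds a depth-k pool from a set of runs and then orders it either PRI-style or RND-style. The priority used here (documents retrieved by more runs, and at better ranks, come first) is only an illustrative assumption and does not reproduce the actual NTCIR prioritisation formula; the run data, function names, and fixed random seed are likewise hypothetical.

```python
import random
from collections import defaultdict


def depth_k_pool(runs, k):
    """Union of the top-k documents from each run.

    runs: dict mapping run_id -> ranked list of doc_ids (best first).
    Returns: dict mapping doc_id -> list of (run_id, rank) contributions.
    """
    contributions = defaultdict(list)
    for run_id, ranking in runs.items():
        for rank, doc_id in enumerate(ranking[:k], start=1):
            contributions[doc_id].append((run_id, rank))
    return contributions


def order_pri(contributions):
    """PRI-style ordering (illustrative only): documents retrieved by more
    runs are shown to the assessor first; ties are broken by the best rank
    at which any run retrieved the document."""
    def priority(doc_id):
        contribs = contributions[doc_id]
        return (-len(contribs), min(rank for _, rank in contribs))
    return sorted(contributions, key=priority)


def order_rnd(contributions, seed=0):
    """RND-style ordering: a simple shuffle of the pooled documents."""
    docs = list(contributions)
    random.Random(seed).shuffle(docs)
    return docs


if __name__ == "__main__":
    # Hypothetical runs, each a ranked list of document IDs.
    runs = {
        "runA": ["d3", "d1", "d7", "d2"],
        "runB": ["d1", "d5", "d3", "d9"],
    }
    pool = depth_k_pool(runs, k=3)
    print("PRI order:", order_pri(pool))
    print("RND order:", order_rnd(pool))
```

In both conditions the same pool is judged; only the order in which documents are presented to the assessor differs, which is exactly the variable the paper studies.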

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications, General Business, Management and Accounting, Information Systems


Cited by 4 articles.

1. Dual cycle generative adversarial networks for web search. Applied Soft Computing, 2024-03.

2. On the Ordering of Pooled Web Pages, Gold Assessments, and Bronze Assessments. ACM Transactions on Information Systems, 2023-08-21.

3. How Many Crowd Workers Do I Need? On Statistical Power when Crowdsourcing Relevance Judgments. ACM Transactions on Information Systems, 2023-08-18.

4. Examining User Heterogeneity in Digital Experiments. ACM Transactions on Information Systems, 2023-01-12.
