Abstract
Academic search systems aid users in finding information on specific topics of scientific interest and have evolved from early catalog-based library systems to modern web-scale systems. However, evaluating the performance of the underlying retrieval approaches remains a challenge. An increasing number of requirements for producing accurate retrieval results has to be considered, e.g., the close integration of the system's users. Due to these requirements, small to mid-size academic search systems cannot evaluate their retrieval systems in-house. Evaluation infrastructures for shared tasks alleviate this situation: they allow researchers to experiment with retrieval approaches in specific search and recommendation scenarios without building their own infrastructure. In this paper, we elaborate on the benefits and shortcomings of four state-of-the-art evaluation infrastructures for search and recommendation tasks with respect to the following requirements: support for online and offline evaluations, domain specificity of shared tasks, and reproducibility of experiments and results. In addition, we introduce an evaluation infrastructure concept design aimed at reducing the shortcomings of shared tasks for search and recommender systems.
Funder
GESIS – Leibniz-Institut für Sozialwissenschaften e.V.
Publisher
Springer Science and Business Media LLC
Subject
General Earth and Planetary Sciences, General Environmental Science
Cited by
4 articles.