Reliable benchmarking: requirements and solutions

Published:2017-11-03 Issue:1 Volume:21 Page:1-29
ISSN:1433-2779
Container-title:International Journal on Software Tools for Technology Transfer
language:en
Short-container-title:Int J Softw Tools Technol Transfer

Author:

Beyer Dirk, Löwe Stefan, Wendler Philipp

Abstract

Benchmarking is a widely used method in experimental computer science, in particular, for the comparative evaluation of tools and algorithms. As a consequence, a number of questions need to be answered in order to ensure proper benchmarking, resource measurement, and presentation of results, all of which is essential for researchers, tool developers, and users, as well as for tool competitions. We identify a set of requirements that are indispensable for reliable benchmarking and resource measurement of time and memory usage of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework can (on Linux systems) currently only be done by using the cgroup and namespace features of the kernel. We developed BenchExec, a ready-to-use, tool-independent, and open-source implementation of a benchmarking framework that fulfills all presented requirements, making reliable benchmarking and resource measurement easy. Our framework is able to work with a wide range of different tools, has proven its reliability and usefulness in the International Competition on Software Verification, and is used by several research groups worldwide to ensure reliable benchmarking. Finally, we present guidelines on how to present measurement results in a scientifically valid and comprehensible way.

Publisher

Springer Science and Business Media LLC

Subject

Information Systems, Software

Link

http://link.springer.com/article/10.1007/s10009-017-0469-y/fulltext.html

Reference 33 articles.

1. Balyo, T., Heule, M.J.H., Järvisalo, M.: SAT competition 2016: recent developments. In: Proceedings of AAAI Conference on Artificial Intelligence, pp. 5061–5063. AAAI Press (2017)

2. Barrett, C., Fontaine, P., Tinelli, C.: The SMT-LIB standard: version 2.5. Technical report, University of Iowa (2015). www.smt-lib.org

3. Beyer, D.: Competition on software verification (SV-COMP). In: Proceedings of TACAS, LNCS 7214, pp. 504–524. Springer (2012)

4. Beyer, D.: Second competition on software verification (Summary of SV-COMP 2013). In: Proceedings of TACAS, LNCS 7795, pp. 594–609. Springer (2013)

5. Beyer, D.: Software verification and verifiable witnesses (Report on SV-COMP 2015). In: Proceedings of TACAS, LNCS 9035, pp. 401–416. Springer (2015)

Cited by 114 articles.

1. Parallel program analysis on path ranges;Science of Computer Programming;2024-12

2. FM-Weck: Containerized Execution of Formal-Methods Tools;Lecture Notes in Computer Science;2024-09-13

3. Exploring Loose Coupling of Slicing with Dynamic Symbolic Execution on the JVM;Lecture Notes in Computer Science;2024-09-10

4. Refining CEGAR-Based Test-Case Generation with Feasibility Annotations;Lecture Notes in Computer Science;2024-09-10

5. Fast procedure to compute empirical and Bernstein copulas;Applied Mathematics and Computation;2024-09