Affiliation:
1. Imperial College London, London, UK
Abstract
We present an independent empirical study on concurrency testing using controlled schedulers. We have gathered 49 buggy concurrent software benchmarks, drawn from public code bases, which we call
SCTBench
. We applied a modified version of an existing concurrency testing tool to SCTBench, testing five controlled scheduling techniques: depth-first search, preemption bounding, delay bounding, a controlled random scheduler, and probabilistic concurrency testing (PCT). We attempt to answer several research questions: Which technique performs the best, in terms of bug finding ability? How effective are the two main schedule bounding techniques—preemption bounding and delay bounding—at finding bugs? What challenges are associated with applying concurrency testing techniques to existing code? Can we classify certain benchmarks as trivial or nontrivial? Overall, we found that PCT (with parameter
d
= 3) was the most effective technique in terms of bug finding; it found all bugs found by the other techniques, plus an additional three, and it missed only one bug. Surprisingly, we found that the naive controlled random scheduler, which randomly chooses one thread to execute at each scheduling point, performs well, finding more bugs than preemption bounding and just two fewer bugs than delay bounding. Our findings confirm that delay bounding is superior to preemption bounding and that schedule bounding is superior to an unbounded depth-first search. The majority of bugs in SCTBench can be exposed using a small schedule bound (1--2), supporting previous claims, although one benchmark requires five preemptions. We found that the need to remove nondeterminism and control all synchronization (as is required for
systematic
concurrency testing) can be nontrivial. There were eight distinct programs that could not easily be included in out study, such as those that perform network and interprocess communication. We report various properties about the benchmarks tested, such as the fact that the bugs in 18 benchmarks were exposed 50% of the time when using random scheduling. We note that future work should not use the benchmarks that we classify as trivial when presenting new techniques, other than as a minimum baseline. We have made SCTBench and our tools publicly available for reproducibility and use in future work.
Publisher
Association for Computing Machinery (ACM)
Subject
Computational Theory and Mathematics,Computer Science Applications,Hardware and Architecture,Modeling and Simulation,Software
Reference47 articles.
1. Context-Bounded Analysis of TSO Systems
2. Producing scheduling that causes concurrent programs to fail
3. A few billion lines of code later
4. Christian Bienia. 2011. Benchmarking Modern Multiprocessors. Ph.D. Dissertation. Princeton University Princeton NJ. Christian Bienia. 2011. Benchmarking Modern Multiprocessors. Ph.D. Dissertation. Princeton University Princeton NJ.
Cited by
19 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献