Affiliation:
1. Department of Computer Sciences, The University of Texas at Austin, Austin, Texas
2. Department of Computer Science, University of Massachusetts, Amherst, Massachusetts
Abstract
Parallel, multithreaded C and C++ programs such as web servers, database managers, news servers, and scientific applications are becoming increasingly prevalent. For these applications, the memory allocator is often a bottleneck that severely limits program performance and scalability on multiprocessor systems. Previous allocators suffer from problems that include poor performance and scalability, and heap organizations that introduce false sharing. Worse, many allocators exhibit a dramatic increase in memory consumption when confronted with a producer-consumer pattern of object allocation and freeing. This increase in memory consumption can range from a factor of P (the number of processors) to unbounded memory consumption.
This paper introduces Hoard, a fast, highly scalable allocator that largely avoids false sharing and is memory efficient. Hoard is the first allocator to simultaneously solve the above problems. Hoard combines one global heap and per-processor heaps with a novel discipline that provably bounds memory consumption and has very low synchronization costs in the common case. Our results on eleven programs demonstrate that Hoard yields low average fragmentation and improves overall program performance over the standard Solaris allocator by up to a factor of 60 on 14 processors, and up to a factor of 18 over the next best allocator we tested.
Publisher
Association for Computing Machinery (ACM)
Cited by
14 articles.
1. VCMalloc: A Virtually Contiguous Memory Allocator;IEEE Transactions on Computers;2023-12
2. Application of Thread-Local Garbage Collection to Distributed Systems for Large-Scale Data Processing;The Herald of the Siberian State University of Telecommunications and Informatics;2022-05-13
3. The Demikernel Datapath OS Architecture for Microsecond-scale Datacenter Systems;Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles CD-ROM;2021-10-26
4. Judging a type by its pointer: optimizing GPU virtual functions;Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems;2021-04-17
5. Data Races and the Discrete Resource-time Tradeoff Problem with Resource Reuse over Paths;The 31st ACM Symposium on Parallelism in Algorithms and Architectures;2019-06-17