Affiliation:
1. Department of Computer Sciences, The University of Texas at Austin, Austin, Texas
2. Department of Computer Science, University of Massachusetts, Amherst, Massachusetts
Abstract
Parallel, multithreaded C and C++ programs such as web servers, database managers, news servers, and scientific applications are becoming increasingly prevalent. For these applications, the memory allocator is often a bottleneck that severely limits program performance and scalability on multiprocessor systems. Previous allocators suffer from problems that include poor performance and scalability, and heap organizations that introduce false sharing. Worse, many allocators exhibit a dramatic increase in memory consumption when confronted with a producer-consumer pattern of object allocation and freeing. This increase in memory consumption can range from a factor of P (the number of processors) to unbounded memory consumption.
This paper introduces Hoard, a fast, highly scalable allocator that largely avoids false sharing and is memory efficient. Hoard is the first allocator to simultaneously solve the above problems. Hoard combines one global heap and per-processor heaps with a novel discipline that provably bounds memory consumption and has very low synchronization costs in the common case. Our results on eleven programs demonstrate that Hoard yields low average fragmentation and improves overall program performance over the standard Solaris allocator by up to a factor of 60 on 14 processors, and up to a factor of 18 over the next best allocator we tested.
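The producer-consumer blowup mentioned in the abstract can be illustrated with a toy accounting model. This is a minimal sketch under stated assumptions, not Hoard's algorithm: it tracks only counts of equal-sized free blocks, assumes one producer thread and one consumer thread with at most one block live at a time, and uses a hypothetical `simulate` function and `return_to_global` flag that do not come from the paper. With purely private heaps, every freed block lands on the consumer's heap where the producer never finds it. Routing freed memory through a shared global heap (a simplified stand-in for the discipline the abstract refers to) lets the producer reuse it.

```python
def simulate(rounds, return_to_global):
    """Count blocks requested from the OS in a toy producer-consumer run.

    Assumption: each round the producer allocates one block and the
    consumer frees it, so at most one block is ever live.
    """
    # Number of free blocks held by each heap (hypothetical heap names).
    free = {"producer": 0, "consumer": 0, "global": 0}
    from_os = 0
    for _ in range(rounds):
        # Producer allocates: try its own heap, then the global heap,
        # and only then ask the OS for fresh memory.
        if free["producer"] > 0:
            free["producer"] -= 1
        elif free["global"] > 0:
            free["global"] -= 1
        else:
            from_os += 1
        # Consumer frees the block it received.
        if return_to_global:
            free["global"] += 1    # memory becomes reusable by the producer
        else:
            free["consumer"] += 1  # memory is stranded on the consumer's heap
    return from_os


private_only = simulate(1000, return_to_global=False)
with_global = simulate(1000, return_to_global=True)
print(private_only, with_global)
```

In this model the private-heap variant requests a new block from the OS on every round, so its footprint grows with the length of the run even though only one block is ever in use, while the variant with a global heap requests a single block. A real allocator would move memory to the global heap in larger units and only under certain conditions; this sketch ignores both.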
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
63 articles.
1. PMAlloc: A Holistic Approach to Improving Persistent Memory Allocation;ACM Transactions on Computer Systems;2024-02-03
2. OOM-Guard: Towards Improving the Ergonomics of Rust OOM Handling via a Reservation-Based Approach;Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering;2023-11-30
3. PMA: A Persistent Memory Allocator with High Efficiency and Crash Consistency Guarantee;2023 IEEE 41st International Conference on Computer Design (ICCD);2023-11-06
4. Partial Failure Resilient Memory Management System for (CXL-based) Distributed Shared Memory;Proceedings of the 29th Symposium on Operating Systems Principles;2023-10-23
5. Beyond RSS: Towards Intelligent Dynamic Memory Management (Work in Progress);Proceedings of the 20th ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes;2023-10-19