Understanding and Combating Memory Bloat in Managed Data-Intensive Systems

Published:2018-02-23 Issue:4 Volume:26 Page:1-41
ISSN:1049-331X
Container-title:ACM Transactions on Software Engineering and Methodology
language:en
Short-container-title:ACM Trans. Softw. Eng. Methodol.

Author:

Nguyen Khanh¹,Wang Kai¹,Bu Yingyi¹,Fang Lu¹,Xu Guoqing¹

Affiliation:

1. University of California, Irvine, CA

Abstract

The past decade has witnessed increasing demands on data-driven business intelligence that led to the proliferation of data-intensive applications. A managed object-oriented programming language such as Java is often the developer’s choice for implementing such applications, due to its quick development cycle and rich suite of libraries and frameworks. While the use of such languages makes programming easier, their automated memory management comes at a cost. When the managed runtime meets large volumes of input data, memory bloat is significantly magnified and becomes a scalability-prohibiting bottleneck. This article first studies, analytically and empirically, the impact of bloat on the performance and scalability of large-scale, real-world data-intensive systems. To combat bloat, we design a novel compiler framework, called Facade, that can generate highly efficient data manipulation code by automatically transforming the data path of an existing data-intensive application. The key treatment is that in the generated code, the number of runtime heap objects created for data classes in each thread is (almost) statically bounded, leading to significantly reduced memory management cost and improved scalability. We have implemented Facade and used it to transform seven common applications on three real-world, already well-optimized data processing frameworks: GraphChi, Hyracks, and GPS. Our experimental results are very positive: the generated programs have (1) achieved a 3% to 48% execution time reduction and an up to 88× GC time reduction, (2) consumed up to 50% less memory, and (3) scaled to much larger datasets.
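The abstract's central claim, that the number of heap objects created for data classes can be bounded regardless of input size, can be illustrated with a small hand-written Java sketch. This is not the code Facade generates and is not taken from the article: the class names (`FacadeSketch`, `VertexStore`, `VertexFacade`), the record layout, and the use of a direct `ByteBuffer` are all assumptions chosen for illustration. It only shows the general idea of keeping record data in one flat buffer and reading it through a single reusable accessor object, so that the garbage collector sees a constant number of objects instead of one per record.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; names and layout are assumptions,
// not the output of the Facade compiler described in the abstract.
public class FacadeSketch {
    // Assumed record layout: int id (4 bytes) followed by double rank (8 bytes).
    static final int RECORD_BYTES = Integer.BYTES + Double.BYTES;

    /** Holds every record in one off-heap buffer instead of one heap object per record. */
    static final class VertexStore {
        private final ByteBuffer page;
        private int count = 0;

        VertexStore(int capacity) {
            page = ByteBuffer.allocateDirect(capacity * RECORD_BYTES);
        }

        /** Appends a record and returns its index, which acts as a reference to it. */
        int add(int id, double rank) {
            int ref = count++;
            int offset = ref * RECORD_BYTES;
            page.putInt(offset, id);
            page.putDouble(offset + Integer.BYTES, rank);
            return ref;
        }

        int size() {
            return count;
        }
    }

    /** A single reusable accessor that can be pointed at any record in the store. */
    static final class VertexFacade {
        private final VertexStore store;
        private int ref;

        VertexFacade(VertexStore store) {
            this.store = store;
        }

        VertexFacade bind(int ref) {
            this.ref = ref;
            return this;
        }

        int id() {
            return store.page.getInt(ref * RECORD_BYTES);
        }

        double rank() {
            return store.page.getDouble(ref * RECORD_BYTES + Integer.BYTES);
        }

        void setRank(double rank) {
            store.page.putDouble(ref * RECORD_BYTES + Integer.BYTES, rank);
        }
    }

    /** Sums all ranks while allocating exactly one accessor object, whatever the record count. */
    static double sumRanks(VertexStore store) {
        VertexFacade facade = new VertexFacade(store);
        double sum = 0.0;
        for (int i = 0; i < store.size(); i++) {
            sum += facade.bind(i).rank();
        }
        return sum;
    }

    public static void main(String[] args) {
        VertexStore store = new VertexStore(1000);
        for (int i = 0; i < 1000; i++) {
            store.add(i, 1.0);
        }
        System.out.println(sumRanks(store));
    }
}
```

In this sketch the loop over 1000 records creates one `VertexFacade` rather than 1000 record objects, and the same would hold for a million records. A conventional design with a `Vertex` class per record would instead grow the heap object count linearly with the input, which is the kind of cost the abstract attributes to memory bloat.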

Funder

Office of Naval Research

National Science Foundation

Publisher

Association for Computing Machinery (ACM)

Subject

Software

Link

https://dl.acm.org/doi/pdf/10.1145/3162626

Reference: 123 articles.

1. Optimizing joins in a map-reduce environment

2. Scheduling shared scans of large data files

3. Better static memory management

4. Performance analysis of idle programs

5. Apache 2014a. Apache Flink. Retrieved from http://flink.apache.org/.

Cited by 8 articles.

1. CLOUD-QM: a quality model for benchmarking cloud-based enterprise information systems;Software Quality Journal;2024-05-14

2. Towards Speedy Permission-Based Debloating for Android Apps;Proceedings of the IEEE/ACM 11th International Conference on Mobile Software Engineering and Systems;2024-04-14

3. Coverage-Based Debloating for Java Bytecode;ACM Transactions on Software Engineering and Methodology;2023-04-04

4. Prioritising test scripts for the testing of memory bloat in web applications;IET Software;2022-03-03

5. XDebloat: Towards Automated Feature-Oriented App Debloating;IEEE Transactions on Software Engineering;2021