Affiliations:
1. Massachusetts Institute of Technology
2. Quanta Research Cambridge
Abstract
Complex data queries, because of their need for random accesses, have proven to be slow unless all the data can be accommodated in DRAM. There are many domains, such as genomics, geological data, and daily Twitter feeds, where the datasets of interest are 5TB to 20TB. Accommodating such a dataset entirely in DRAM would require a cluster of roughly 100 servers, each with 128GB to 256GB of DRAM. On the other hand, such datasets could be stored easily in the flash memory of a rack-sized cluster. Flash storage has much better random access performance than hard disks, which makes it desirable for analytics workloads. However, off-the-shelf flash storage packaged as SSDs does not make effective use of flash because it incurs substantial overhead in flash device management and network access. In this article, we present BlueDBM, a new system architecture that provides flash-based storage with in-store processing capability and a low-latency, high-throughput inter-controller network between storage devices. We show that BlueDBM outperforms a flash-based system without these features by a factor of 10 for some important applications. While the performance of a DRAM-centric system falls sharply when even 5% to 10% of the references go to secondary storage, BlueDBM suffers no such sharp degradation. BlueDBM presents an attractive point in the cost/performance tradeoff for Big Data analytics.
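As a back-of-the-envelope check on the cluster sizing above (the multiplication is ours; the per-server DRAM figures and dataset range are from the abstract), the aggregate DRAM of such a cluster is

$100 \times 128\,\mathrm{GB} = 12.8\,\mathrm{TB} \qquad\text{and}\qquad 100 \times 256\,\mathrm{GB} = 25.6\,\mathrm{TB},$

which brackets the 5TB-to-20TB datasets the abstract targets.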
Publisher
Association for Computing Machinery (ACM)
Cited by
17 articles.