OrcFS

Author:

Yoo Jinsoo1,Oh Joontaek1,Lee Seongjin2,Won Youjip1,Ha Jin-Yong3,Lee Jongsung3,Shim Junseok3

Affiliation:

1. Hanyang University

2. Gyeongsang National University

3. Samsung Electronics

Abstract

In this work, we develop the Orchestrated File System (OrcFS) for Flash storage. OrcFS vertically integrates the log-structured file system and the Flash-based storage device to eliminate the redundancies across the layers. A few modern file systems adopt sophisticated append-only data structures in an effort to optimize the behavior of the file system with respect to the append-only nature of the Flash memory. While the benefit of adopting an append-only data structure seems fairly promising, it makes the stack of software layers full of unnecessary redundancies, leaving substantial room for improvement. The redundancies include (i) redundant levels of indirection (address translation), (ii) duplicate efforts to reclaim the invalid blocks (i.e., segment cleaning in the file system and garbage collection in the storage device), and (iii) excessive over-provisioning (i.e., separate over-provisioning areas in each layer). OrcFS eliminates these redundancies via distributing the address translation, segment cleaning (or garbage collection), bad block management, and wear-leveling across the layers. Existing solutions suffer from high segment cleaning overhead and cause significant write amplification due to mismatch between the file system block size and the Flash page size. To optimize the I/O stack while avoiding these problems, OrcFS adopts three key technical elements. First, OrcFS uses disaggregate mapping , whereby it partitions the Flash storage into two areas, managed by a file system and Flash storage, respectively, with different granularity. In OrcFS, the metadata area and data area are maintained by 4Kbyte page granularity and 256Mbyte superblock granularity. The superblock-based storage management aligns the file system section size, which is a unit of segment cleaning, with the superblock size of the underlying Flash storage. It can fully exploit the internal parallelism of the underlying Flash storage, exploiting the sequential workload characteristics of the log-structured file system. Second, OrcFS adopts quasi-preemptive segment cleaning to prohibit the foreground I/O operation from being interfered with by segment cleaning. The latency to reclaim the free space can be prohibitive in OrcFS due to its large file system section size, 256Mbyte. OrcFS effectively addresses this issue via adopting a polling-based segment cleaning scheme. Third, the OrcFS introduces block patching to avoid unnecessary write amplification in the partial page program. OrcFS is the enhancement of the F2FS file system. We develop a prototype OrcFS based on F2FS and server class SSD with modified firmware (Samsung 843TN). OrcFS reduces the device mapping table requirement to 1/465 and 1/4 compared with the page mapping and the smallest mapping scheme known to the public, respectively. Via eliminating the redundancy in the segment cleaning and garbage collection, the OrcFS reduces 1/3 of the write volume under heavy random write workload. OrcFS achieves 56% performance gain against EXT4 in varmail workload.

Funder

Ministry of ScienceICT8Future Plannin

ICT R8D program of MSIP/IITP

NRF

Ministry of Science ICT8Future Planning under the ITRC support program

BK21 plus program

Ministry of Education of Korea

Basic Research Lab Program

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture

Reference62 articles.

Cited by 6 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. CoDiscard: A revenue model based cross-layer cooperative discarding mechanism for flash memory devices;Journal of Systems Architecture;2022-07

2. ELOFS: An Extensible Low-overhead Flash File System for Resource-scarce Embedded Devices;IEEE Transactions on Computers;2022

3. SaS;Proceedings of the VLDB Endowment;2021-05

4. Two-tier garbage collection for persistent object;Proceedings of the 35th Annual ACM Symposium on Applied Computing;2020-03-29

5. A Highly Reliable Metadata Service for Large-Scale Distributed File Systems;IEEE Transactions on Parallel and Distributed Systems;2020-02-01

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3