Affiliation:
1. University of Wisconsin, Madison, WI
2. Microsoft Research, Redmond, WA
Abstract
For five years, we collected annual snapshots of file-system metadata from over 60,000 Windows PC file systems in a large corporation. In this article, we use these snapshots to study temporal changes in file size, file age, file-type frequency, directory size, namespace structure, file-system population, storage capacity and consumption, and degree of file modification. We present a generative model that explains the namespace structure and the distribution of directory sizes. We find significant temporal trends relating to the popularity of certain file types, the origin of file content, the way the namespace is used, and the degree of variation among file systems, as well as more pedestrian changes in size and capacities. We give examples of consequent lessons for designers of file systems and related software.
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture
Cited by
111 articles.
1. An empirical study of challenges in machine learning asset management;Empirical Software Engineering;2024-06-15
2. Exploiting Flat Namespace to Improve File System Metadata Performance on Ultra-Fast, Byte-Addressable NVMs;ACM Transactions on Storage;2024-01-30
3. DHIFS: A Dynamic and Hybrid Index Method with Low Memory Overhead and Efficient File Access;2023 IEEE International Conference on High Performance Computing & Communications, Data Science & Systems, Smart City & Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys);2023-12-17
4. Low-Latency and Scalable Full-path Indexing Metadata Service for Distributed File Systems;2023 IEEE 41st International Conference on Computer Design (ICCD);2023-11-06
5. The State of the Art of Metadata Managements in Large-Scale Distributed File Systems — Scalability, Performance and Availability;IEEE Transactions on Parallel and Distributed Systems;2022-12-01