Affiliation:
1. Beihang University, Beijing, China
Abstract
Virtual machine (VM) snapshots enhance system availability by saving the running state to stable storage during failure-free execution and rolling back to the snapshot point upon failure. Unfortunately, the snapshot state may be lost due to disk failures, in which case the VM cannot be recovered. Popular distributed file systems use replication to tolerate disk failures, placing redundant copies across dispersed disks. However, unless user-specific personalization is provided, these systems treat all data in a file as equally important and create identical copies of the entire file, leading to non-trivial additional storage overhead.
This paper proposes a page-aware replication system (PARS) to store VM snapshots efficiently. PARS employs VM introspection to explore how the guest uses each page, and classifies pages by their importance to system execution. If a page is critical, PARS replicates it into multiple copies to ensure high availability and long-term durability. Otherwise, losing the page does not prevent the system from working properly, so PARS saves only one copy of it. Consequently, PARS improves storage efficiency without compromising availability. We have implemented PARS to demonstrate its practicality. The experimental results show that PARS achieves 53.9% space saving compared to the native replication approach in HDFS, which replicates the whole snapshot file fully and identically.
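The selective replication idea described above can be illustrated with a minimal sketch. It is not the PARS implementation: the `Page` type, the boolean `critical` label, and the replica count of 3 for critical pages (matching the default HDFS replication factor) are assumptions made for illustration, and the 1-in-4 ratio of critical pages is arbitrary rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    pfn: int        # page frame number
    critical: bool  # hypothetical label; PARS derives importance via VM introspection


def replica_count(page: Page, critical_replicas: int = 3) -> int:
    """Critical pages get several copies; all other pages get exactly one."""
    return critical_replicas if page.critical else 1


def space_saving(pages: list[Page], critical_replicas: int = 3) -> float:
    """Fraction of storage saved versus replicating every page critical_replicas times."""
    if not pages:
        return 0.0
    selective = sum(replica_count(p, critical_replicas) for p in pages)
    full = critical_replicas * len(pages)
    return 1 - selective / full


# Toy snapshot: every fourth page is marked critical (an arbitrary ratio).
pages = [Page(pfn=i, critical=(i % 4 == 0)) for i in range(1000)]
saving = space_saving(pages)
```

Under these assumptions the saving depends only on the share of critical pages and the replica count, so the toy ratio here says nothing about the 53.9% figure reported for real snapshots.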
Funder
National Natural Science Foundation of China
China HGJ Program
China 973 Program
China 863 Program
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software