Affiliation:
1. Xi’an Jiaotong University, Xi’an City, Shaanxi, China
Abstract
In recent years, research on disk fault detection that combines SMART data with machine learning algorithms has proven effective. However, these methods require a large amount of data. In the early stages of establishing a data center or deploying new storage devices, reliability data for disks is limited, and data from failed disks is scarcer still, so machine learning algorithms achieve unsatisfactory detection performance.
To solve these problems, we propose a novel small sample disk fault detection (SSDFD) optimizing method based on Generative Adversarial Networks (GANs). To match the characteristics of hard disk reliability data, the generator of the original GAN is improved with Long Short-Term Memory (LSTM), making it suitable for generating failed disk data. To alleviate data imbalance and expand the failed disk dataset when little original data is available, the proposed model is trained adversarially with a focus on generating failed disk data. Experimental results on real HDD datasets show that SSDFD can generate enough virtual failed disk data for a machine learning algorithm to detect disk faults with increased accuracy when only a few original failed disk samples are available. Furthermore, a model trained with 300 original failed disk samples significantly improves the accuracy of HDD fault detection. The optimal amount of generated virtual data is 20–30 times that of the original data.
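As a rough illustration of the reported 20–30x ratio, the sketch below computes how many virtual failed disk samples that band implies for a given number of original samples. The function name and its parameters are hypothetical, and applying the ratio to the 300-sample case is an assumption made here for illustration, not a figure stated in the abstract.

```python
def synthetic_sample_range(n_original, low_ratio=20, high_ratio=30):
    """Return (min, max) generated-sample counts for a ratio band.

    Hypothetical helper: the 20-30x defaults come from the abstract,
    everything else is an illustrative assumption.
    """
    if n_original < 0:
        raise ValueError("n_original must be non-negative")
    if low_ratio > high_ratio:
        raise ValueError("low_ratio must not exceed high_ratio")
    return n_original * low_ratio, n_original * high_ratio


# Assumed example: 300 original failed disk samples.
low, high = synthetic_sample_range(300)
print(f"generate between {low} and {high} virtual samples")
```

Under these assumptions, 300 original samples would correspond to roughly 6,000 to 9,000 generated samples.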
Funder
National Key Research and Development Plan of China
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Information Systems, Software
Cited by
9 articles.