Affiliation:
1. Xi’an Jiaotong University, No. 28 Xianning West Road, Xi’an, Shaanxi, China
Abstract
In recent years, with the development of flash memory technology, storage systems in large data centers have typically been built upon thousands or even millions of solid-state drives (SSDs). At that scale, SSD failures are inevitable. An SSD failure may cause unrecoverable data loss or make system services unavailable, with catastrophic consequences. Active fault detection technologies can detect device problems in advance, so they are gaining popularity. Recent work has turned toward applying AI algorithms to SSD SMART data for fault detection. However, the SMART data of new SSDs contains a large number of features, and this high dimensionality degrades the accuracy of AI algorithms for fault detection. To tackle this problem, we improve the structure of the traditional autoencoder (AE) with the Gated Recurrent Unit (GRU) and, exploiting the temporal characteristics of SSD SMART data, propose GAL, an SSD fault detection method based on dimensionality reduction with a GRU sparse autoencoder (GRUAE). The proposed method first trains the GRUAE model on SSD SMART data, and then uses the encoder of the trained GRUAE as a dimensionality reduction tool for the original high-dimensional SSD SMART data. The aim is to reduce the influence of noisy features in the original SSD SMART data and to highlight the features most relevant to the data's characteristics, thereby improving the accuracy of fault detection. Finally, an LSTM performs fault detection on the low-dimensional SSD SMART data. Experimental results on a real SSD dataset from Alibaba show that the fault detection accuracy of various AI algorithms improves to varying degrees after dimensionality reduction with the proposed method, and that GAL performs best among all methods.
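The encoder-as-dimensionality-reducer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature count, code size, window length, and all names (`GRUCell`, `encode`, `N_FEATURES`, `CODE_DIM`, `WINDOW`) are hypothetical, the weights are random rather than trained, and biases, the decoder, the sparsity penalty, and the downstream LSTM classifier are omitted. It only shows how a GRU encoder could map each high-dimensional SMART record in a time window to a lower-dimensional code.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply matrix W (list of rows) by vector v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """A bias-free GRU cell with untrained (random) weights; illustrative only."""

    def __init__(self, input_dim, hidden_dim, rng):
        # One input-to-hidden (W) and one hidden-to-hidden (U) matrix per gate:
        # z = update gate, r = reset gate, n = candidate state.
        self.W = {g: rand_matrix(hidden_dim, input_dim, rng) for g in "zrn"}
        self.U = {g: rand_matrix(hidden_dim, hidden_dim, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        # The reset gate scales the previous state before the candidate is computed.
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in
             zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # The update gate blends the previous state with the candidate.
        return [zi * hi + (1.0 - zi) * ni for zi, hi, ni in zip(z, h, n)]


def encode(cell, window, hidden_dim):
    """Run the cell over a window of records; return one code per time step."""
    h = [0.0] * hidden_dim
    codes = []
    for x in window:
        h = cell.step(x, h)
        codes.append(h)
    return codes


# Hypothetical sizes: 40 SMART features reduced to 8, over a 7-record window.
N_FEATURES, CODE_DIM, WINDOW = 40, 8, 7

rng = random.Random(0)
cell = GRUCell(N_FEATURES, CODE_DIM, rng)
window = [[rng.random() for _ in range(N_FEATURES)] for _ in range(WINDOW)]

codes = encode(cell, window, CODE_DIM)
print(len(codes), len(codes[0]))  # 7 8
```

In a full pipeline under these assumptions, the encoder weights would come from training the autoencoder to reconstruct its input, and the resulting sequence of codes, rather than the raw SMART records, would be passed to the fault detector.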
Subject
Artificial Intelligence, General Engineering, Statistics and Probability