A novel SSD fault detection method using GRU-based Sparse Auto-Encoder for dimensionality reduction

Authors:

Wang Yufei¹, Dong Xiaoshe¹, Wang Longxiang¹, Chen Weiduo¹, Chen Heng¹

Affiliation:

1. Xi’an Jiaotong University, No. 28 Xianning West Road, Xi’an, Shaanxi, China

Abstract

In recent years, with the development of flash memory technology, storage systems in large data centers have typically been built upon thousands or even millions of solid-state drives (SSDs). At that scale, SSD failures are inevitable. An SSD failure may cause unrecoverable data loss or make system services unavailable, with catastrophic results. Active fault detection technologies can detect device problems in advance, so they are gaining popularity. Recent work has turned toward applying AI algorithms to SSD SMART data for fault detection. However, the SMART data of new SSDs contains a large number of features, and this high dimensionality results in poor fault detection accuracy for AI algorithms. To tackle this problem, we improve the structure of the traditional Auto-Encoder (AE) using the Gated Recurrent Unit (GRU) and, by exploiting the temporal characteristics of SSD SMART data, propose GAL, an SSD fault detection method based on dimensionality reduction with a GRU sparse autoencoder (GRUAE). The proposed method first trains the GRUAE model on SSD SMART data and then uses the encoder of the GRUAE model as the dimensionality reduction tool for the original high-dimensional SSD SMART data. This aims to reduce the influence of noise features in the original SSD SMART data and to highlight the features more relevant to the data characteristics, thereby improving the accuracy of fault detection. Finally, Long Short-Term Memory (LSTM) is adopted for fault detection on the low-dimensional SSD SMART data. Experimental results on a real SSD dataset from Alibaba show that the fault detection accuracy of various AI algorithms improves to varying degrees after dimensionality reduction with the proposed method, and that GAL performs best among all methods.
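The dimensionality-reduction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no layer sizes, window length, loss, or training procedure, so the class name `GRUEncoder`, the 20-feature and 7-step window, the 4-dimensional code, the L1 form of the sparsity penalty, and the untrained random weights are all assumptions made for illustration. The sketch shows only the data flow, in which a time window of SMART readings is folded by a GRU into a short code vector that a downstream classifier such as an LSTM would consume.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUEncoder:
    """Hypothetical single-layer GRU encoder (biases omitted, weights untrained)."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # One input matrix W and one recurrent matrix U per gate:
        # "z" = update gate, "r" = reset gate, "n" = candidate state.
        self.W = {g: rand_matrix(n_hidden, n_features, rng) for g in "zrn"}
        self.U = {g: rand_matrix(n_hidden, n_hidden, rng) for g in "zrn"}

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        # The reset gate scales the previous state before the recurrent product.
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # The update gate blends the candidate state with the previous state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def encode(self, window):
        # window: list of time steps, each a list of n_features SMART values.
        h = [0.0] * self.n_hidden
        for x in window:
            h = self.step(x, h)
        return h  # the final hidden state serves as the low-dimensional code


def l1_sparsity_penalty(code, weight=1e-3):
    # Assumed sparsity term; the abstract does not say which penalty is used.
    return weight * sum(abs(c) for c in code)


# Usage with made-up data: 7 time steps of 20 normalized SMART attributes.
rng = random.Random(1)
window = [[rng.random() for _ in range(20)] for _ in range(7)]
encoder = GRUEncoder(n_features=20, n_hidden=4)
code = encoder.encode(window)
print(len(code))  # 4: the window is reduced to a 4-dimensional code
```

In a trained autoencoder, the encoder weights would be learned jointly with a decoder by minimizing reconstruction error plus the sparsity term, and only the encoder would be kept afterwards to produce the inputs for the fault detector.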

Publisher

IOS Press

Subject

Artificial Intelligence, General Engineering, Statistics and Probability

