Method and system for reliably forecasting storage disk failure

US11599402B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11599402-B2
Application number: US-201916529499-A
Country: US
Kind code: B2
Filing date: Aug 1, 2019
Priority date: Aug 1, 2019
Publication date: Mar 7, 2023
Grant date: Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for reliably forecasting storage disk failure. Specifically, the method and system disclosed herein entail predicting whether one or more storage disks may fail within a future time period. Further, the storage disk failure forecasts may rely on machine learning classification coupled with prediction reliability scoring.
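The abstract pairs classification with prediction reliability scoring, and the claims name inductive conformal prediction (ICP) as one such scoring framework. The following is a minimal sketch of how confidence and credibility are commonly defined in conformal prediction, not the patent's implementation. The nonconformity measure (one minus the predicted class probability), the calibration scores, the disk names, and the probabilities are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def p_value(calibration_scores: List[float], test_score: float) -> float:
    """Share of nonconformity scores at least as large as the test score.

    The test example itself is counted, hence the +1 in both terms.
    """
    at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least + 1) / (len(calibration_scores) + 1)


def confidence_credibility(p_values: Dict[str, float]) -> Tuple[str, float, float]:
    """Return (predicted label, confidence, credibility).

    Credibility is the largest p-value; confidence is one minus the
    second-largest p-value (the usual conformal prediction definitions).
    """
    ordered = sorted(p_values.items(), key=lambda kv: kv[1], reverse=True)
    label, largest = ordered[0]
    second = ordered[1][1] if len(ordered) > 1 else 0.0
    return label, 1.0 - second, largest


def rank_forecasts(
    calibration_scores: List[float],
    class_probabilities: Dict[str, Dict[str, float]],
) -> List[Tuple[str, str, float, float]]:
    """Score each disk forecast and sort the most reliable forecasts first."""
    results = []
    for disk_id, probs in class_probabilities.items():
        # Assumed nonconformity measure: 1 - predicted probability of the label.
        p_vals = {
            label: p_value(calibration_scores, 1.0 - prob)
            for label, prob in probs.items()
        }
        label, confidence, credibility = confidence_credibility(p_vals)
        results.append((disk_id, label, confidence, credibility))
    results.sort(key=lambda r: (r[2], r[3]), reverse=True)
    return results


# Hypothetical nonconformity scores from a held-out calibration set.
calibration = [0.05, 0.15, 0.25, 0.35, 0.65]

# Hypothetical classifier outputs for three disks.
forecast_probabilities = {
    "disk-a": {"failed": 0.9, "healthy": 0.1},
    "disk-b": {"failed": 0.7, "healthy": 0.3},
    "disk-c": {"failed": 0.2, "healthy": 0.8},
}

ranked = rank_forecasts(calibration, forecast_probabilities)
for disk_id, label, confidence, credibility in ranked:
    print(f"{disk_id}: {label} (confidence={confidence:.2f}, credibility={credibility:.2f})")
```

A ranking like this lets an operator act first on the forecasts the scoring step considers most trustworthy. A low credibility value flags a forecast whose input resembles little of the calibration data.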

First claim

Opening claim text (preview).

What is claimed is:

1. A method for forecasting storage disk failure, comprising: obtaining, from an auto-support database, a raw dataset comprising a first set of data tuples, each comprising a feature set and a disk health class, the data tuples include SMART data and SCSI error codes for a plurality of different physical storage disks that have been collected over a preset amount of time; reducing the raw dataset to a select dataset comprising a second set of data tuples, each comprising a feature subset of the feature set and the disk health class; inputting a set of missing data values in the select dataset to obtain the select-gapless dataset comprising a gapless version of the second set of data tuples; initializing a classification learning model; applying incremental learning to the classification learning model using the select-gapless dataset to obtain a set of disk failure forecasts for a set of storage disks; and performing a proactive response based on the set of disk failure forecasts, wherein the proactive response comprises replacing at least one disk from the set of storage disks.

2. The method of claim 1, further comprising: prior to reducing the raw dataset to the select dataset: identifying the feature subset of the feature set using a set of feature selection algorithms, wherein the feature subset comprises features commonly selected by the set of feature selection algorithms, wherein the raw dataset is reduced based on the feature subset.

3. The method of claim 2, wherein the set of feature selection algorithms comprises an extreme gradient boosting (XGB) algorithm, a light gradient boosting model (LGBM) algorithm, an extra tree algorithm, a decision tree algorithm, a gradient boost algorithm, an adaptive boosting (AdaBoost) algorithm, and a random forest algorithm.

4. The method of claim 1, wherein the set of missing data values is imputed using median substitution.

5. The method of claim 1, wherein the classification learning model is a stochastic gradient descent classifier.

6. The method of claim 1, wherein the proactive response further comprises alerting a storage system administrator.

7. The method of claim 1, further comprising: prior to performing the proactive response: applying a prediction reliability algorithm to the set of disk failure forecasts to obtain a set of confidence-credibility scores; and ranking the set of disk failure forecasts based on the set of confidence-credibility scores to obtain a ranked set of disk failure forecasts, wherein the proactive response is performed further based on the ranked set of disk failure forecasts.

8. The method of claim 7, wherein the prediction reliability algorithm is an inductive conformal prediction (ICP) framework.

9. A system, comprising: an auto-support database operatively connected to a disk failure forecasting service, the disk failure forecasting service comprising a computer processor configured to: obtain, from an auto-support database, a raw dataset comprising a first set of data tuples, each comprising a feature set and a disk health class, the data tuples include SMART data and SCSI error codes for a plurality of different physical storage disks that have been collected over a preset amount of time; reduce the raw dataset to a select dataset comprising a second set of data tuples, each comprising a feature subset of the feature set and the disk health class; input a set of missing data values in the select dataset to obtain the select-gapless dataset comprising a gapless version of the second set of data tuples; initialize a classification learning model; apply incremental learning to the classification learning model using the select-gapless dataset to obtain a set of disk failure forecasts for a set of storage disks; and perform a proactive response based on the set of disk failure forecasts, wherein the proactive response comprises replacing at least one disk from the set of storage disks.

10. The system of claim 9, further comprising: a storage system operatively connected to the auto-support database, and comprising a plurality of storage disks, wherein the raw dataset comprises historical configuration and performance information for the plurality of storage disks.

11. The system of claim 9, further comprising: the sales client, wherein the sales client is operatively connected to the disk failure forecasting service.

12. The system of claim 9, further comprising: an admin client operatively connected to the disk failure forecasting service, wherein the proactive response comprises issuing an alert to the admin client.

13. A non-transitory computer readable medium (CRM) comprising computer readable program code, which when executed by a computer processor, enables the computer processor to: obtain, from an auto-support database, a raw dataset comprising a first set of data tuples, each comprising a feature set and a disk health class, the data tuples include SMART data and SCSI error codes for a plurality of different physical storage disks that have been collected over a preset amount of time; reduce the raw dataset to a select dataset comprising a second set of data tuples, each comprising a feature subset of the feature set and the disk health class; input a set of missing data values in the select dataset to obtain the select-gapless dataset comprising a gapless version of the second set of data tuples; initialize a classification learning model; apply incremental learning to the classification learning model using the select-gapless dataset to obtain a set of disk failure forecasts for a set of storage disks; and perform a proactive response based on the set of disk failure forecasts, wherein the proactive response comprises replacing at least one disk from the set of storage disks.

14. The non-transitory CRM of claim 13, further comprising computer readable program code, which when executed by the computer processor, enables the computer processor to reduce the raw dataset to the select dataset, by: identifying the feature subset of the feature set using a set of feature selection algorithms; and reducing the raw dataset based on the feature subset, wherein the feature subset comprises features commonly selected by the set of feature selection algorithms.

15. The non-transitory CRM of claim 13, wherein the classification learning model is a stochastic gradient descent classifier.

16. The non-transitory CRM of claim 13, further comprising computer readable program code, which when executed by the computer processor, enables the computer processor, prior to performing the proactive response, to: apply a prediction reliability algorithm to the set of disk failure forecasts to obtain a set of confidence-credibility scores; and rank the set of disk failure forecasts based on the set of confidence-credibility scores to obtain a ranked set of disk failure forecasts, wherein the proactive response is performed further based on the ranked set of disk failure forecasts.
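The data-preparation and learning steps recited in the claims (keeping commonly selected features, median substitution for missing values, and incremental learning with a stochastic gradient descent classifier) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented implementation: the feature names, the training batches, the logistic loss, and the learning rate are all hypothetical, and the per-algorithm feature selections are hard-coded rather than produced by the tree-based algorithms the claims list.

```python
import math
import random
from statistics import median
from typing import List, Optional, Sequence, Set

Row = List[Optional[float]]


def common_features(selections: Sequence[Set[str]]) -> Set[str]:
    """Keep only the features that every selection algorithm picked."""
    return set.intersection(*selections)


def impute_median(rows: Sequence[Row]) -> List[List[float]]:
    """Replace each missing value (None) with its column's observed median."""
    columns = list(zip(*rows))
    medians = [median(v for v in col if v is not None) for col in columns]
    return [[medians[j] if v is None else v for j, v in enumerate(row)] for row in rows]


class SGDLogisticClassifier:
    """Binary classifier trained by stochastic gradient descent on logistic loss."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x: Sequence[float]) -> float:
        z = self.bias + sum(w * v for w, v in zip(self.weights, x))
        if z >= 0:  # numerically stable sigmoid
            return 1.0 / (1.0 + math.exp(-z))
        e = math.exp(z)
        return e / (1.0 + e)

    def predict(self, x: Sequence[float]) -> int:
        return 1 if self.predict_proba(x) >= 0.5 else 0

    def partial_fit(self, rows, labels, epochs: int = 1, seed: int = 0) -> None:
        """Update the existing weights from a new batch without retraining from scratch."""
        rng = random.Random(seed)
        order = list(range(len(rows)))
        for _ in range(epochs):
            rng.shuffle(order)
            for i in order:
                error = self.predict_proba(rows[i]) - labels[i]
                for j, v in enumerate(rows[i]):
                    self.weights[j] -= self.learning_rate * error * v
                self.bias -= self.learning_rate * error


# Hypothetical output of three feature selection algorithms.
selected = common_features([
    {"reallocated_sectors", "medium_errors", "power_on_hours"},
    {"reallocated_sectors", "medium_errors"},
    {"medium_errors", "reallocated_sectors", "temperature"},
])

# Hypothetical batches: columns are [reallocated_sectors, medium_errors],
# pre-scaled to [0, 1]; None marks a missing value; label 1 means "failed".
batch_1 = [[0.9, 0.8], [0.1, None], [0.8, None], [0.0, 0.1]]
labels_1 = [1, 0, 1, 0]
batch_2 = [[None, 0.9], [0.2, 0.0], [0.7, 0.7], [0.1, 0.2]]
labels_2 = [1, 0, 1, 0]

model = SGDLogisticClassifier(n_features=2)
for rows, labels in [(batch_1, labels_1), (batch_2, labels_2)]:
    model.partial_fit(impute_median(rows), labels, epochs=200)

forecasts = {
    name: model.predict(x)
    for name, x in {"disk-x": [0.95, 0.9], "disk-y": [0.05, 0.05]}.items()
}
print(sorted(selected), forecasts)
```

The second call to `partial_fit` continues from the weights learned on the first batch, which is the sense in which the learning is incremental. Taking the intersection of the selections is one straightforward reading of "features commonly selected"; a voting threshold would be another.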

Assignees

Inventors

Classifications

  • in a storage system, e.g. in a DASD or network based storage system (drivers for digital recording or reproducing units G06F3/06; circuits for error detection or correction within digital recording or reproducing units G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title

  • Machine learning · CPC title

  • Reliability or availability analysis · CPC title

  • G06F3/0616 · Primary

    in relation to life time, e.g. increasing Mean Time Between Failures [MTBF] · CPC title

  • Disk arrays, e.g. RAID, JBOD · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11599402B2 cover?
A method and system for reliably forecasting storage disk failure. Specifically, the method and system disclosed herein entail predicting whether one or more storage disks may fail within a future time period. Further, the storage disk failure forecasts may rely on machine learning classification coupled with prediction reliability scoring.
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/0727. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).