US9229796B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9229796-B1 |
| Application number | US-201314037199-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 25, 2013 |
| Priority date | Sep 25, 2013 |
| Publication date | Jan 5, 2016 |
| Grant date | Jan 5, 2016 |
Official abstract text for this publication.
Techniques for determining a disk failure indicator for predicting disk failures are described herein. According to one embodiment, diagnostic parameters are received which are collected from a set of known working disks and a set of known failed disks of a storage system. For each of the diagnostic parameters, a first quantile distribution representation is generated for the set of known working disks, and a second quantile distribution representation is generated for the set of known failed disks. The first quantile distribution representation and the second quantile distribution representation of each of the diagnostic parameters are then compared to select one or more of the diagnostic parameters as one or more disk failure indicators for predicting future disk failures.
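To make the "quantile distribution representation" in the abstract concrete, the following is a minimal sketch for a single diagnostic parameter. It assumes deciles as the representation and uses Python's standard `statistics.quantiles`; the patent does not prescribe a particular quantile count or library. The sample values and the names `quantile_representation`, `working_rs`, and `failed_rs` are hypothetical.

```python
from statistics import quantiles


def quantile_representation(values, n=10):
    """Return the n-1 cut points that split `values` into n equal-probability groups.

    Assumption: deciles (n=10) with the "inclusive" method; the source
    does not specify how the quantiles are computed.
    """
    return quantiles(values, n=n, method="inclusive")


# Hypothetical reallocated-sector counts, one value per disk (illustrative only).
working_rs = [0, 0, 0, 0, 1, 0, 2, 0, 0, 3]
failed_rs = [0, 12, 40, 85, 130, 7, 260, 55, 19, 400]

working_q = quantile_representation(working_rs)
failed_q = quantile_representation(failed_rs)

# Print the two representations side by side, one row per decile.
for decile, (w, f) in enumerate(zip(working_q, failed_q), start=1):
    print(f"decile {decile}: working={w:.1f} failed={f:.1f}")
```

A parameter whose two representations diverge sharply, as in this made-up data, is the kind of candidate the abstract describes selecting as a failure indicator.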
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method of determining a disk failure indicator for predicting disk failures, the method comprising: receiving diagnostic parameters collected from a set of known working disks and a set of known failed disks of a storage system; for each of the diagnostic parameters, generating a first quantile distribution representation for the set of known working disks, and generating a second quantile distribution representation for the set of known failed disks; and comparing the first quantile distribution representation and the second quantile distribution representation of each of the diagnostic parameters to select one or more of the diagnostic parameters as one or more disk failure indicators for predicting future disk failures.

2. The method of claim 1, wherein the diagnostic parameters comprise at least one of reallocated sector count, medium error, timeout, pending sector count, uncorrectable sector count, connection error, and data error of the working and failed disks.

3. The method of claim 1, wherein the selected diagnostic parameters comprise a reallocated sector count.

4. The method of claim 1, wherein comparing the first quantile distribution representation and the second quantile distribution representation comprises: identifying a maximum difference value between the first and second quantile distribution representations for each of the diagnostic parameters; and selecting one of the diagnostic parameters that has a largest maximum difference value amongst the maximum difference values of all diagnostic parameters as the disk failure indicator.

5. The method of claim 1, wherein generating the first and second quantile distribution representation for each of diagnostic parameters comprise: storing values of the diagnostic parameter of the working disks in a first array and sorting data members of the first array in a predetermined order; storing values of the diagnostic parameter of the failed disks in a second array and sorting data members of the second array in the predetermined order; and plotting the first and second arrays in a first and second curves against a set of fixed intervals representing a number of working disks or failed disks.

6. The method of claim 5, wherein comparing the first quantile distribution representation and the second quantile distribution representation comprises: for each of diagnostic parameters, identifying a maximum difference value between the first and second curves; and selecting one of the diagnostic parameters, as a disk failure indicator, that has a largest maximum difference value amongst the maximum difference values of all diagnostic parameters.

7. The method of claim 1, wherein the storage system is a deduplicated backup storage system.

8. A non-transitory machine-readable medium having instructions stored therein, which when executed by a processor, cause the processor to perform a method of determining a disk failure indicator for predicting disk failures, the method comprising: receiving diagnostic parameters collected from a set of known working disks and a set of known failed disks of a storage system; for each of the diagnostic parameters, generating a first quantile distribution representation for the set of known working disks, and generating a second quantile distribution representation for the set of known failed disks; and comparing the first quantile distribution representation and the second quantile distribution representation of each of the diagnostic parameters to select one or more of the diagnostic parameters as one or more disk failure indicators for predicting future disk failures.

9. The non-transitory machine-readable medium of claim 8, wherein the diagnostic parameters comprise at least one of reallocated sector count, medium error, timeout, pending sector count, uncorrectable sector count, connection error, and data error of the working and failed disks.

10. The non-transitory machine-readable medium of claim 8, wherein the selected diagnostic parameters comprise a reallocated sector count.

11. The non-transitory machine-readable medium of claim 8, wherein comparing the first quantile distribution representation and the second quantile distribution representation comprises: identifying a maximum difference value between the first and second quantile distribution representations for each of the diagnostic parameters; and selecting one of the diagnostic parameters that has a largest maximum difference value amongst the maximum difference values of all diagnostic parameters as the disk failure indicator.

12. The non-transitory machine-readable medium of claim 8, wherein generating the first and second quantile distribution representation for each of diagnostic parameters comprise: storing values of the diagnostic parameter of the working disks in a first array and sorting data members of the first array in a predetermined order; storing values of the diagnostic parameter of the failed disks in a second array and sorting data members of the second array in the predetermined order; and plotting the first and second arrays in a first and second curves against a set of fixed intervals representing a number of working disks or failed disks.

13. The non-transitory machine-readable medium of claim 12, wherein comparing the first quantile distribution representation and the second quantile distribution representation comprises: for each of diagnostic parameters, identifying a maximum difference value between the first and second curves; and selecting one of the diagnostic parameters, as a disk failure indicator, that has a largest maximum difference value amongst the maximum difference values of all diagnostic parameters.

14. The non-transitory machine-readable medium of claim 8, wherein the storage system is a deduplicated backup storage system.

15. A data processing system, comprising: a processor; and a memory storing instructions, which when executed from the memory, cause the processor to perform a method, the method including receiving diagnostic parameters collected from a set of known working disks and a set of known failed disks of a storage system, for each of the diagnostic parameters, generating a first quantile distribution representation for the set of known working disks, and generating a second quantile distribution representation for the set of known failed disks, and comparing the first quantile distribution representation and the second quantile distribution representation of each of the diagnostic parameters to select one or more of the diagnostic parameters as one or more disk failure indicators for predicting future disk failures.

16. The system of claim 15, wherein the diagnostic parameters comprise at least one of reallocated sector count, medium error, timeout, pending sector count, uncorrectable sector count, connection error, and data error of the working and failed disks.

17. The system of claim 15, wherein the selected diagnostic parameters comprise a reallocated sector count.

18. The system of claim 15, wherein comparing the first quantile distribution representation and the second quantile distribution representation comprises: identifying a maximum difference value between the first and second quantile distribution representations for each of the diagnostic parameters; and selecting one of the diagnostic parameters that has a largest maximum difference value amongst the maximum difference values of all diagnostic parameters as the disk failure indicator.
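The sort-and-compare procedure recited in the claims can be sketched as follows: sort each group's values, sample both sorted arrays at the same fixed intervals to form two curves, take the maximum difference between the curves per parameter, and select the parameter with the largest maximum difference. This is an illustrative reading, not the patented implementation. The function names, the choice of 100 intervals, ascending order, nearest-index sampling, and the sample data are all assumptions.

```python
def quantile_curve(values, points=100):
    """Sort values ascending and sample them at `points` evenly spaced positions.

    Assumption: nearest-index sampling stands in for "plotting against a set
    of fixed intervals"; the claims do not specify an interpolation scheme.
    """
    if not values:
        raise ValueError("values must not be empty")
    if points < 2:
        raise ValueError("points must be at least 2")
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[round(i * last / (points - 1))] for i in range(points)]


def max_difference(working, failed, points=100):
    """Largest absolute gap between the working-disk and failed-disk curves."""
    working_curve = quantile_curve(working, points)
    failed_curve = quantile_curve(failed, points)
    return max(abs(w - f) for w, f in zip(working_curve, failed_curve))


def select_indicator(working_by_param, failed_by_param, points=100):
    """Return (best_parameter, scores), where scores maps each parameter
    name to its maximum curve difference."""
    scores = {
        name: max_difference(working_by_param[name], failed_by_param[name], points)
        for name in working_by_param
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical per-disk values for two diagnostic parameters (illustrative only).
working = {
    "reallocated_sectors": [0, 0, 1, 0, 2, 0, 0, 3],
    "timeouts": [0, 1, 0, 2, 1, 0, 1, 0],
}
failed = {
    "reallocated_sectors": [5, 40, 120, 0, 300, 18, 75, 9],
    "timeouts": [1, 0, 2, 3, 1, 0, 2, 1],
}

best, scores = select_indicator(working, failed)
print(best, scores)
```

This sketch compares raw differences across parameters, following the claim wording literally. Parameters measured on very different scales would likely need normalization before such a comparison is meaningful, which the claims do not address.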
Reliability or availability analysis · CPC title
Performance evaluation by statistical analysis · CPC title
in a storage system, e.g. in a DASD or network based storage system (drivers for digital recording or reproducing units G06F3/06; circuits for error detection or correction within digital recording or reproducing units G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title
where the computing system component is a storage system, e.g. DASD based or network based (digital input from or digital output to record carriers G06F3/06; digital recording or reproducing G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title
by exceeding limits · CPC title