Generating recommendations for initiating recovery of a fault domain representing logical address space of a storage system

US11314580B2 · US · B2

Patent metadata
Publication number: US-11314580-B2
Application number: US-202016862740-A
Country: US
Kind code: B2
Filing date: Apr 30, 2020
Priority date: Apr 30, 2020
Publication date: Apr 26, 2022
Grant date: Apr 26, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus comprises a processing device configured to identify faults associated with a logical address space in a fault domain of a storage system, the faults specifying fault reason codes and metadata types for logical pages in the logical address space associated with the faults. The processing device is also configured to determine a fault summary characterizing impact of the faults in the fault domain of the storage system, the fault summary being based on aggregating fault scores assigned to the fault reason codes and the metadata types specified in the faults. The processing device is further configured to generate a recommendation on whether to initiate recovery of the fault domain of the storage system based on the fault summary, and to initiate recovery of the fault domain of the storage system based on the generated recommendation.
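
As a rough illustration of the flow the abstract describes (score each fault, aggregate the scores into a summary, then derive a recommendation), the Python sketch below may help. It is not the patented implementation. Every score value, reason-code name, threshold, and function name is hypothetical; only the relative ordering of the metadata-type scores follows claims 4 and 5, and the "high average, low standard deviation" rule follows the pattern recited in claim 13. Because the claim 13 preview on this page is cut off, the sketch assumes its second condition mirrors the first, and it assumes a population standard deviation.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Hypothetical scores. Only the ordering (upper page levels > leaf >
# virtual block > physical block) comes from claims 4 and 5; the numbers,
# and ranking "top" above "mid", are invented for illustration.
METADATA_SCORES = {"top": 5, "mid": 4, "leaf": 3,
                   "virtual_block": 2, "physical_block": 1}

# Hypothetical reason codes and scores; the page names none.
REASON_SCORES = {"invalid_pointer": 5, "checksum_mismatch": 4,
                 "stale_generation": 2}


@dataclass(frozen=True)
class Fault:
    metadata_type: str
    reason_code: str


@dataclass(frozen=True)
class FaultSummary:
    metadata_mean: float
    metadata_std: float
    reason_mean: float
    reason_std: float


def summarize(faults):
    """Aggregate per-fault scores into means and population standard deviations."""
    md = [METADATA_SCORES[f.metadata_type] for f in faults]
    rc = [REASON_SCORES[f.reason_code] for f in faults]
    return FaultSummary(mean(md), pstdev(md), mean(rc), pstdev(rc))


def recommend_recovery(summary, mean_threshold=3.5, std_threshold=1.0):
    """Recommend recovery when either score family is high and tightly clustered.

    Both thresholds are assumed values, shared by the two score families
    only to keep the sketch short.
    """
    md_hit = (summary.metadata_mean > mean_threshold
              and summary.metadata_std < std_threshold)
    rc_hit = (summary.reason_mean > mean_threshold
              and summary.reason_std < std_threshold)
    return md_hit or rc_hit


faults = [
    Fault("top", "invalid_pointer"),
    Fault("mid", "invalid_pointer"),
    Fault("mid", "checksum_mismatch"),
]
summary = summarize(faults)
print(summary)
print("recommend recovery:", recommend_recovery(summary))
```

With these made-up inputs, all three faults sit high in the page hierarchy and score consistently, so the sketch recommends recovery. A set of faults confined to physical blocks would fall below the assumed threshold and would not.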

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured to perform steps of: identifying a plurality of faults associated with at least a portion of a logical address space in a fault domain of a storage system, the plurality of faults specifying (i) fault reason codes and (ii) metadata types for logical pages in the logical address space associated with the plurality of faults; determining a fault summary characterizing impact of the plurality of faults in the fault domain of the storage system, the fault summary being based at least in part on aggregating fault scores assigned to the fault reason codes and the metadata types specified in the plurality of faults; generating a recommendation on whether to initiate recovery of the fault domain of the storage system based at least in part on the fault summary; and initiating recovery of the fault domain of the storage system based at least in part on the generated recommendation.

2. The apparatus of claim 1 wherein the fault domain comprises all of the logical address space of the storage system.

3. The apparatus of claim 1 wherein the logical address space is organized as a B-tree comprising a plurality of levels, the plurality of levels comprising a leaf logical page level comprising a plurality of leaf pages and one or more additional logical page levels above the leaf logical page level.

4. The apparatus of claim 3 wherein a fault score assigned to a metadata type for the leaf logical page level is lower than a fault score assigned to metadata types for the one or more additional logical page levels above the leaf page logical level.

5. The apparatus of claim 3 wherein the plurality of leaf pages in the leaf logical page level comprise pointers to virtual block addresses associated with entries in a plurality of virtual blocks in a virtual block level of the logical address space, wherein the virtual block addresses comprise pointers to physical block addresses in a plurality of physical blocks in a physical block level of the logical address space, wherein a fault score assigned to a metadata type for the virtual block level is lower than a fault score assigned to a metadata type for the leaf logical page level, and wherein a fault score assigned to a metadata type for the physical block level is lower than a fault score assigned to the metadata type for the virtual block level.

6. The apparatus of claim 3 wherein the one or more additional logical page levels comprise: a middle page level comprising a plurality of middle pages associated with respective subsets of the plurality of leaf pages in the leaf page level; and a top page level comprising one or more top pages associated with respective subsets of the plurality of middle pages in the middle page level.

7. The apparatus of claim 6 wherein a given one of the one or more top pages represents an n*m sized portion of the logical address space that references n of the plurality of middle pages in the middle page level, a given one of the n middle pages represents an m sized portion of the logical address space and references n of the plurality of leaf pages in the leaf page level, and a given one of the n leaf pages represents an m/n sized portion of the logical address space.

8. The apparatus of claim 7 wherein n is 512 and m is one gigabyte.

9. The apparatus of claim 1 wherein a given one of the plurality of faults specifies: a given metadata type for a given logical page in the logical address space that is a source of the given fault; a given fault reason code; a given fault scope characterizing at least one of potential data loss and potential logical address space loss in the storage system resulting from the given fault; a given snapshot group associated with the given logical page; and one or more storage volumes associated with the given logical page.

10. The apparatus of claim 1 wherein the fault summary comprises a set of fault summary parameters for the plurality of faults, the set of fault summary parameters comprising: at least one of a number of unique faults in the plurality of faults, a number of storage volumes impacted by the plurality of faults, and a number of snapshot groups impacted by the plurality of faults; at least one of a total amount of data made unavailable in the storage system as a result of the plurality of faults, a total amount of logical address space made unavailable as a result of the plurality of faults, per-storage volume amounts of data made unavailable in the storage system as a result of the plurality of faults, and per-snapshot group amounts of data made unavailable in the storage system as a result of the plurality of faults; and at least one of a total amount of recoverable data in the storage system, per-storage volume amounts of recoverable data in the storage system, and per-snapshot group amounts of recoverable data in the storage system.

11. The apparatus of claim 1 wherein the fault summary comprises a set of fault summary parameters for the plurality of faults, the set of fault summary parameters comprising: an average fault score for the metadata types specified in the plurality of faults; a standard deviation of the average fault score for the metadata types specified in the plurality of faults; an average fault score for the fault reason codes specified in the plurality of faults; and a standard deviation of the average fault score for the fault reason codes specified in the plurality of faults.

12. The apparatus of claim 11 wherein the fault summary comprises at least one visualization of the set of fault summary parameters, the at least one visualization comprising a plot comprising: data points for each of at least a subset of the plurality of faults, a given data point for a given fault representing the fault score for the metadata type specified in the given fault on a first axis and the fault score for the fault reason code specified in the given fault on a second axis; an additional data point representing the average fault score for the metadata types specified in the plurality of faults on the first axis and the average fault score for the fault reason codes specified in the plurality of faults on the second axis; a first visual indicator of the standard deviation of the average fault score for the metadata types specified in the plurality of faults extending from the additional data point along the first axis; a second visual indicator of the standard deviation of the average fault score for the fault reason codes specified in the plurality of faults extending from the additional data point along the second axis; and a third visual indicator representing a radius of fault scores in the first axis and the second axis that result in generating a recommendation to initiate the recovery of the fault domain of the storage system.

13. The apparatus of claim 11 wherein generating the recommendation on whether to initiate the recovery of the fault domain of the storage system comprises generating a recommendation to initiate the recovery responsive to at least one of: the average fault score for the metadata types specified in the plurality of faults being above a first designated threshold and the standard deviation of the average fault score for the metadata types specified in the plurality of faults being below a second designated threshold; and the average fault score for the fault reason codes specified in the plurality of faults being above a third designated threshold and the standard deviation of the average fault score for the…
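
The page sizes implied by claims 7 and 8 can be worked out directly. The short Python sketch below does the arithmetic for n = 512 and m = one gigabyte. It assumes "one gigabyte" means 2**30 bytes, which the claim text does not state, and the function name `page_spans` is made up for this example.

```python
# Assumption: "one gigabyte" is taken as 2**30 bytes; claim 8 does not say.
GIB = 1024 ** 3


def page_spans(n=512, m=GIB):
    """Return the bytes of logical address space covered by one page per level.

    Per claim 7: a top page covers n*m, a middle page covers m,
    and a leaf page covers m/n.
    """
    return {"top": n * m, "mid": m, "leaf": m // n}


spans = page_spans()
for level, size in spans.items():
    print(f"{level}: {size} bytes")
```

Under that assumption a top page spans 512 GiB, a middle page 1 GiB, and a leaf page 2 MiB.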

Assignees

EMC IP Holding Co LLC

Classifications

  • Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36)

  • using page tables, e.g. page table structures

  • Multiple device management, e.g. distributing data over multiple flash devices

  • Performance improvement

  • Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11314580B2 cover?
An apparatus comprises a processing device configured to identify faults associated with a logical address space in a fault domain of a storage system, the faults specifying fault reason codes and metadata types for logical pages in the logical address space associated with the faults. The processing device is also configured to determine a fault summary characterizing impact of the faults in t…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/0793. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 26, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).