Distributed storage data recovery

US10025512B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10025512-B2
Application number: US-201415305517-A
Country: US
Kind code: B2
Filing date: Jun 17, 2014
Priority date: Jun 17, 2014
Publication date: Jul 17, 2018
Grant date: Jul 17, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Processing data in a distributed data storage system generates a sparse check matrix correlating data elements to data syndromes. The system receives notification of a failed node in the distributed data storage system, accesses the sparse check matrix, and determines from the sparse check matrix a correlation between a data element and a syndrome. The system processes a logical operation on the data element and the syndrome and recovers the failed node.
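
The recovery flow in the abstract can be illustrated with a minimal sketch. It assumes a binary check matrix and XOR as the logical operation (as in claim 9). The matrix `H`, the sample data, and the helper names `xor_blocks`, `make_syndromes`, and `recover` are hypothetical and do not come from the patent.

```python
from functools import reduce

# Hypothetical example data: six data elements, each a 4-byte block.
data = [bytes([v] * 4) for v in (0x11, 0x22, 0x33, 0x44, 0x55, 0x66)]

# Sparse check matrix: one row per syndrome, one column per data element.
# H[s][d] == 1 means data element d was used to generate syndrome s.
H = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 1, 0, 0],
]

def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def make_syndromes(H, data):
    """Each syndrome is the XOR of the data elements marked in its row."""
    return [xor_blocks([data[d] for d, bit in enumerate(row) if bit]) for row in H]

def recover(H, data, syndromes, failed):
    """Rebuild data[failed]; lost elements are represented as None in data."""
    for s, row in enumerate(H):
        if not row[failed]:
            continue  # this syndrome is not correlated with the failed element
        peers = [d for d, bit in enumerate(row) if bit and d != failed]
        if all(data[d] is not None for d in peers):
            # XOR of the syndrome with every surviving peer leaves the lost element.
            return xor_blocks([syndromes[s]] + [data[d] for d in peers])
    raise ValueError("no syndrome has all of its other data elements available")

syndromes = make_syndromes(H, data)
survivors = list(data)
survivors[3] = None  # simulate losing the node that held element 3
restored = recover(H, survivors, syndromes, failed=3)
```

Because each row marks only a few elements, a repair reads only the peers of one syndrome rather than every surviving element.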

First claim

Opening claim text (preview).

What is claimed is:

1. A method of processing data in a distributed data storage system, comprising: generating a sparse check matrix specifying correlations between data elements stored in the distributed data storage system and syndromes, wherein, for each of the syndromes, the respective syndrome is correlated with a partial subset of the data elements, the partial subset comprising those of the data elements that were used to generate the respective syndrome; in response to receiving notification of a failed node in the distributed data storage system: accessing the sparse check matrix; identifying from the sparse check matrix one of the syndromes with which a first data element on the failed node is correlated; identifying from the sparse check matrix a second data element that is correlated with the identified syndrome; recovering the first data element by performing a logical operation on the second data element and the identified syndrome.

2. The method according to claim 1, wherein the second data element, the first data element, and the identified syndrome are stored within a given fault zone, and the method further comprising locally reconstructing the failed node at the given fault zone.

3. The method according to claim 1, wherein the second data element and the first data element are stored within different fault zones.

4. The method according to claim 1, wherein the second data element and the first data element are stored at different geographical locations.

5. The method according to claim 1, wherein generating the sparse check matrix comprises calling a progressive edge growth algorithm.

6. The method according to claim 1, wherein the sparse check matrix comprises a permutated sparse check matrix.

7. The method according to claim 1, wherein the sparse check matrix comprises a transformed sparse check matrix.

8. The method according to claim 1, wherein the sparse check matrix comprises a non-binary sparse check matrix.

9. The method according to claim 1, wherein the logical operation is an XOR operator.

10. A computer system comprising: a processor; a non-transitory storage medium; and disk management instructions stored in the non-transitory storage medium that are to, when executed by the processor, cause the processor to: generate a sparse check matrix specifying correlations between data elements stored in a plurality of storage devices and syndromes, wherein, for each of the syndromes, the respective syndrome is correlated with a partial subset of the data elements, the partial subset comprising those of the data elements that were used to generate the respective syndrome; and in response to one of the plurality of storage devices failing: identify from the sparse check matrix one of the syndromes to which a first data element on the failed storage device is correlated; identify from the sparse check matrix a second data element that is correlated with the identified syndrome; perform a logical operation on the second data element and the identified syndrome to recover the first data element.

11. The system according to claim 10, wherein the data elements are stored in a plurality of fault zones, the second data element and the first data element are stored within a given fault zone of the plurality of fault zones, and the instructions are to cause the processor to recover the failed node by locally reconstructing the failed node at the given fault zone.

12. The system according to claim 10, wherein the data elements are stored in a plurality of fault zones, and the second data element and the first data element are stored within different ones of the plurality of fault zones.

13. The system according to claim 10, wherein the second data element and the first data element are stored at different geographical locations.

14. The system according to claim 10, wherein the instructions are to cause the processor to generate the sparse check matrix by calling a progressive edge growth algorithm.

15. The system of claim 10, wherein, for each of the syndromes, at least half of the data elements are not correlated with the respective syndrome in the sparse check matrix.

16. The system of claim 10, wherein the data elements are stored in a plurality of first fault zones, and wherein the sparse check matrix specifies correlations for: first syndromes that correspond respectively to the first fault zones, and second syndromes that each are correlated with two of the data elements, wherein each of the first syndromes is stored in its corresponding first fault zone and is correlated with each of the data elements that is stored in its corresponding first fault zone; and wherein, for each of the second syndromes, the respective second syndrome and each of the data elements correlated with it are stored in different fault zones.

17. The system of claim 16, wherein, for each of the first syndromes, the respective first syndrome is not correlated to any of the data elements that are not stored in the corresponding first fault zone.

18. The system of claim 16, wherein, for each of the second syndromes, the respective second syndrome is stored in a second fault zone that does not store any of the data elements.

19. The system of claim 16, wherein for each of the second syndromes, the respective second syndrome is correlated to exactly two of the data elements.

20. A non-transitory computer readable storage medium that stores a computer program for processing data in a distributed data storage system, said computer program comprising a set of instructions to: generate a sparse check matrix specifying correlations between data elements stored in a plurality of storage devices and syndromes, wherein, for each of the syndromes, the respective syndrome is correlated with a partial subset of the data elements, the partial subset comprising those of the data elements that were used to generate the respective syndrome; and in response to one of the plurality of storage devices failing: identify from the sparse check matrix one of the syndromes with which a first data element on the failed storage device is correlated; identify from the sparse check matrix a second data element that is correlated with the identified syndrome; perform a logical operation on the second data element and the identified syndrome to recover the first data element.
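
The fault-zone layout described in claims 15 to 19 can be sketched as follows. This is one possible reading, not the patent's implementation: the zone names, the placement `zone_of`, the chosen cross-zone pairs, and the helpers `build_rows`, `is_sparse`, and `pick_syndrome` are all hypothetical. The sketch further assumes that the two data elements of a second syndrome sit in different zones from each other, and that a repair prefers a syndrome stored in the failed element's own zone.

```python
# Hypothetical placement: data element index -> fault zone that stores it.
zone_of = {0: "A", 1: "A", 2: "B", 3: "B", 4: "C", 5: "C"}

def build_rows(zone_of, cross_pairs, syndrome_zone="S"):
    """Return (rows, homes): rows[i] is the set of data elements correlated
    with syndrome i, and homes[i] is the fault zone that stores syndrome i."""
    if syndrome_zone in zone_of.values():
        raise ValueError("second syndromes need a zone that stores no data elements")
    rows, homes = [], []
    # First syndromes: one per fault zone, covering exactly that zone's elements.
    for zone in sorted(set(zone_of.values())):
        rows.append({d for d, z in zone_of.items() if z == zone})
        homes.append(zone)
    # Second syndromes: exactly two data elements, kept in a separate zone.
    for a, b in cross_pairs:
        if zone_of[a] == zone_of[b]:
            raise ValueError("paired elements are assumed to be in different zones")
        rows.append({a, b})
        homes.append(syndrome_zone)
    return rows, homes

def is_sparse(rows, n_elements):
    """Claim 15 style check: every syndrome skips at least half the elements."""
    return all(2 * len(row) <= n_elements for row in rows)

def pick_syndrome(rows, homes, zone_of, failed, lost=frozenset()):
    """Pick a usable syndrome for `failed`, preferring one in its own zone.
    Returns a row index, or None if every candidate depends on a lost element."""
    candidates = [i for i, row in enumerate(rows)
                  if failed in row and not (row - {failed}) & set(lost)]
    if not candidates:
        return None
    # False sorts before True, so a local syndrome wins when one is usable.
    return min(candidates, key=lambda i: homes[i] != zone_of[failed])

rows, homes = build_rows(zone_of, [(0, 2), (1, 4), (3, 5)])
local_choice = pick_syndrome(rows, homes, zone_of, failed=0)
remote_choice = pick_syndrome(rows, homes, zone_of, failed=0, lost={1})
```

With this placement, losing element 0 alone is repaired from the zone A syndrome, while losing elements 0 and 1 together falls back to the two-element syndrome that pairs element 0 with element 2 in zone B.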

Assignees

Hewlett Packard Entpr Dev Lp

Inventors

Classifications

  • Matrix operations, especially for generator matrices or check matrices, e.g. column or row permutations · CPC title

  • G06F3/0619 · Primary

    in relation to data integrity, e.g. data losses, bit errors · CPC title

  • Management of blocks · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • Reconstruction on already foreseen single or plurality of spare disks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10025512B2 cover?
Processing data in a distributed data storage system generates a sparse check matrix correlating data elements to data syndromes. The system receives notification of a failed node in the distributed data storage system, accesses the sparse check matrix, and determines from the sparse check matrix a correlation between a data element and a syndrome. The system processes a logical operation on the data element and the syndrome and recovers the failed node.
Who is the assignee on this patent?
Hewlett Packard Entpr Dev Lp
What technology area does this patent fall under?
Primary CPC classification G06F3/0619. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 17, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).