Efficient erasure coding of large data objects

US10346066B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10346066-B2
Application number  US-201715423080-A
Country             US
Kind code           B2
Filing date         Feb 2, 2017
Priority date       Jun 27, 2016
Publication date    Jul 9, 2019
Grant date          Jul 9, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system, computer program product, and computer-executable method for use with a distributed storage system comprising a plurality of storage nodes each having attached storage devices, the system, computer program product, and computer-executable method including receiving a request, at a first storage node of the plurality of storage nodes, to store a large portion of data, using at least one of a first type of data chunk and a plurality of a second type of data chunks to store the large portion of data, processing each of the plurality of the second type of data chunks, processing each of the at least one of the first type of data chunk, and returning an acknowledgement to the request.
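As an illustration of the abstract's split between the two chunk types, the following is a minimal sketch, not the patented implementation. It assumes that every chunk has the same fixed capacity, that each completely filled chunk is a second-type chunk, and that any remainder goes into a single first-type chunk. The names `ChunkPlan` and `plan_chunks` and the "remainder" rule are hypothetical; the patent text on this page does not spell them out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ChunkPlan:
    kind: str    # "second_type" (completely filled) or "first_type" (remainder)
    offset: int  # byte offset of this piece within the large object
    length: int  # number of bytes placed in the chunk


def plan_chunks(object_size: int, chunk_size: int) -> List[ChunkPlan]:
    """Split a large object into full second-type chunks plus one remainder.

    Assumption: the tail that cannot fill a whole chunk is stored in a
    first-type chunk. This rule is illustrative only.
    """
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size must be > 0")
    plans: List[ChunkPlan] = []
    offset = 0
    # Emit one second-type chunk for every piece that fills a chunk completely.
    while object_size - offset >= chunk_size:
        plans.append(ChunkPlan("second_type", offset, chunk_size))
        offset += chunk_size
    # Whatever is left over does not fill a chunk; plan it as first-type.
    if offset < object_size:
        plans.append(ChunkPlan("first_type", offset, object_size - offset))
    return plans
```

For example, `plan_chunks(10, 4)` plans two second-type chunks of 4 bytes each and one first-type chunk holding the last 2 bytes. Note that the claims recite "at least one" first-type chunk, whereas this sketch produces none when the object size is an exact multiple of the chunk size.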

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-executable method for use with a distributed storage system comprising a plurality of storage nodes each having attached storage devices, the computer-executable method comprising:

    receiving a request, at a first storage node of the plurality of storage nodes, to store a large portion of data;
    using at least one of a first type of data chunk and a plurality of a second type of data chunks to store the large portion of data
    processing each of the plurality of the second type of data chunks by:
        filling a first data chunk, of the plurality of the second type of data chunks, with a first portion of data from the large portion of data, wherein the first portion of data completely fills the first data chunk;
        distributing the first data chunk to one of the plurality of storage nodes;
        retaining, at the first storage node, the content of the first portion of data;
        executing, on the first storage node, an erasure encoding task to generate coded fragments using the content of the first portion of data;
        distributing the generated coded fragments to the plurality of storage nodes; and
        deleting the content of the first portion of data from the first storage node;
    processing each of the at least one of the first type of data chunk; and
    returning an acknowledgement to the request.

2. The computer-executable method of claim 1, wherein the returning an acknowledgment to the request is sent after each of the first type of data chunk and each of the second type of data chunks are protected.

3. The computer-executable method of claim 1, wherein the large portion of data is represented by at least one of a first type of data chunk and a plurality of a second type of data chunk.

4. The computer-executable method of claim 1, wherein the processing each of the at least one of the first type of data chunk comprises: mirroring each of the at least one of the first type of data chunks.

5. The computer-executable method of claim 1, wherein during the distribution of the second type of data chunk and associated generated coded fragments, each of the second type of data chunk and associated generated coded fragments are distributed to unique storage nodes of the plurality of storage nodes.

6. The computer-executable method of claim 1, further comprising: upon failure to protect any of the second type of data chunk created, returning a failure signal.

7. A system, comprising: a distributed storage system including a plurality of storage nodes each having attached storage devices; and computer-executable program logic encoded in memory of one or more computers enabled for use with the distributed storage system, wherein the computer-executable program logic is configured for the execution of:

    receiving a request, at a first storage node of the plurality of storage nodes, to store a large portion of data;
    using at least one of a first type of data chunk and a plurality of a second type of data chunks to store the large portion of data
    processing each of the plurality of the second type of data chunks by:
        filling a first data chunk, of the plurality of the second type of data chunks, with a first portion of data from the large portion of data, wherein the first portion of data completely fills the first data chunk;
        distributing the first data chunk to one of the plurality of storage nodes;
        retaining, at the first storage node, the content of the first portion of data;
        executing, on the first storage node, an erasure encoding task to generate coded fragments using the content of the first portion of data;
        distributing the generated coded fragments to the plurality of storage nodes; and
        deleting the content of the first portion of data from the first storage node;
    processing each of the at least one of the first type of data chunk; and
    returning an acknowledgement to the request.

8. The system of claim 7, wherein the returning an acknowledgment to the request is sent after each of the first type of data chunk and each of the second type of data chunks are protected.

9. The system of claim 7, wherein the large portion of data is represented by at least one of a first type of data chunk and a plurality of a second type of data chunk.

10. The system of claim 7, wherein the processing each of the at least one of the first type of data chunk comprises: mirroring each of the at least one of the first type of data chunks.

11. The system of claim 7, wherein during the distribution of the second type of data chunk and associated generated coded fragments, each of the second type of data chunk and associated generated coded fragments are distributed to unique storage nodes of the plurality of storage nodes.

12. The system of claim 7, wherein the computer-executable program logic is further configured for the execution of: upon failure to protect any of the second type of data chunk created, returning a failure signal.

13. A computer program product for use with a distributed storage system comprising a plurality of storage nodes each having attached storage devices, the computer program product comprising: a non-transitory computer readable medium encoded with computer-executable code, the code configured to enable the execution of:

    receiving a request, at a first storage node of the plurality of storage nodes, to store a large portion of data;
    using at least one of a first type of data chunk and a plurality of a second type of data chunks to store the large portion of data
    processing each of the plurality of the second type of data chunks by:
        filling a first data chunk, of the plurality of the second type of data chunks, with a first portion of data from the large portion of data, wherein the first portion of data completely fills the first data chunk;
        distributing the first data chunk to one of the plurality of storage nodes;
        retaining, at the first storage node, the content of the first portion of data;
        executing, on the first storage node, an erasure encoding task to generate coded fragments using the content of the first portion of data;
        distributing the generated coded fragments to the plurality of storage nodes; and
        deleting the content of the first portion of data from the first storage node;
    processing each of the at least one of the first type of data chunk; and
    returning an acknowledgement to the request.

14. The computer program product of claim 13, wherein the returning an acknowledgment to the request is sent after each of the first type of data chunk and each of the second type of data chunks are protected.

15. The computer program product of claim 13, wherein the large portion of data is represented by at least one of a first type of data chunk and a plurality of a second type of data chunk.

16. The computer program product of claim 13, wherein the processing each of the at least one of the first type of data chunk comprises: mirroring each of the at least one of the first type of data chunks.

17. The computer program product of claim 13, wherein during the distribution of the second type of data chunk and associated generated coded fragments, each of the second type of data chunk and associated generated coded fragments are distributed to unique storage nodes of the plurality of storage nodes.
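The per-chunk steps recited in claims 1 and 5 (distribute the full chunk, retain its content on the first node, encode there, distribute the coded fragments to distinct nodes, then delete the retained content) can be sketched roughly as follows. This is an illustrative sketch only: the `Node` class, `xor_parity`, `store_second_type_chunk`, and the key naming are hypothetical, and a single XOR parity fragment stands in for the erasure code, which the claims do not specify. A production system would more likely use a code such as Reed-Solomon that yields several coded fragments.

```python
from functools import reduce
from typing import Dict, List


class Node:
    """Hypothetical storage node: `stored` is durable, `retained` is scratch."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.stored: Dict[str, bytes] = {}
        self.retained: Dict[str, bytes] = {}


def xor_parity(chunk: bytes, k: int) -> bytes:
    """Split `chunk` into k equal data fragments and XOR them into one parity
    fragment. A stand-in for a real erasure code, not the patent's scheme."""
    if k <= 0 or len(chunk) % k != 0:
        raise ValueError("chunk length must be a positive multiple of k")
    size = len(chunk) // k
    fragments = [chunk[i * size:(i + 1) * size] for i in range(k)]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*fragments))


def store_second_type_chunk(chunk_id: str, content: bytes, first_node: Node,
                            nodes: List[Node], k: int = 4) -> None:
    """Protect one completely filled chunk, following the order in claim 1."""
    others = [n for n in nodes if n is not first_node]
    if len(others) < 2:
        raise ValueError("need two nodes besides the first node")
    # Distribute the filled chunk to one storage node.
    others[0].stored[chunk_id] = content
    # Retain the chunk content on the first (receiving) node.
    first_node.retained[chunk_id] = content
    # Run the encoding task locally, on the retained content.
    parity = xor_parity(first_node.retained[chunk_id], k)
    # Distribute the coded fragment to a node that does not hold the chunk
    # (claim 5: chunk and coded fragments land on unique nodes).
    others[1].stored[chunk_id + "/coded-0"] = parity
    # Delete the retained content from the first node.
    del first_node.retained[chunk_id]
```

The point of encoding from the retained copy is that the first node never has to read the chunk back from other nodes before encoding. In this sketch the retained copy lives only for the duration of one call.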

Assignees

EMC IP Holding Co LLC

Inventors

Classifications

  • G06F3/067 (Primary)

    Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

  • G06F3/0619 (Primary)

    in relation to data integrity, e.g. data losses, bit errors

  • Replication mechanisms

  • Reducing size or complexity of storage systems

  • Management of blocks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10346066B2 cover?
A system, computer program product, and computer-executable method for use with a distributed storage system comprising a plurality of storage nodes each having attached storage devices, the system, computer program product, and computer-executable method including receiving a request, at a first storage node of the plurality of storage nodes, to store a large portion of data, using at least on…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/067. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 9, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).