Virtual chunk service based data recovery in a distributed data storage system
US-9921910-B2 · Mar 20, 2018 · US
US10152377B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10152377-B2 |
| Application number | US-201815890913-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 7, 2018 |
| Priority date | Feb 19, 2015 |
| Publication date | Dec 11, 2018 |
| Grant date | Dec 11, 2018 |
Official abstract text for this publication.
Technology is disclosed for storing data in a distributed storage system using a virtual chunk service (VCS). In the VCS based storage technique, a storage node (“node”) is split into multiple VCSs and each of the VCSs can be assigned a unique ID in the distributed storage. A set of VCSs from a set of nodes form a storage group, which also can be assigned a unique ID in the distributed storage. When a data object is received for storage, a storage group is identified for the data object, the data object is encoded to generate multiple fragments and each fragment is stored in a VCS of the identified storage group. The data recovery process is made more efficient by using metadata, e.g., VCS to storage node mapping, storage group to VCS mapping, VCS to objects mapping, which eliminates resource intensive read and write operations during recovery.
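The store path and the metadata-driven lookup described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Metadata` class, the function names, the in-memory `fragment_store` dict, and the placeholder `encode` function (which only splits bytes and adds no redundancy, unlike a real erasure code) are all assumptions introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Metadata:
    """The three mappings named in the abstract, held as plain dicts (assumed layout)."""
    vcs_to_node: dict[str, str] = field(default_factory=dict)
    group_to_vcs: dict[str, list[str]] = field(default_factory=dict)
    vcs_to_objects: dict[str, set[str]] = field(default_factory=dict)


def encode(data: bytes, n: int) -> list[bytes]:
    """Placeholder encoder: split data into n contiguous fragments.

    This stands in for the encoding step only; it provides no redundancy.
    """
    size = -(-len(data) // n)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(n)]


def store_object(meta: Metadata, fragment_store: dict, group_id: str,
                 object_id: str, data: bytes) -> None:
    """Store one fragment per VCS of the storage group and record VCS -> object."""
    vcs_ids = meta.group_to_vcs[group_id]
    for vcs_id, fragment in zip(vcs_ids, encode(data, len(vcs_ids))):
        fragment_store[(vcs_id, object_id)] = fragment
        meta.vcs_to_objects.setdefault(vcs_id, set()).add(object_id)


def objects_affected_by_node_failure(meta: Metadata, failed_node: str) -> dict[str, set[str]]:
    """Use metadata alone to list, per VCS on the failed node, the objects it held."""
    return {
        vcs_id: set(meta.vcs_to_objects.get(vcs_id, set()))
        for vcs_id, node in meta.vcs_to_node.items()
        if node == failed_node
    }


meta = Metadata(
    vcs_to_node={"vcs-1": "node-a", "vcs-2": "node-b", "vcs-3": "node-c"},
    group_to_vcs={"sg-1": ["vcs-1", "vcs-2", "vcs-3"]},
)
fragment_store: dict = {}
store_object(meta, fragment_store, "sg-1", "obj-1", b"abcdefgh")
affected = objects_affected_by_node_failure(meta, "node-b")
```

In this sketch the set of objects needing repair after a node failure is found by dictionary lookups only, without reading any stored fragment, which is the kind of saving the abstract attributes to the metadata.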
Opening claim text (preview).
What is claimed is:

1. A method comprising: creating a plurality of virtual chunk spaces on each of a plurality of storage nodes of a distributed storage system; assigning to each virtual chunk space a unique identifier that is unique among the virtual chunk spaces in the plurality of storage nodes; creating a plurality of storage groups based on a plurality of storage node grouping schemes, wherein a storage group is a logical container of virtual chunk spaces; for each of the plurality of storage groups, allocating to the storage group a group of the virtual chunk spaces, wherein each virtual chunk space within a group allocated to a storage group is on a different one of the plurality of storage nodes that has a capability to satisfy a corresponding one of the plurality of storage node grouping schemes; and indicating in metadata of the distributed storage system a first set of mappings between the plurality of storage nodes and each plurality of virtual chunk spaces and a second set of mappings between the plurality of storage groups and the groups of virtual chunk spaces as allocated.

2. The method of claim 1, wherein a virtual chunk space is a smallest unit of a failure domain on a storage node.

3. The method of claim 1, wherein creating the plurality of virtual chunk spaces on each of the plurality of storage nodes comprises splitting a chunk service on each of the plurality of storage nodes into the plurality of virtual chunk spaces, wherein the chunk service stores data fragments to a storage medium associated with the storage node that hosts the chunk service.

4. The method of claim 1, wherein a size of a virtual chunk space is based, at least in part, on one of a number of the plurality of storage nodes, sizes of data objects stored in the distributed storage system, and a data protection scheme indicated in a storage node grouping scheme of the plurality of storage node grouping schemes corresponding to the storage group that contains the virtual chunk space.

5. The method of claim 1, wherein a number of each plurality of virtual chunk spaces is based, at least in part, on storage capacity of the corresponding one of the plurality of storage nodes, storage capacity of the distributed storage system, and a number of the plurality of storage nodes.

6. The method of claim 1, further comprising: for each of the plurality of storage node grouping schemes, creating at least one storage pool with those of the plurality of storage nodes that satisfy the storage node grouping scheme based, at least in part, on sites of those storage nodes and data protection capability of those storage nodes, wherein allocating to each storage group a group of the virtual chunk spaces comprises assigning the storage group to one of the storage pools that satisfies the storage node grouping scheme corresponding to the storage group and allocating the group of virtual chunk spaces from the storage pool.

7. The method of claim 1 further comprising: based on a request to store a data object into the distributed storage system, assigning the data object to a first of the plurality of storage groups based on a first of the plurality of storage node grouping schemes corresponding to the first storage group; and storing fragments of the data object across the group of virtual chunk spaces allocated to the first storage group.

8. The method of claim 7, further comprising, based on confirmation of storing the fragments across the group of virtual chunk spaces allocated to the first storage group, updating metadata of the distributed storage system to associate the data object with at least one of each of the group of virtual chunk spaces allocated to the first storage group and the first storage group.

9. The method of claim 7, wherein a first service assigns the data object to the first storage group and communicates the data object to a second service, wherein storing fragments of the data object across the group of virtual chunk spaces allocated to the first storage group comprises the second service generating the fragments of the data object and communicating each fragment to a different one of the group of virtual chunk spaces for storage into storage media.

10. One or more non-transitory computer-readable storage media comprising program code for layering a distributed storage system for efficient data recovery, the program code comprising instructions to: create a plurality of virtual chunk spaces on each of a plurality of storage nodes of a distributed storage system; assign to each virtual chunk space a unique identifier that is unique among the virtual chunk spaces in the plurality of storage nodes; create a plurality of storage groups based on a plurality of storage node grouping schemes, wherein a storage group is a logical container of virtual chunk spaces; for each of the plurality of storage groups, allocate to the storage group a group of the virtual chunk spaces, wherein each virtual chunk space within a group allocated to a storage group is on a different one of the plurality of storage nodes that has a capability to satisfy a corresponding one of the plurality of storage node grouping schemes; and indicate in metadata of the distributed storage system a first set of mappings between the plurality of storage nodes and each plurality of virtual chunk spaces and a second set of mappings between the plurality of storage groups and the groups of virtual chunk spaces as allocated.

11. The non-transitory computer-readable storage media of claim 10, wherein a virtual chunk space is a smallest unit of a failure domain on a storage node.

12. The non-transitory computer-readable storage media of claim 10, wherein the instructions to create the plurality of virtual chunk spaces on each of the plurality of storage nodes comprise instructions to split a chunk service on each of the plurality of storage nodes into the plurality of virtual chunk spaces, wherein the chunk service stores data fragments to a storage medium associated with the storage node that hosts the chunk service.

13. The non-transitory computer-readable storage media of claim 10, wherein a size of a virtual chunk space is based, at least in part, on one of a number of the plurality of storage nodes, sizes of data objects stored in the distributed storage system, and a data protection scheme indicated in a storage node grouping scheme of the plurality of storage node grouping schemes corresponding to the storage group that contains the virtual chunk space.

14. The non-transitory computer-readable storage media of claim 10, wherein a number of each plurality of virtual chunk spaces is based, at least in part, on storage capacity of the corresponding one of the plurality of storage nodes, storage capacity of the distributed storage system, and a number of the plurality of storage nodes.

15. The non-transitory machine-readable media of claim 10, further comprising instructions to: for each of the plurality of storage node grouping schemes, create at least one storage pool with those of the plurality of storage nodes that satisfy the storage node grouping scheme based, at least in part, on sites of those storage nodes and data protection capability of those storage nodes, wherein the instructions to allocate to each storage group a group of the virtual chunk spaces comprise instructions to assign the storage group to one of the storage pools that satisfies the storage node grouping scheme corresponding to the storage group and allocating the group of virtual chunk spaces from the storage pool.

16. The non-transitory computer-readable
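The allocation recited in claim 1 can be illustrated with a minimal sketch, under stated assumptions. The function names, the `vcs-N` identifier format, the `eligible_nodes` list (standing in for the nodes that satisfy a storage node grouping scheme), the `width` parameter, and the rule that a virtual chunk space belongs to at most one storage group are all hypothetical and are not taken from the claim text.

```python
from __future__ import annotations

from itertools import count


def create_virtual_chunk_spaces(spaces_per_node: dict[str, int]) -> dict[str, list[str]]:
    """Create the requested number of spaces on each node, with ids unique across nodes."""
    next_id = count(1)
    return {
        node: [f"vcs-{next(next_id)}" for _ in range(n)]
        for node, n in spaces_per_node.items()
    }


def allocate_storage_group(node_to_vcs: dict[str, list[str]],
                           eligible_nodes: list[str],
                           width: int,
                           allocated: set[str]) -> list[str]:
    """Pick `width` free spaces, each on a different eligible node.

    `allocated` tracks spaces already given to a group (an assumption of this
    sketch) and is updated only when the allocation succeeds.
    """
    chosen: list[str] = []
    for node in eligible_nodes:
        free = [v for v in node_to_vcs[node] if v not in allocated]
        if free:
            chosen.append(free[0])  # at most one space per node
        if len(chosen) == width:
            break
    if len(chosen) < width:
        raise ValueError("not enough eligible nodes with a free virtual chunk space")
    allocated.update(chosen)
    return chosen


# First set of mappings: storage node -> virtual chunk spaces.
node_to_vcs = create_virtual_chunk_spaces({"node-a": 2, "node-b": 2, "node-c": 2})

# Second set of mappings: storage group -> allocated virtual chunk spaces.
allocated: set[str] = set()
group_to_vcs = {
    "sg-1": allocate_storage_group(node_to_vcs, ["node-a", "node-b", "node-c"], 3, allocated),
    "sg-2": allocate_storage_group(node_to_vcs, ["node-a", "node-b", "node-c"], 3, allocated),
}
```

Taking at most one space per node is what keeps every member of a group on a different storage node, so that in this sketch a single node failure affects at most one space of any group.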
for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title
Saving, restoring, recovering or retrying · CPC title
considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title
for recovering from a failure of a protocol instance or entity, e.g. service redundancy protocols, protocol state redundancy or protocol service redirection (management of faults, events, alarms or notifications in data switching networks H04L41/06) · CPC title
Techniques of failing over between control units · CPC title