Virtual chunk service based data recovery in a distributed data storage system
US-9921910-B2 · Mar 20, 2018 · US
US10564883B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10564883-B2 |
| Application number | US-201715620897-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 13, 2017 |
| Priority date | Dec 13, 2016 |
| Publication date | Feb 18, 2020 |
| Grant date | Feb 18, 2020 |
Official abstract text for this publication.
A computer program product, system, and method for determining a list of objects, within source storage, to migrate; generating a chunk layout for the objects to migrate; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generating data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage.
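The layout step in the abstract can be illustrated with a minimal sketch. It assumes, following claims 4, 5 and 10, that chunks are of a fixed size, that one chunk may hold several objects while one object may span several chunks, and that "related" objects are those from the same bucket. The names `Segment`, `Chunk`, `generate_chunk_layout` and the value of `CHUNK_SIZE` are hypothetical and do not come from the patent. This is a sketch under those assumptions, not the patented implementation.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 128  # hypothetical fixed chunk size in bytes, tiny for illustration


@dataclass
class Segment:
    """A contiguous byte range of one object placed inside one chunk."""
    object_id: str
    object_offset: int  # where the range starts inside the object
    chunk_offset: int   # where the range starts inside the chunk
    length: int


@dataclass
class Chunk:
    index: int
    segments: list = field(default_factory=list)


def generate_chunk_layout(objects, chunk_size=CHUNK_SIZE):
    """Plan fixed-size chunks for (object_id, bucket, size) tuples.

    Only sizes and bucket names are used, so no object data is read and
    no chunk storage is allocated while planning.
    """
    # Sort by bucket so related objects land in contiguous segments/chunks.
    ordered = sorted(objects, key=lambda o: (o[1], o[0]))
    chunks = [Chunk(0)]
    used = 0  # bytes already planned in the current chunk
    for object_id, _bucket, size in ordered:
        offset = 0
        while offset < size:
            if used == chunk_size:
                # Current chunk is full: start planning the next one.
                chunks.append(Chunk(len(chunks)))
                used = 0
            length = min(size - offset, chunk_size - used)
            chunks[-1].segments.append(Segment(object_id, offset, used, length))
            offset += length
            used += length
    return chunks


layout = generate_chunk_layout([("a", "b1", 100), ("c", "b2", 50), ("b", "b1", 100)])
for chunk in layout:
    print(chunk.index, [(s.object_id, s.object_offset, s.length) for s in chunk.segments])
```

Because the plan is built from metadata alone, a migration service could then walk the layout chunk by chunk, pulling only the byte ranges each unencoded chunk names before handing them to the encoder.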
Opening claim text (preview).
What is claimed is:
1. A method for migrating objects to a distributed storage system, comprising: determining, by a migration service implemented at a target storage of the distributed storage system, a list of objects within a source storage to migrate to the target storage; performing pull migration, by the migration service of the target storage against the source storage, for objects in the list of objects, the pull migration comprising: querying the source storage to determine relationships among the list of objects to migrate; generating a chunk layout for the objects to migrate based on the relationships, the generating including placing related objects, responsive to the querying, within contiguous chunk segments and/or within contiguous chunks; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generating, by a chunk encoding service at the target storage, data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage at the target storage.
2. The method of claim 1 wherein determining the list of objects within source storage to migrate includes querying the source storage for object sizes of the objects.
3. The method of claim 1 wherein generating data and coded fragments for each unencoded chunk within the chunk layout includes generating data and coded fragments in parallel across multiple nodes of the distributed storage system.
4. The method of claim 1, wherein the unencoded chunks of the chunk layout are each of a fixed size.
5. The method of claim 4, wherein one chunk stores data for multiple objects and a single object is stored across multiple chunks.
6. The method of claim 1 wherein the chunk layout is generated before encoding begins on migrated data.
7. The method of claim 1, wherein no memory is allocated for a chunk defined by the chunk layout until corresponding data for allocation to the chunk is encoded.
8. The method of claim 1, wherein retrieving objects from source storage specified by the unencoded chunk within the chunk layout, generating the data and coded fragments for the unencoded chunk using the retrieved objects, and storing the data and coded fragments to primary storage at the target storage are implemented between a time the data is initially stored at the source storage and a time of completion of the coding, wherein the pull migration is performed absent an intermediate data protection scheme.
9. The method of claim 1, wherein pull migration is implemented absent performing an intermediate data protection scheme.
10. The method of claim 1, wherein the related objects include objects that are from a same bucket.
11. The method of claim 1, wherein generating the data and coded fragments includes dividing the data into k fixed size data fragments and generating m coded fragments from the k fixed size data fragments, wherein k is different from m.
12. The method of claim 11, further comprising: upon determining k+m>available number of nodes in the distributed storage system, storing at an available one of the nodes, multiple k data fragments and/or multiple m coded fragments.
13. A system for migrating objects to a distributed storage system, comprising: a processor at a target storage of the distributed storage system; a volatile memory; and a non-volatile memory storing computer program code that when executed on the processor causes the processor to execute a process operable to perform the operations of: determining, by a migration service implemented at the target storage, a list of objects within a source storage to migrate to the target storage; performing pull migration, by the migration service of the target storage against the source storage, for objects in the list of objects, the pull migration comprising: querying the source storage to determine relationships among the list of objects to migrate; generating a chunk layout for the objects to migrate based on the relationships, the generating including placing related objects, responsive to the querying, within contiguous chunk segments and/or within contiguous chunks; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generating, by a chunk encoding service at the target storage, data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage at the target storage.
14. The system of claim 13 wherein determining the list of objects within source storage to migrate includes querying the source storage for object sizes of the objects.
15. The system of claim 13 wherein generating data and coded fragments for each unencoded chunk within the chunk layout includes generating data and coded fragments in parallel across multiple nodes of the distributed storage system.
16. A computer program product tangibly embodied in a non-transitory computer-readable medium, the computer-readable medium storing program instructions for migrating objects to a distributed storage system, the instructions are executable to: determine, by a migration service implemented at a target storage of the distributed storage system, a list of objects within a source storage to migrate to the target storage; perform pull migration, by the migration service of the target storage against the source storage, for objects in the list of objects, the pull migration comprising: querying the source storage to determine relationships among the list of objects to migrate; generating a chunk layout for the objects to migrate based on the relationships, the generating including placing related objects, responsive to the querying, within contiguous chunk segment and/or within contiguous chunks; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generate, by a chunk encoding service at the target storage, data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage at the target storage.
17. The computer program product of claim 16 wherein determining the list of objects within source storage to migrate includes querying the source storage for object sizes of the objects.
18. The computer program product of claim 16 wherein generating data and coded fragments for each unencoded chunk within the chunk layout includes generating data and coded fragments in parallel across multiple nodes of the distributed storage system.
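Claims 11 and 12 describe dividing a chunk into k fixed-size data fragments, deriving m coded fragments, and doubling up fragments on nodes when k+m exceeds the number of available nodes. The sketch below illustrates that shape only. The claims do not name a coding algorithm, so a single XOR parity fragment (m = 1) is used here as a deliberately simple stand-in for a real erasure code, and round-robin placement is an assumed policy. All function names are hypothetical.

```python
from functools import reduce


def split_into_data_fragments(chunk: bytes, k: int) -> list:
    """Divide a chunk into k fixed-size data fragments, zero-padding the tail."""
    frag_len = -(-len(chunk) // k)  # ceiling division
    padded = chunk.ljust(frag_len * k, b"\0")
    return [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]


def xor_parity(fragments: list) -> bytes:
    """Stand-in coded fragment: bytewise XOR of the given fragments (m = 1)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*fragments))


def place_fragments(num_fragments: int, nodes: list) -> dict:
    """Assumed round-robin placement of fragment indices onto nodes.

    When k + m exceeds len(nodes), some nodes receive several fragments.
    """
    placement = {node: [] for node in nodes}
    for i in range(num_fragments):
        placement[nodes[i % len(nodes)]].append(i)
    return placement


k = 4
data_fragments = split_into_data_fragments(b"abcdefghij", k)
coded_fragment = xor_parity(data_fragments)

# With XOR parity, any single lost data fragment can be rebuilt from the rest.
survivors = [data_fragments[0], data_fragments[2], data_fragments[3], coded_fragment]
rebuilt = xor_parity(survivors)

# k + m = 5 fragments on 3 nodes: two nodes hold two fragments each.
placement = place_fragments(k + 1, ["n1", "n2", "n3"])
```

A production system would typically use a code that tolerates more than one loss (m > 1); XOR parity is chosen here only because it fits in a few lines and makes the recovery property easy to check.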
Migration mechanisms · CPC title
Distributed file systems · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
Improving I/O performance · CPC title
Indexing structures · CPC title