Efficient migration to distributed storage

US10564883B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10564883-B2
Application number  US-201715620897-A
Country             US
Kind code           B2
Filing date         Jun 13, 2017
Priority date       Dec 13, 2016
Publication date    Feb 18, 2020
Grant date          Feb 18, 2020

Abstract

Official abstract text for this publication.

A computer program product, system, and method for determining a list of objects, within source storage, to migrate; generating a chunk layout for the objects to migrate; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generating data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage.
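The flow in the abstract (plan a chunk layout first, then retrieve, encode, and store chunk by chunk) can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `ObjectInfo`, `Segment`, `plan_chunk_layout`, and `migrate`, and the three callables passed to `migrate`, are hypothetical. The fixed chunk size and the grouping of objects by bucket are assumptions taken from the dependent claims on this page, and sorting by bucket is only one simple way to keep related objects contiguous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectInfo:
    key: str
    bucket: str
    size: int  # object size in bytes, as reported by the source storage


@dataclass(frozen=True)
class Segment:
    key: str            # object this segment belongs to
    object_offset: int  # where the segment starts inside the object
    length: int         # number of bytes placed in the chunk


def plan_chunk_layout(objects, chunk_size):
    """Pack objects into fixed-size chunks without reading any object data.

    Objects are sorted by bucket so that related objects land in contiguous
    segments. An object that does not fit is continued in the next chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks = [[]]
    free = chunk_size
    for obj in sorted(objects, key=lambda o: (o.bucket, o.key)):
        offset = 0
        while offset < obj.size:
            if free == 0:  # current chunk is full, open a new one
                chunks.append([])
                free = chunk_size
            length = min(free, obj.size - offset)
            chunks[-1].append(Segment(obj.key, offset, length))
            offset += length
            free -= length
    if not chunks[-1]:  # nothing was placed at all
        chunks.pop()
    return chunks


def migrate(layout, read_range, encode_chunk, store_fragments):
    """Pull each planned chunk from the source, encode it, and store it.

    read_range(key, offset, length) -> bytes, encode_chunk(bytes) -> fragments,
    and store_fragments(chunk_id, fragments) are supplied by the caller.
    """
    for chunk_id, segments in enumerate(layout):
        payload = b"".join(
            read_range(s.key, s.object_offset, s.length) for s in segments
        )
        store_fragments(chunk_id, encode_chunk(payload))
```

Because the layout is computed from object sizes alone, encoding of a chunk can start as soon as its objects have been read, and chunks can be handed to different workers independently.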

First claim

Opening claim text (preview).

What is claimed is:

1. A method for migrating objects to a distributed storage system, comprising: determining, by a migration service implemented at a target storage of the distributed storage system, a list of objects within a source storage to migrate to the target storage; performing pull migration, by the migration service of the target storage against the source storage, for objects in the list of objects, the pull migration comprising: querying the source storage to determine relationships among the list of objects to migrate; generating a chunk layout for the objects to migrate based on the relationships, the generating including placing related objects, responsive to the querying, within contiguous chunk segments and/or within contiguous chunks; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generating, by a chunk encoding service at the target storage, data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage at the target storage.

2. The method of claim 1 wherein determining the list of objects within source storage to migrate includes querying the source storage for object sizes of the objects.

3. The method of claim 1 wherein generating data and coded fragments for each unencoded chunk within the chunk layout includes generating data and coded fragments in parallel across multiple nodes of the distributed storage system.

4. The method of claim 1, wherein the unencoded chunks of the chunk layout are each of a fixed size.

5. The method of claim 4, wherein one chunk stores data for multiple objects and a single object is stored across multiple chunks.

6. The method of claim 1 wherein the chunk layout is generated before encoding begins on migrated data.

7. The method of claim 1, wherein no memory is allocated for a chunk defined by the chunk layout until corresponding data for allocation to the chunk is encoded.

8. The method of claim 1, wherein retrieving objects from source storage specified by the unencoded chunk within the chunk layout, generating the data and coded fragments for the unencoded chunk using the retrieved objects, and storing the data and coded fragments to primary storage at the target storage are implemented between a time the data is initially stored at the source storage and a time of completion of the coding, wherein the pull migration is performed absent an intermediate data protection scheme.

9. The method of claim 1, wherein pull migration is implemented absent performing an intermediate data protection scheme.

10. The method of claim 1, wherein the related objects include objects that are from a same bucket.

11. The method of claim 1, wherein generating the data and coded fragments includes dividing the data into k fixed size data fragments and generating m coded fragments from the k fixed size data fragments, wherein k is different from m.

12. The method of claim 11, further comprising: upon determining k+m > available number of nodes in the distributed storage system, storing at an available one of the nodes, multiple k data fragments and/or multiple m coded fragments.

13. A system for migrating objects to a distributed storage system, comprising: a processor at a target storage of the distributed storage system; a volatile memory; and a non-volatile memory storing computer program code that when executed on the processor causes the processor to execute a process operable to perform the operations of: determining, by a migration service implemented at the target storage, a list of objects within a source storage to migrate to the target storage; performing pull migration, by the migration service of the target storage against the source storage, for objects in the list of objects, the pull migration comprising: querying the source storage to determine relationships among the list of objects to migrate; generating a chunk layout for the objects to migrate based on the relationships, the generating including placing related objects, responsive to the querying, within contiguous chunk segments and/or within contiguous chunks; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generating, by a chunk encoding service at the target storage, data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage at the target storage.

14. The system of claim 13 wherein determining the list of objects within source storage to migrate includes querying the source storage for object sizes of the objects.

15. The system of claim 13 wherein generating data and coded fragments for each unencoded chunk within the chunk layout includes generating data and coded fragments in parallel across multiple nodes of the distributed storage system.

16. A computer program product tangibly embodied in a non-transitory computer-readable medium, the computer-readable medium storing program instructions for migrating objects to a distributed storage system, the instructions are executable to: determine, by a migration service implemented at a target storage of the distributed storage system, a list of objects within a source storage to migrate to the target storage; perform pull migration, by the migration service of the target storage against the source storage, for objects in the list of objects, the pull migration comprising: querying the source storage to determine relationships among the list of objects to migrate; generating a chunk layout for the objects to migrate based on the relationships, the generating including placing related objects, responsive to the querying, within contiguous chunk segment and/or within contiguous chunks; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generate, by a chunk encoding service at the target storage, data and coded fragments for the unencoded chunk using the retrieved objects; and storing the data and coded fragments to primary storage at the target storage.

17. The computer program product of claim 16 wherein determining the list of objects within source storage to migrate includes querying the source storage for object sizes of the objects.

18. The computer program product of claim 16 wherein generating data and coded fragments for each unencoded chunk within the chunk layout includes generating data and coded fragments in parallel across multiple nodes of the distributed storage system.
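Claims 11 and 12 describe splitting a chunk into k fixed-size data fragments, deriving m coded fragments from them, and placing several fragments on one node when k+m exceeds the number of available nodes. The Python sketch below illustrates only the shape of that step. The claim text shown here does not name a coding scheme, so a single XOR parity fragment (m = 1) is used as a stand-in; a production system would more likely use an erasure code that tolerates more than one lost fragment. The function names, the zero padding, and the round-robin placement are all assumptions made for illustration.

```python
def split_into_data_fragments(chunk, k):
    """Split chunk bytes into k equally sized fragments, zero-padding the tail."""
    if k <= 0:
        raise ValueError("k must be positive")
    frag_len = -(-len(chunk) // k)  # ceiling division
    padded = chunk.ljust(frag_len * k, b"\0")
    return [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]


def xor_parity(fragments):
    """Return the byte-wise XOR of equally sized fragments."""
    parity = bytearray(len(fragments[0]))
    for frag in fragments:
        for i, byte in enumerate(frag):
            parity[i] ^= byte
    return bytes(parity)


def encode_chunk(chunk, k):
    """Return k data fragments followed by one coded (parity) fragment.

    Stand-in for a k+m erasure code with m = 1; any single missing fragment
    can be rebuilt by XOR-ing the remaining k fragments.
    """
    data = split_into_data_fragments(chunk, k)
    return data + [xor_parity(data)]


def place_fragments(num_fragments, nodes):
    """Assign fragment indices to nodes round-robin.

    When num_fragments exceeds len(nodes), some nodes receive more than one
    fragment, which is the situation claim 12 addresses.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    placement = {node: [] for node in nodes}
    for index in range(num_fragments):
        placement[nodes[index % len(nodes)]].append(index)
    return placement
```

With k = 4 and the single parity fragment, five fragments spread over three nodes leave two nodes holding two fragments each. Co-locating fragments this way means a single node failure can remove more than one fragment, which a one-parity stand-in cannot repair; that trade-off is a property of this sketch, not a statement about the patented system.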

Assignees

EMC IP Holding Co LLC

Inventors

Classifications

  • G06F3/0647 · Primary

    Migration mechanisms · CPC title

  • Distributed file systems · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • Improving I/O performance · CPC title

  • Indexing structures · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10564883B2 cover?
A computer program product, system, and method for determining a list of objects, within source storage, to migrate; generating a chunk layout for the objects to migrate; and for each unencoded chunk within the chunk layout: retrieving objects from source storage specified by the unencoded chunk within the chunk layout; generating data and coded fragments for the unencoded chunk using the retri…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F3/0647. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 18, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).