Data migration preserving storage efficiency

US9832260B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9832260-B2
Application number  US-201414494450-A
Country             US
Kind code           B2
Filing date         Sep 23, 2014
Priority date       Sep 23, 2014
Publication date    Nov 28, 2017
Grant date          Nov 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Technology is disclosed for a data migration process for a storage server that preserves storage efficiency information. The storage server receives an indication of a group of two or more files and selects among the two or more files a reference file and one or more selected files. The storage server initiates a first migration operation to copy or move the reference file from a source storage server to a destination storage server. The storage server initiates one or more additional migration operations to copy or move the selected files from the source storage server to the destination storage server. At least one of the additional migration operations include a step of transmitting to the destination storage server data blocks of the selected files that are not shared between the reference file and the selected files, but avoid transmitting to the destination storage server the blocks shared with the reference file.
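
The source-side idea in the abstract, sending only the blocks a selected file does not share with the reference file, can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each file is represented as a list of physical block locations on the source, and that a block counts as shared when its location also appears in the reference file (the comparison of block locations described in claim 2). The function name plan_transfer and the sample block numbers are hypothetical.

```python
def plan_transfer(reference_blocks, selected_blocks):
    """Decide which blocks of a selected file must be transmitted.

    Both arguments are lists of physical block locations on the source
    (an assumed representation). Returns a per-block bitmap (True means
    shared with the reference file) and the offsets that must be sent.
    """
    ref_locations = set(reference_blocks)
    # One entry per block of the selected file: is it shared with the reference?
    bitmap = [loc in ref_locations for loc in selected_blocks]
    # Only blocks that are NOT shared need to cross the network.
    to_send = [offset for offset, shared in enumerate(bitmap) if not shared]
    return bitmap, to_send


# Hypothetical layout: the selected file shares three blocks with the reference.
reference = [100, 101, 102, 103]
selected = [100, 101, 250, 103, 251]

bitmap, to_send = plan_transfer(reference, selected)
print(bitmap)   # [True, True, False, True, False]
print(to_send)  # [2, 4]
```

In this example only two of the five blocks of the selected file would be transmitted; the other three are covered by the reference file, which is migrated first.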

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, at a destination computing device, an indication of a reference file and a first selected file; receiving, from a source computing device, data blocks of the reference file and a first data structure corresponding to the first selected file, wherein the first data structure is a bitmap with a bit for each data block of the first selected file, each of the bits indicating whether an associated data block of the first selected file is shared with the reference file as determined by the source computing device; creating a metadata structure for the first selected file comprising references to the shared data blocks in the reference file based, at least in part, on the first data structure; receiving, from the source computing device, data blocks of the first selected file that are not shared between the reference file and the first selected file; and updating the metadata structure for the first selected file with locations of the data blocks of the first selected file that are not shared between the reference file and the first selected file.

2. The method of claim 1, wherein the first data structure is generated based, at least in part, on a comparison of block locations on the source computing device for data blocks of the reference file with block locations on the source computing device for data blocks of the first selected file.

3. The method of claim 1, wherein the first data structure also indicates offset numbers within the reference file for the data blocks shared by the first selected file.

4. The method of claim 1, further comprising: receiving, at the destination computing device, a second data structure corresponding to a second selected file, wherein the second data structure indicates data blocks of the second selected file that are shared with the reference file; creating a metadata structure for the second selected file comprising references to the shared data blocks in the reference file based, at least in part, on the second data structure; and based on receipt of a request to access the second selected file at the destination computing device, sending a request to the source computing device to retrieve data blocks of the second selected file indicated as not shared with the reference file in the second data structure; completing creation of the second selected file, wherein completing creation of the second selected file comprises updating the metadata structure for the second selected file with locations of the retrieved data blocks; and providing the second selected file in response to the request to access the second selected file.

5. The method of claim 1, further comprising: identifying, at the source computing device, a plurality of files which have shared data blocks, wherein the plurality of files comprises the reference file and the first selected file; and determining a file of the plurality of files to be the reference file based, at least in part, on one or more of a size of the file, an age of the file, and an application which created the file.

6. The method of claim 1, further comprising: based on determining that at least a first data block shared by the reference file with the first selected has been overwritten on the destination computing device, sending a request to the source computing device to retrieve an unmodified version of the first data block; and updating a reference to the first data block in the metadata structure for the first selected file to reference the retrieved unmodified version of the first data block.

7. A non-transitory machine readable medium comprising program code for efficient storage migration, the program code executable to: receive an indication of a reference file and a first selected file and separately receive data blocks of the reference file and a first data structure corresponding to the first selected file, wherein the first data structure is a bitmap with a bit for each data block of the first selected file, each of the bits indicating whether an associated data block of the first selected file is shared with the reference file as determined by a source computing device; create a metadata structure for the first selected file comprising references to the shared data blocks in the reference file based, at least in part, on the first data structure; receive data blocks of the first selected file that are not shared between the reference file and the first selected file; and update the metadata structure for the first selected file with locations of the data blocks of the first selected file that are not shared between the reference file and the first selected file.

8. The machine readable medium of claim 7, wherein the first data structure is generated based, at least in part, on a comparison of block locations on the source computing device for data blocks of the reference file with block locations on the source computing device for data blocks of the first selected file.

9. The machine readable medium of claim 7, wherein the first data structure also indicates offset numbers within the reference file for the data blocks shared by the first selected file.

10. The machine readable medium of claim 7, further comprising program code executable to: receive a second data structure corresponding to a second selected file, wherein the second data structure indicates data blocks of the second selected file that are shared with the reference file; create a metadata structure for the second selected file comprising references to the shared data blocks in the reference file based, at least in part, on the second data structure; and based on receipt of a request to access the second selected file, send a request to the source computing device to retrieve data blocks of the second selected file indicated as not shared with the reference file in the second data structure; complete creation of the second selected file, wherein the machine executable code which when executed by at least one machine, causes the machine to complete creation of the second selected file comprises machine executable code which when executed by at least one machine, causes the machine to update the metadata structure for the second selected file with locations of the retrieved data blocks; and provide the second selected file in response to the request to access the second selected file.

11. The machine readable medium of claim 7, further comprising program code executable to: identify a plurality of files which have shared data blocks, wherein the plurality of files comprises the reference file and the first selected file; and determine a file of the plurality of files to be the reference file based, at least in part, on one or more of a size of the file, an age of the file, and an application which created the file.

12. A computing device comprising: a processor; and a machine readable medium comprising machine executable code having stored thereon instructions executable by the processor to cause the computing device to, receive an indication of a reference file and a first selected file; receive, from a source computing device, data blocks of the reference file and a first data structure corresponding to the first selected file, wherein the first data structure is a bitmap with a bit for each data block of the first selected file, each of the bits indicating whether an associated data block of the first selected file is shared with the reference file as determined by the source computing device; create a metadata structure for the first selected file comprising references to the shared data blocks in the reference file based, at least
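
The destination-side steps of claim 1 (receive the reference file and a bitmap, create a metadata structure that points at shared blocks, then fill in the unshared blocks as they arrive) can be sketched roughly as follows. This is an illustrative model under stated assumptions, not the claimed implementation: the DestinationStore class, the function names, and the in-memory block list are all hypothetical, and the ref_offsets mapping stands in for the offset numbers mentioned in claim 3.

```python
class DestinationStore:
    """Toy block store: a block's location is its index in a list (assumption)."""

    def __init__(self):
        self.blocks = []

    def write(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # location of the newly written block


def receive_reference(store, ref_data):
    """Store every block of the reference file; return its metadata (locations)."""
    return [store.write(block) for block in ref_data]


def create_metadata(bitmap, ref_offsets, ref_meta):
    """Build metadata for a selected file from the bitmap.

    Shared blocks reference the reference file's block at ref_offsets[offset];
    unshared blocks stay None until their data arrives.
    """
    return [ref_meta[ref_offsets[offset]] if shared else None
            for offset, shared in enumerate(bitmap)]


def apply_unshared(store, meta, unshared):
    """Write received unshared blocks and record their locations in the metadata."""
    for offset, data in unshared.items():
        meta[offset] = store.write(data)
    return meta


store = DestinationStore()
ref_meta = receive_reference(store, [b"A", b"B", b"C", b"D"])

# Hypothetical selected file A, B, X, D, Y: offsets 0, 1, 3 are shared.
bitmap = [True, True, False, True, False]
ref_offsets = {0: 0, 1: 1, 3: 3}  # selected-file offset -> reference-file offset
meta = create_metadata(bitmap, ref_offsets, ref_meta)  # [0, 1, None, 3, None]

meta = apply_unshared(store, meta, {2: b"X", 4: b"Y"})
content = b"".join(store.blocks[loc] for loc in meta)
print(meta, content, len(store.blocks))  # [0, 1, 4, 3, 5] b'ABXDY' 6
```

The destination ends up holding six blocks for two files that would otherwise occupy nine, which is the storage efficiency the migration is meant to preserve. The on-demand variant in claims 4 and 10 would defer the apply_unshared step until the file is first accessed.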

Assignees

Netapp Inc

Inventors

Classifications

  • Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

  • in relation to data integrity, e.g. data losses, bit errors

  • Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays

  • for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

  • Migration mechanisms

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9832260B2 cover?
Technology is disclosed for a data migration process for a storage server that preserves storage efficiency information. The storage server receives an indication of a group of two or more files and selects among the two or more files a reference file and one or more selected files. The storage server initiates a first migration operation to copy or move the reference file from a source storage…
Who is the assignee on this patent?
Netapp Inc
What technology area does this patent fall under?
Primary CPC classification H04L67/1095. Mapped technology areas include Electricity.
When was this patent published?
Publication date Nov 28, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).