Efficient migration of replicated files from a file server having a file de-duplication facility
US-9323758-B1 · Apr 26, 2016 · US
US9832260B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9832260-B2 |
| Application number | US-201414494450-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 23, 2014 |
| Priority date | Sep 23, 2014 |
| Publication date | Nov 28, 2017 |
| Grant date | Nov 28, 2017 |
Official abstract text for this publication.
Technology is disclosed for a data migration process for a storage server that preserves storage efficiency information. The storage server receives an indication of a group of two or more files and selects among the two or more files a reference file and one or more selected files. The storage server initiates a first migration operation to copy or move the reference file from a source storage server to a destination storage server. The storage server initiates one or more additional migration operations to copy or move the selected files from the source storage server to the destination storage server. At least one of the additional migration operations includes a step of transmitting to the destination storage server data blocks of the selected files that are not shared between the reference file and the selected files, but avoids transmitting to the destination storage server the blocks shared with the reference file.
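The selection of blocks to transmit described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each file on the source is modeled as a list of physical block addresses and that a block counts as shared when its address also appears in the reference file. The names `blocks_to_send` and `plan_migration` are hypothetical and do not come from the patent.

```python
from typing import Dict, List

# Hypothetical model: a file is a list of physical block addresses on the source.
BlockList = List[int]


def blocks_to_send(reference: BlockList, selected: BlockList) -> List[int]:
    """Return offsets in `selected` whose blocks are NOT shared with `reference`.

    Assumption: two blocks are shared when they have the same block address.
    """
    reference_addresses = set(reference)
    return [
        offset
        for offset, address in enumerate(selected)
        if address not in reference_addresses
    ]


def plan_migration(
    reference: BlockList, selected_files: Dict[str, BlockList]
) -> Dict[str, List[int]]:
    """For every selected file, list the block offsets that must be transmitted.

    The reference file itself is assumed to be migrated in full by a
    separate, earlier operation and is therefore not part of the plan.
    """
    return {
        name: blocks_to_send(reference, blocks)
        for name, blocks in selected_files.items()
    }


if __name__ == "__main__":
    reference_file = [100, 101, 102, 103]
    clones = {
        "clone_a": [100, 101, 250, 103],  # one private block at offset 2
        "clone_b": [100, 300, 301],       # two private blocks at offsets 1 and 2
    }
    print(plan_migration(reference_file, clones))
```

Under these assumptions, only the offsets listed in the plan are sent for each selected file. The remaining blocks are already present on the destination as part of the reference file.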
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving, at a destination computing device, an indication of a reference file and a first selected file; receiving, from a source computing device, data blocks of the reference file and a first data structure corresponding to the first selected file, wherein the first data structure is a bitmap with a bit for each data block of the first selected file, each of the bits indicating whether an associated data block of the first selected file is shared with the reference file as determined by the source computing device; creating a metadata structure for the first selected file comprising references to the shared data blocks in the reference file based, at least in part, on the first data structure; receiving, from the source computing device, data blocks of the first selected file that are not shared between the reference file and the first selected file; and updating the metadata structure for the first selected file with locations of the data blocks of the first selected file that are not shared between the reference file and the first selected file.
2. The method of claim 1, wherein the first data structure is generated based, at least in part, on a comparison of block locations on the source computing device for data blocks of the reference file with block locations on the source computing device for data blocks of the first selected file.
3. The method of claim 1, wherein the first data structure also indicates offset numbers within the reference file for the data blocks shared by the first selected file.
4. The method of claim 1, further comprising: receiving, at the destination computing device, a second data structure corresponding to a second selected file, wherein the second data structure indicates data blocks of the second selected file that are shared with the reference file; creating a metadata structure for the second selected file comprising references to the shared data blocks in the reference file based, at least in part, on the second data structure; and based on receipt of a request to access the second selected file at the destination computing device, sending a request to the source computing device to retrieve data blocks of the second selected file indicated as not shared with the reference file in the second data structure; completing creation of the second selected file, wherein completing creation of the second selected file comprises updating the metadata structure for the second selected file with locations of the retrieved data blocks; and providing the second selected file in response to the request to access the second selected file.
5. The method of claim 1, further comprising: identifying, at the source computing device, a plurality of files which have shared data blocks, wherein the plurality of files comprises the reference file and the first selected file; and determining a file of the plurality of files to be the reference file based, at least in part, on one or more of a size of the file, an age of the file, and an application which created the file.
6. The method of claim 1, further comprising: based on determining that at least a first data block shared by the reference file with the first selected has been overwritten on the destination computing device, sending a request to the source computing device to retrieve an unmodified version of the first data block; and updating a reference to the first data block in the metadata structure for the first selected file to reference the retrieved unmodified version of the first data block.
7. A non-transitory machine readable medium comprising program code for efficient storage migration, the program code executable to: receive an indication of a reference file and a first selected file and separately receive data blocks of the reference file and a first data structure corresponding to the first selected file, wherein the first data structure is a bitmap with a bit for each data block of the first selected file, each of the bits indicating whether an associated data block of the first selected file is shared with the reference file as determined by a source computing device; create a metadata structure for the first selected file comprising references to the shared data blocks in the reference file based, at least in part, on the first data structure; receive data blocks of the first selected file that are not shared between the reference file and the first selected file; and update the metadata structure for the first selected file with locations of the data blocks of the first selected file that are not shared between the reference file and the first selected file.
8. The machine readable medium of claim 7, wherein the first data structure is generated based, at least in part, on a comparison of block locations on the source computing device for data blocks of the reference file with block locations on the source computing device for data blocks of the first selected file.
9. The machine readable medium of claim 7, wherein the first data structure also indicates offset numbers within the reference file for the data blocks shared by the first selected file.
10. The machine readable medium of claim 7, further comprising program code executable to: receive a second data structure corresponding to a second selected file, wherein the second data structure indicates data blocks of the second selected file that are shared with the reference file; create a metadata structure for the second selected file comprising references to the shared data blocks in the reference file based, at least in part, on the second data structure; and based on receipt of a request to access the second selected file, send a request to the source computing device to retrieve data blocks of the second selected file indicated as not shared with the reference file in the second data structure; complete creation of the second selected file, wherein the machine executable code which when executed by at least one machine, causes the machine to complete creation of the second selected file comprises machine executable code which when executed by at least one machine, causes the machine to update the metadata structure for the second selected file with locations of the retrieved data blocks; and provide the second selected file in response to the request to access the second selected file.
11. The machine readable medium of claim 7, further comprising program code executable to: identify a plurality of files which have shared data blocks, wherein the plurality of files comprises the reference file and the first selected file; and determine a file of the plurality of files to be the reference file based, at least in part, on one or more of a size of the file, an age of the file, and an application which created the file.
12. A computing device comprising: a processor; and a machine readable medium comprising machine executable code having stored thereon instructions executable by the processor to cause the computing device to, receive an indication of a reference file and a first selected file; receive, from a source computing device, data blocks of the reference file and a first data structure corresponding to the first selected file, wherein the first data structure is a bitmap with a bit for each data block of the first selected file, each of the bits indicating whether an associated data block of the first selected file is shared with the reference file as determined by the source computing device; create a metadata structure for the first selected file comprising references to the shared data blocks in the reference file based, at least
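The destination-side steps recited in claim 1 (build a metadata structure from the bitmap, then fill in the non-shared blocks as they arrive) might look roughly like the sketch below. It is not the patented implementation. It makes the simplifying assumption that a shared block sits at the same offset in both files, whereas claim 3 notes the data structure may instead carry explicit offsets into the reference file. The function names and the use of `None` as an "awaiting data" marker are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical metadata model: one entry per block of the selected file,
# holding the destination block location, or None while data is pending.
Metadata = List[Optional[int]]


def create_metadata(bitmap: List[bool], reference_locations: List[int]) -> Metadata:
    """Build metadata for a selected file from the sharing bitmap.

    Assumption: bit i set means block i of the selected file is shared with
    block i of the reference file, so it can point at the reference block
    already stored on the destination.
    """
    return [
        reference_locations[offset] if shared else None
        for offset, shared in enumerate(bitmap)
    ]


def apply_unshared_blocks(metadata: Metadata, received: Dict[int, int]) -> None:
    """Record destination locations of non-shared blocks once they arrive.

    `received` maps a block offset in the selected file to the location
    where the destination stored the transmitted block.
    """
    for offset, location in received.items():
        if metadata[offset] is not None:
            raise ValueError(f"block {offset} was marked shared in the bitmap")
        metadata[offset] = location


def is_complete(metadata: Metadata) -> bool:
    """True when every block of the selected file has a location."""
    return all(location is not None for location in metadata)


if __name__ == "__main__":
    bitmap = [True, True, False, True]        # block 2 is private to the file
    reference_locations = [10, 11, 12, 13]    # reference blocks on destination
    meta = create_metadata(bitmap, reference_locations)
    print(meta, is_complete(meta))
    apply_unshared_blocks(meta, {2: 42})
    print(meta, is_complete(meta))
```

In this model, claim 4's on-demand variant would correspond to deferring `apply_unshared_blocks` until the first access request finds `is_complete` returning false.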
Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes · CPC title
in relation to data integrity, e.g. data losses, bit errors · CPC title
Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays · CPC title
for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title
Migration mechanisms · CPC title