US9715434B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9715434-B1 |
| Application number | US-201113250678-A |
| Country | US |
| Kind code | B1 |
| Filing date | Sep 30, 2011 |
| Priority date | Sep 30, 2011 |
| Publication date | Jul 25, 2017 |
| Grant date | Jul 25, 2017 |
Official abstract text for this publication.
Techniques for data migration of a storage system are described herein. According to one embodiment, for at least one of segments of a file to be migrated from a source storage tier to a target storage tier, a fingerprint of the segment is transmitted to the target storage tier. In response to a response received from the target storage tier indicating that the segment has not been stored in the target tier based on the fingerprint, a storage space of the target tier estimated for migrating the file is incremented. One or more segments of the file that have not been stored in the target tier are migrated if the one or more segments of the file fit in the target storage tier based on the estimated storage space of the target tier.
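The estimate-then-migrate flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `TargetTier`, `fingerprint`, and `migrate_file` are hypothetical names, SHA-1 is an assumed fingerprint function, the network round trip is replaced by an in-memory lookup, and skipping fingerprints already seen in the same file is an assumption loosely modeled on the source candidate index of claim 1.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class TargetTier:
    """Hypothetical in-memory stand-in for the target storage tier."""
    capacity: int                               # total bytes the tier can hold
    stored: dict = field(default_factory=dict)  # fingerprint -> segment bytes

    def has_segment(self, fp: str) -> bool:
        # In the patent this is a query sent over a network.
        return fp in self.stored

    def free_space(self) -> int:
        return self.capacity - sum(len(s) for s in self.stored.values())

    def store(self, fp: str, segment: bytes) -> None:
        self.stored[fp] = segment


def fingerprint(segment: bytes) -> str:
    # Assumption: SHA-1 of the segment content serves as the fingerprint.
    return hashlib.sha1(segment).hexdigest()


def migrate_file(segments, target):
    """Estimate the space needed on the target, then migrate only if it fits.

    Returns (migrated, estimated_bytes).
    """
    estimated = 0
    missing = {}  # fingerprints already counted for this file
    for seg in segments:
        fp = fingerprint(seg)
        if fp in missing:
            continue  # already counted, do not query or count again
        if not target.has_segment(fp):
            estimated += len(seg)  # increment the estimate by the segment size
            missing[fp] = seg

    if estimated > target.free_space():
        return False, estimated  # file does not fit, nothing is sent

    for fp, seg in missing.items():
        target.store(fp, seg)  # send only segments the target lacks
    return True, estimated
```

Only fingerprints cross to the target during estimation, so a file that does not fit is rejected before any segment data is moved.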
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, comprising:
for each of a plurality of segments of a file to be migrated from a source storage tier to a target storage tier:
determining whether the segment exists in a source candidate index of the source storage tier;
in response to determining that the segment does not exist in the source candidate index: transmitting, by a candidate builder of the source storage tier, a fingerprint of the segment to the target storage tier over a network, and in response to a response received from the target storage tier indicating that the segment has not been stored in the target storage tier based on the fingerprint, incrementing an estimated storage space of the target tier estimated for migrating the file based on a size of the segment;
after all segments of the file have been processed, communicating with the target storage tier to determine whether the target storage tier has enough storage space to store the segments of the file based on the estimated storage space;
in response to determining that the estimated storage space of the target storage tier is sufficient to store the segments of the file: populating the source candidate index to indicate the one or more segments of the file, and migrating, by a migration module of the source storage tier, one or more segments of the file that have not been stored in the target tier,
wherein migrating one or more segments of the file occurs without scanning a logical namespace or file system of the source storage tier, wherein the segments are deduplicated segments, and wherein at least one of the segments is referenced by a plurality of files,
wherein the target storage tier is configured to match the fingerprint of each segment against a target index structure having information indicating segments that are currently stored in the target storage tier, wherein the target index structure comprises a bit array, wherein the fingerprint of a segment determines an offset to an entry of the bit array, and wherein a value stored in the entry is utilized to indicate whether the segment is currently stored in the target storage tier.

2. The method of claim 1, wherein the target index structure comprises a bloom filter.

3. The method of claim 1, further comprising constructing the target index structure, including scanning through all segments that are currently stored in the target storage tier, for each of the segments, performing a hash operation on the segment using a predetermined hash function, generating a hash value, and populating a value in an entry of the target index structure based on the hash value.

4. The method of claim 1, further comprising if the response indicates that the segment has not been stored in the target storage tier, populating an entry of a source index structure based on the fingerprint of the segment, wherein the source index structure includes a plurality of entries, each corresponding to a segment to be migrated to the target storage tier.

5. The method of claim 4, further comprising: determining whether an entry corresponding to the fingerprint in the source index structure includes a predetermined value; and incrementing a storage space of the target tier estimated for migrating the file if the entry does not include the predetermined value.

6. The method of claim 4, wherein segments indicated in the source index structure are migrated to the target storage tier in a bulk manner after all candidate files for migration have been identified and indicated in the source index structure.

7. The method of claim 4, wherein the source index structure comprises a bloom filter.

8. The method of claim 1, wherein transmitting a fingerprint of the segment to the target storage tier comprises transmitting fingerprints of a random sample of all segments associated with the file to the target storage tier.

9. A non-transitory computer-readable storage medium having instructions stored therein, which when executed by a computer, cause the computer to perform operations for data migration of a storage system, the operations comprising:
for each of a plurality of segments of a file to be migrated from a source storage tier to a target storage tier:
determining whether the segment exists in a source candidate index of the source storage tier;
in response to determining that the segment does not exist in the source candidate index: transmitting, by a candidate builder of the source storage tier, a fingerprint of the segment to the target storage tier over a network, and in response to a response received from the target storage tier indicating that the segment has not been stored in the target tier based on the fingerprint, incrementing an estimated storage space of the target storage tier estimated for migrating the file based on a size of the segment;
after all segments of the file have been processed, communicating with the target storage tier to determine whether the target storage tier has enough storage space to store the segments of the file based on the estimated storage space;
in response to determining that the estimated storage space of the target storage tier is sufficient to store the segments of the file: populating the source candidate index to indicate the one or more segments of the file, and migrating, by a migration module of the source storage tier, one or more segments of the file that have not been stored in the target tier,
wherein migrating one or more segments of the file occurs without scanning a logical namespace or file system of the source storage, wherein the segments are deduplicated segments, and wherein at least one of the segments is referenced by a plurality of files,
wherein the target storage tier is configured to match the fingerprint of each segment against a target index structure having information indicating segments that are currently stored in the target storage tier, wherein the target index structure comprises a bit array, wherein the fingerprint of a segment determines an offset to an entry of the bit array, and wherein a value stored in the entry is utilized to indicate whether the segment is currently stored in the target storage tier.

10. The computer-readable storage medium of claim 9 wherein the target index structure comprises a bloom filter.

11. The computer-readable storage medium of claim 9, wherein the operations further comprise constructing the target index structure, including scanning through all segments that are currently stored in the target storage tier, for each of the segments, performing a hash operation on the segment using a predetermined hash function, generating a hash value, and populating a value in an entry of the target index structure based on the hash value.

12. The computer-readable storage medium of claim 9, wherein the operations further comprise if the response indicates that the segment has not been stored in the target storage tier, populating an entry of a source index structure based on the fingerprint of the segment, wherein the source index structure includes a plurality of entries, each corresponding to a segment to be migrated to the target storage tier.

13. The computer-readable storage medium of claim 12, wherein the operations further comprise: determining whether an entry corresponding to the fingerprint in the source index structure includes a predetermined value; and incrementing a storage space of the target tier estimated for migrating the file if the entry does not include the predetermined value.

14. The computer-readable storage medium of claim 12, wherein segments indicated in the s
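As an illustration of the target index structure recited in claims 1 to 3, here is a minimal sketch of a bit array in which a fingerprint determines the offset of an entry, with an optional Bloom-filter-style variant that sets several bits per fingerprint. The names `BitArrayIndex` and `build_index`, the use of SHA-1 and SHA-256, and the default sizes are all assumptions for the example; the claims only require "a predetermined hash function".

```python
import hashlib


class BitArrayIndex:
    """Hypothetical bit-array index over segment fingerprints."""

    def __init__(self, num_bits=1 << 16, num_hashes=1):
        self.num_bits = num_bits
        self.num_hashes = num_hashes  # 1 = plain bit array, >1 = Bloom filter
        self.bits = bytearray((num_bits + 7) // 8)

    def _offsets(self, fp: bytes):
        # Derive one bit offset per hash by salting the fingerprint.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + fp).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, fp: bytes) -> None:
        for off in self._offsets(fp):
            self.bits[off // 8] |= 1 << (off % 8)

    def might_contain(self, fp: bytes) -> bool:
        # False means definitely absent; True may be a false positive.
        return all(self.bits[off // 8] & (1 << (off % 8))
                   for off in self._offsets(fp))


def build_index(stored_segments, **kwargs):
    """Scan every stored segment, hash it, and set the matching entries."""
    index = BitArrayIndex(**kwargs)
    for segment in stored_segments:
        index.add(hashlib.sha1(segment).digest())
    return index
```

A structure like this can report a segment as present when it is not (two fingerprints can map to the same bits), so an estimate built on it could undercount the space needed. It never reports a stored segment as absent.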
De-duplication techniques · CPC title
Physics · mapped topic
using de-duplication of the data · CPC title