Storage efficiency driven migration
US-11561714-B1 · Jan 24, 2023 · US
US12293102B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12293102-B2 |
| Application number | US-202318215414-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 28, 2023 |
| Priority date | Dec 22, 2022 |
| Publication date | May 6, 2025 |
| Grant date | May 6, 2025 |
Official abstract text for this publication.
Techniques for de-duplicating data involve: determining a target physical block in a first storage device. The techniques further involve: determining a compression ratio of a target data block in a plurality of data blocks to be transferred. The techniques further involve: determining a target hash value of the target data block in response to the compression ratio being lower than a threshold compression ratio. The techniques further involve: determining a de-duplication operation for the target data block based on the target hash value and a de-duplication hash table, the de-duplication hash table storing hash values of data blocks that have been transferred from the first storage device to the second storage device. Accordingly, the amount of data that needs to be transferred can be reduced, and the storage space of the storage devices can be improved, thus increasing the resource utilization and improving the user experience.
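The per-block decision described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not taken from the patent: the function names `compression_ratio` and `plan_transfer`, the value of `THRESHOLD_RATIO`, the use of SHA-256 as the hash, and the use of zlib to estimate compressibility (claim 4 instead reads the ratio from block metadata). The sketch also assumes the compression ratio is defined as original size divided by compressed size, so a low ratio means the block compresses poorly.

```python
import hashlib
import zlib

THRESHOLD_RATIO = 2.0  # hypothetical threshold; the patent does not give a value


def compression_ratio(block: bytes) -> float:
    """Assumed definition: original size / compressed size (higher = more compressible)."""
    return len(block) / len(zlib.compress(block))


def plan_transfer(blocks, threshold=THRESHOLD_RATIO):
    """Return one (action, index) decision per block, in order.

    Blocks at or above the threshold ratio are transferred without hashing.
    Blocks below it are hashed and looked up in the de-duplication table,
    which holds hashes of blocks already transferred to the second device.
    """
    dedup_table = set()
    decisions = []
    for i, block in enumerate(blocks):
        if compression_ratio(block) >= threshold:
            decisions.append(("transfer", i))
            continue
        digest = hashlib.sha256(block).digest()
        if digest in dedup_table:
            # Same content was already transferred: de-duplicate instead of copying.
            decisions.append(("dedup", i))
        else:
            # First sighting: transfer the block and remember its hash.
            dedup_table.add(digest)
            decisions.append(("transfer_and_record", i))
    return decisions
```

In this sketch only poorly compressible blocks pay the cost of hashing, which matches the ordering in the abstract: the compression ratio is checked first, and the hash is computed only when the ratio falls below the threshold.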
Opening claim text (preview).
The invention claimed is:

1. A method for de-duplicating data, comprising: determining a target physical block in a first storage device, a plurality of data blocks in the target physical block being to be transferred to a second storage device; determining a compression ratio of a target data block in the plurality of data blocks; determining a target hash value of the target data block in response to the compression ratio being lower than a threshold compression ratio; and determining a de-duplication operation for the target data block based on the target hash value and a de-duplication hash table, the de-duplication hash table storing hash values of data blocks that have been transferred from the first storage device to the second storage device; wherein determining the de-duplication operation comprises: determining from a plurality of logically contiguous data blocks a group of logically contiguous data blocks starting from the target data block, hash values of the group of logically contiguous data blocks hitting data blocks in the de-duplication hash table that are located in contiguous physical space; determining whether the number of data blocks in the group of logically contiguous data blocks exceeds a threshold number; and de-duplicating the group of logically contiguous data blocks in response to the number exceeding the threshold number.

2. The method according to claim 1, wherein determining the target physical block comprises: determining a heat of a candidate physical block in the first storage device, the heat indicating how frequently the candidate physical block is accessed; and determining the candidate physical block as the target physical block in response to the heat being less than a threshold heat.

3. The method according to claim 1, wherein determining the target physical block comprises: determining a storage density of a candidate physical block in the first storage device, the storage density indicating the extent to which the candidate physical block is utilized; and determining the candidate physical block as the target physical block in response to the storage density being lower than a threshold storage density.

4. The method according to claim 1, wherein determining the compression ratio comprises: acquiring metadata of the target data block; and determining the compression ratio based on the metadata.

5. The method according to claim 1, wherein determining the de-duplication operation comprises: determining whether the target hash value exists in the de-duplication hash table; and de-duplicating the target data block in response to the target hash value existing in the de-duplication hash table.

6. The method according to claim 5, wherein determining the de-duplication operation further comprises: transferring the target data block to the second storage device in response to the hash value not existing in the de-duplication hash table; and adding the target hash value to the de-duplication hash table.

7. The method according to claim 1, wherein determining the de-duplication operation further comprises: determining the plurality of logically contiguous data blocks based on the target data block, the plurality of logically contiguous data blocks taking the target data block as a starting data block; and determining hash values of other data blocks following the target data block in the plurality of logically contiguous data blocks.

8. The method according to claim 7, wherein determining the de-duplication operation further comprises: determining, in response to the number not exceeding the threshold number, whether the target hash value exists in the de-duplication hash table; and de-duplicating the target data block in response to the target hash value existing in the de-duplication hash table.

9. The method according to claim 1, wherein the threshold compression ratio is a first threshold compression ratio, and the method further comprises: transferring the target data block to the second storage device in response to the compression ratio being greater than or equal to the first threshold compression ratio.

10. The method according to claim 9, wherein transferring the target data block to the second storage device comprises: generating a group of data blocks comprising the target data block; compressing the group of data blocks as a whole to determine a group compression ratio; determining whether the group compression ratio is greater than a second threshold compression ratio; and transferring the compressed group of data blocks to the second storage device in response to the group compression ratio being greater than the second threshold compression ratio.

11. The method according to claim 10, further comprising: decompressing the compressed group of data blocks in response to the group compression ratio being less than or equal to the second threshold compression ratio; compressing the target data block in the decompressed group of data blocks alone; and transferring the compressed target data block to the second storage device.

12. The method according to claim 1, wherein the first storage device has a shorter device access time than the second storage device.

13. An electronic device, comprising: at least one processor; and a memory coupled to the at least one processor and having instructions stored therein, wherein the instructions, when executed by the at least one processor, cause the device to perform actions comprising: determining a target physical block in a first storage device, a plurality of data blocks in the target physical block being to be transferred to a second storage device; determining a compression ratio of a target data block in the plurality of data blocks; determining a target hash value of the target data block in response to the compression ratio being lower than a threshold compression ratio; and determining a de-duplication operation for the target data block based on the target hash value and a de-duplication hash table, the de-duplication hash table storing hash values of data blocks that have been transferred from the first storage device to the second storage device; wherein determining the de-duplication operation comprises: determining from a plurality of logically contiguous data blocks a group of logically contiguous data blocks starting from the target data block, hash values of the group of logically contiguous data blocks hitting data blocks in the de-duplication hash table that are located in contiguous physical space; determining whether the number of data blocks in the group of logically contiguous data blocks exceeds a threshold number; and de-duplicating the group of logically contiguous data blocks in response to the number exceeding the threshold number.

14. The electronic device according to claim 13, wherein determining the target physical block comprises: determining a heat of a candidate physical block in the first storage device, the heat indicating how frequently the candidate physical block is accessed; and determining the candidate physical block as the target physical block in response to the heat being less than a threshold heat.

15. The electronic device according to claim 13, wherein determining the target physical block comprises: determining a storage density of a candidate physical block in the first storage device, the storage density indicating the extent to which the candidate physical block is utilized; and determining the candidate physical block as the target physical block in response to the storage density being lower than a
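
The contiguous-group rule in claim 1, together with the single-block fallback in claim 8, can be sketched as follows. This is an illustrative reading and not the patented implementation. It assumes the de-duplication hash table maps each hash to an integer physical address on the second storage device, and that "contiguous physical space" means consecutive addresses. The names `sha`, `contiguous_hit_run`, and `dedup_decision` are hypothetical, and SHA-256 is a stand-in hash.

```python
import hashlib


def sha(block: bytes) -> bytes:
    """Stand-in hash function (SHA-256); the claims do not name one."""
    return hashlib.sha256(block).digest()


def contiguous_hit_run(blocks, start, table):
    """Count blocks from `start` whose hashes hit `table` at consecutive addresses.

    `table` maps hash -> physical address (assumed to be an int) of a block
    already transferred to the second storage device.
    """
    run = 0
    prev_addr = None
    for block in blocks[start:]:
        addr = table.get(sha(block))
        if addr is None:
            break  # hash miss ends the run
        if prev_addr is not None and addr != prev_addr + 1:
            break  # hit, but not physically adjacent to the previous hit
        prev_addr = addr
        run += 1
    return run


def dedup_decision(blocks, start, table, threshold_number):
    """Choose an action for the target block at index `start`."""
    run = contiguous_hit_run(blocks, start, table)
    if run > threshold_number:
        return ("dedup_group", run)   # claim 1: group exceeds the threshold number
    if sha(blocks[start]) in table:
        return ("dedup_single", 1)    # claim 8: fall back to the target block alone
    return ("transfer", 1)            # claim 6: no hit, so transfer the block
```

For example, with table entries for `b"a"`, `b"b"`, and `b"c"` at addresses 10, 11, and 12, a logical sequence `a, b, c` starting at index 0 yields a run of 3, so a threshold number of 2 de-duplicates the whole group, while a threshold number of 3 falls back to de-duplicating only the first block.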
Saving storage space on storage systems · CPC title
Plurality of storage devices · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
De-duplication techniques · CPC title
Hybrid storage device · CPC title