Combining hardware and software approaches for inline data compression
US-9985649-B1 · May 29, 2018 · US
US10585604B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10585604-B2 |
| Application number | US-201815966584-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 30, 2018 |
| Priority date | Apr 30, 2018 |
| Publication date | Mar 10, 2020 |
| Grant date | Mar 10, 2020 |
Official abstract text for this publication.
Embodiments are directed to techniques for simplifying and automating the process of transitioning a storage object to use inline compression either on the same machine or migrated to a new machine. This may be accomplished by determining the raw compressibility of the data of a storage object, estimating the interaction between the compressibility of the data and a structure of the inline compression feature, and automatically performing the upgrade or migration if the expected compression savings exceeds a threshold. Some embodiments further speed the process and decrease the resources by determining the raw compressibility through sampling. Embodiments are directed to a method, apparatus, system, and computer program product for performing these techniques.
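The sampling and threshold steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `zlib` stands in for whatever compressor the inline compression feature uses, the 8 KiB `BLOCK_SIZE` is an assumed value, and the function names `sample_blocks`, `compressibility_metric`, and `should_deploy` are hypothetical. The default threshold of 1.5 is the minimum value mentioned in claim 10.

```python
import random
import zlib

BLOCK_SIZE = 8192  # assumed block size in bytes; the text does not fix one


def sample_blocks(data: bytes, sample_size: int, seed: int = 0) -> list[bytes]:
    """Split data into fixed-size blocks and pick a random subset of them."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if not blocks:
        raise ValueError("storage object holds no data")
    rng = random.Random(seed)
    return rng.sample(blocks, min(sample_size, len(blocks)))


def compressibility_metric(blocks: list[bytes]) -> float:
    """Trial-compress each block; return total uncompressed / total compressed size."""
    raw = sum(len(b) for b in blocks)
    packed = sum(len(zlib.compress(b)) for b in blocks)  # zlib is a stand-in codec
    return raw / packed


def should_deploy(overall_ratio: float, threshold: float = 1.5) -> bool:
    """True only when the expected ratio strictly exceeds the threshold."""
    return overall_ratio > threshold


# Example: estimate the raw compressibility of repetitive data from 4 sampled blocks.
data = b"inline compression " * 4000
metric = compressibility_metric(sample_blocks(data, sample_size=4))
```

The metric returned here is the raw block-level compressibility only. The value passed to `should_deploy` would be the overall ratio obtained after accounting for the overhead of the inline compression feature, as the claims describe.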
Opening claim text (preview).
What is claimed is:
1. A method, performed by a computing device, of selectively deploying an inline compression feature, the method comprising: scanning preexisting user data stored within a storage object on the computing device to yield a compressibility metric that indicates a representative block-level compressibility of the preexisting user data, the scanning including performing trial compression on blocks of the preexisting user data and generating the compressibility metric based on a compressed size of the blocks and an uncompressed size of the blocks; calculating an overall compression ratio for the storage object based on (i) an overhead parameter specific to the inline compression feature and (ii) the compressibility metric; performing a comparison operation between the overall compression ratio and a threshold minimum value, the comparison operation configured to: (1) produce a first result in response to the overall compression ratio exceeding the threshold minimum value, producing the first result causing the computing device to implement the inline compression feature in connection with the storage object; and (2) produce a second result in response to the threshold minimum value exceeding the overall compression ratio, producing the second result causing the computing device to refrain from implementing the inline compression feature in connection with the storage object; and in response to the comparison operation producing the first result, implementing the inline compression feature on one of the storage object on the computing device and a migrated version of the storage object on a remote computing device.
2. The method of claim 1 wherein performing trial compression on blocks of the preexisting user data includes performing the trial compression on a representative subset of all blocks of the preexisting user data stored within the storage object.
3. The method of claim 2 wherein performing the trial compression on the representative subset of all blocks of the preexisting user data includes selecting the blocks of the representative subset randomly.
4. The method of claim 2 wherein performing the trial compression on the representative subset of all blocks of the preexisting user data includes, for each of a plurality of different types of data stored within the storage object, selecting a respective number of data blocks embodying that type of data in proportion to a respective prevalence of that type of data within the storage object.
5. The method of claim 4 wherein calculating the overall compression ratio includes dividing a product of the maximum group size and the compressibility metric by a sum of the maximum group size and the compressibility metric.
6. The method of claim 2, wherein the inline compression feature is configured to group a first number of user data blocks together and to store compressed forms of the first number of user data blocks within a second number of data blocks, the second number being smaller than the first number, the inline compression feature being further configured to select as the first number an integer less than or equal to a maximum group size assigned to the inline compression feature; and wherein, when calculating the overall compression ratio, the overhead parameter specific to the inline compression feature is defined to be equal to the maximum group size assigned to the inline compression feature.
7. The method of claim 2 wherein calculating the overall compression ratio includes dividing the compressibility metric minus 1 by the overhead parameter specific to the inline compression feature.
8. The method of claim 7, wherein the inline compression feature is configured to group a first number of user data blocks together and to store compressed forms of the first number of user data blocks within a second integer number of data blocks, the second number being smaller than the first number, the inline compression feature being further configured to select as the first number an integer less than or equal to a maximum group size assigned to the inline compression feature; and wherein calculating the overall compression ratio further includes constraining the overall compression ratio to not exceed the maximum group size.
9. The method of claim 7 wherein the method further comprises specifying the overhead parameter for the inline compression feature by: scanning a plurality of other storage objects $\{S_1, S_2, \ldots, S_n\}$ on which the inline compression feature is already implemented to obtain, for each other storage object, $S_i$, of the set of other storage objects: a compressibility metric, $R_i$, that indicates a representative block-level compressibility of user data of that other storage object; and an actual overall compression ratio, $R'_i$, of that other storage object resulting from the inline compression feature; and calculating the overhead parameter for the inline compression feature by fitting $F$ to the equation $F = (R_i - 1)/(R'_i - 1)$ for the obtained values $R_i$ and $R'_i$ and setting the overhead parameter to equal $F$.
10. The method of claim 2 wherein, when performing the comparison operation between the overall compression ratio and the threshold minimum value, the threshold minimum value is defined to be at least 1.5.
11. The method of claim 1 wherein implementing the inline compression feature on one of the storage object and a migrated version of the storage object includes implementing the inline compression feature on the storage object by: receiving a plurality of new user data blocks from a user directed at the storage object; grouping a first number of the received plurality of new user data blocks together, the inline compression feature being configured to select as the first number an integer less than or equal to a maximum group size assigned to the inline compression feature; storing compressed forms of the first number of new user data blocks within a second integer number of new data blocks of the storage object, the second number being smaller than the first number.
12. The method of claim 11 wherein implementing the inline compression feature on the storage object further includes storing compressed versions of the blocks of the preexisting user data in place of the blocks of the preexisting user data within the storage object.
13. The method of claim 1 wherein implementing the inline compression feature on one of the storage object and a migrated version of the storage object includes implementing the inline compression feature on the migrated storage on the remote computing device object by: directing the remote computing device to implement the inline compression feature on a remote storage object of the remote computing device; sending blocks of the preexisting user data from the storage object on the computing device to the remote storage object of the remote computing device, the remote computing device being configured to repeatedly: group a first number of the blocks of the preexisting user data together, the inline compression feature being configured to select as the first number an integer less than or equal to a maximum group size assigned to the inline compression feature; store compressed forms of the first number of the blocks of the preexisting user data within a second integer number of new data blocks of the remote storage object, the second number being smaller than the first number.
14. A computer program product comprising a non-transitory computer-readable storage medium storing a set of instructions, which, when executed by a computing device, cause the computing device to selectively deploy an
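The ratio calculations in claims 5 and 7 through 9 can be worked through in a short Python sketch. This is one reading of the claim language, not a definitive implementation. The function names are hypothetical. The form `1 + (R - 1) / F` is inferred by rearranging the relation `F = (R - 1) / (R' - 1)` from claim 9, since claim 7 only states that the calculation includes the division. The claims do not name a fitting method, so the least-squares fit through the origin is an assumption.

```python
def ratio_from_group_size(metric: float, max_group: int) -> float:
    """Claim 5 form: product of group size and metric over their sum."""
    return (max_group * metric) / (max_group + metric)


def ratio_from_overhead(metric: float, overhead: float, max_group: int) -> float:
    """Claims 7 and 8 form: 1 + (R - 1) / F, capped at the maximum group size.

    The leading "1 +" is inferred from the claim 9 relation, not stated in claim 7.
    """
    return min(1.0 + (metric - 1.0) / overhead, float(max_group))


def fit_overhead(metrics: list[float], actual_ratios: list[float]) -> float:
    """Fit F in (R_i - 1) = F * (R'_i - 1) by least squares through the origin.

    The fitting method is an assumption; the claim only says F is fitted.
    """
    xs = [actual - 1.0 for actual in actual_ratios]  # R'_i - 1
    ys = [metric - 1.0 for metric in metrics]        # R_i - 1
    denom = sum(x * x for x in xs)
    if denom == 0:
        raise ValueError("need at least one object with an actual ratio other than 1")
    return sum(x * y for x, y in zip(xs, ys)) / denom


# Example: two already-compressed objects with metrics 3 and 5 achieved ratios 2 and 3.
overhead = fit_overhead([3.0, 5.0], [2.0, 3.0])
estimate = ratio_from_overhead(3.0, overhead, max_group=12)
```

With these sample numbers each object satisfies the relation exactly with F = 2, so a new object whose metric is 3 is estimated at an overall ratio of 2.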
Migration mechanisms · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
Management of files · CPC title
Configuration or reconfiguration of storage systems · CPC title
Reducing size or complexity of storage systems · CPC title