Data deduplication in a block-based storage system
US-2016350325-A1 · Dec 1, 2016 · US
US10185504B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10185504-B1 |
| Application number | US-201514952232-A |
| Country | US |
| Kind code | B1 |
| Filing date | Nov 25, 2015 |
| Priority date | Nov 26, 2014 |
| Publication date | Jan 22, 2019 |
| Grant date | Jan 22, 2019 |
Official abstract text for this publication.
A method for reducing an amount of data transmitted during a backup process is described. The method may include receiving input data to insert into a rating hash table during the backup process. The method may further include selecting, based on a hash function, a bucket of the rating hash table in which the input data will be inserted, the bucket including a plurality of blocks. The method may also include, in response to determining that the input data has already been inserted in one of the plurality of blocks, increasing a rating corresponding to the one of the plurality of blocks by a popularity rating increment. The method may additionally include, in response to determining that the input data has not already been inserted in one of the plurality of blocks, determining a first block with a smallest rating from the plurality of blocks.
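The insertion path described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the class and method names, the use of SHA-256 to pick a bucket, the table dimensions, and the increment value of 1 are all assumptions. Writing the input into the lowest-rated block and resetting its rating follow the claim text rather than the abstract, and the reset value of 0 is also assumed.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional

POPULARITY_INCREMENT = 1  # assumed value; the abstract does not specify one
INITIAL_RATING = 0        # assumed value for a freshly inserted block


@dataclass
class Block:
    data: Optional[bytes] = None
    rating: int = INITIAL_RATING


class RatingHashTable:
    """Hypothetical fixed-size table: each bucket holds several rated blocks."""

    def __init__(self, num_buckets: int = 4, blocks_per_bucket: int = 2) -> None:
        self.buckets: List[List[Block]] = [
            [Block() for _ in range(blocks_per_bucket)]
            for _ in range(num_buckets)
        ]

    def _bucket_for(self, data: bytes) -> List[Block]:
        # Select a bucket from a hash of the input (SHA-256 is an assumption).
        digest = hashlib.sha256(data).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.buckets)
        return self.buckets[index]

    def insert(self, data: bytes) -> bool:
        """Insert data; return True if it was already present in its bucket."""
        bucket = self._bucket_for(data)
        for block in bucket:
            if block.data == data:
                # Already inserted: raise this block's rating.
                block.rating += POPULARITY_INCREMENT
                return True
        # Not present: find the block with the smallest rating and reuse it.
        victim = min(bucket, key=lambda b: b.rating)
        victim.data = data
        victim.rating = INITIAL_RATING
        return False
```

With a single bucket of two blocks, inserting `b"a"` twice, then `b"b"`, then `b"c"` leaves `b"a"` in place (its rating was raised by the repeat) and overwrites `b"b"`, which held the smallest rating.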
Opening claim text (preview).
What is claimed is:

1. A method for reducing an amount of data transmitted during a backup process, the method comprising: receiving input data to insert into a rating hash table during the backup process; selecting, based on a hash function, a bucket of the rating hash table in which the input data will be inserted, the bucket including a plurality of blocks; in response to determining that the input data has already been inserted in one of the plurality of blocks: increasing a rating corresponding to the one of the plurality of blocks by a popularity rating increment; in response to determining that the input data has not already been inserted in one of the plurality of blocks: determining a first block with a smallest rating from the plurality of blocks; inserting the input data in the first block with the smallest rating; in response to determining that the input data has already been inserted in one of the plurality of blocks: decreasing a rating corresponding to the other blocks of the plurality of blocks by an aging rating increment; and determining data that does not need to be transmitted from a client to a server using the backup process based on hash values having higher assigned ratings from popularity increment increases, wherein hash values from the client are rated and data associated with such hash values is selectively backed up to the server.

2. The method of claim 1, further comprising: setting a rating corresponding to the first block to an initial rating value.

3. The method of claim 1, further comprising: for each block of the plurality of blocks besides the first block, decreasing a corresponding rating by an aging rating increment.

4. The method of claim 1, wherein the input data was sent or requested by a cloud computing client being backed up in a virtualization platform.

5. The method of claim 1, wherein the input data was sent or requested by a cloud computing client being backed up in a virtualization platform, the rating hash table resides with the cloud computing client, and a reduced amount of data is transmitted during a compare operation of the backup process based on the rating hash table.

6. A computer program product residing on a computer readable storage medium having a plurality of instructions stored thereon, which, when executed by a processor, causes the processor to perform operations for reducing an amount of data transmitted during a backup process, the operations comprising: receiving input data to insert into a rating hash table during the backup process; selecting, based on a hash function, a bucket of the rating hash table in which the input data will be inserted, the bucket including a plurality of blocks; in response to determining that the input data has already been inserted in one of the plurality of blocks: increasing a rating corresponding to the one of the plurality of blocks by a popularity rating increment; in response to determining that the input data has not already been inserted in one of the plurality of blocks: determining a first block with a smallest rating from the plurality of blocks; inserting the input data in the first block; in response to determining that the input data has already been inserted in one of the plurality of blocks: decreasing a rating corresponding to the other blocks of the plurality of blocks by an aging rating increment; and determining data that does not need to be transmitted from a client to a server using the backup process based on hash values having higher assigned ratings from popularity increment increases, wherein hash values from the client are rated and data associated with such hash values is selectively backed up to the server.

7. The computer program product of claim 6, wherein the operations further comprise: setting a rating corresponding to the first block to an initial rating value.

8. The computer program product of claim 6, wherein the operations further comprise: for each block of the plurality of blocks besides the first block, decreasing a corresponding rating by an aging rating increment.

9. The computer program product of claim 6, wherein the operations further comprise: in response to determining that the input data has already been inserted in one of the plurality of blocks: decreasing a rating corresponding to the other blocks of the plurality of blocks by an aging rating increment.

10. The computer program product of claim 6, wherein the input data replaces data already in the first block upon being inserted.

11. The computer program product of claim 6, wherein the input data was sent or requested by a cloud computing client being backed up in a virtualization platform, the rating hash table resides with the cloud computing client, and a reduced amount of data is transmitted during a compare operation of the backup process based on the rating hash table.

12. A computing system for reducing an amount of data transmitted during a backup process, the computing system comprising one or more processors, wherein the one or more processors are configured to: receive input data to insert into a rating hash table during the backup process; select, based on a hash function, a bucket of the rating hash table in which the input data will be inserted, the bucket including a plurality of blocks; in response to determining that the input data has already been inserted in one of the plurality of blocks: increase a rating corresponding to the one of the plurality of blocks by a popularity rating increment; in response to determining that the input data has not already been inserted in one of the plurality of blocks: determine a first block with a smallest rating from the plurality of blocks; insert the input data in the first block; in response to determining that the input data has already been inserted in one of the plurality of blocks: decrease a rating corresponding to the other blocks of the plurality of blocks by an aging rating increment; and determine data that does not need to be transmitted from a client to a server using the backup process based on hash values having higher assigned ratings from popularity increment increases, wherein hash values from the client are rated and data associated with such hash values is selectively backed up to the server.

13. The computing system of claim 12, wherein the one or more processors are further configured to: set a rating corresponding to the first block to an initial rating value.

14. The computing system of claim 12, wherein the one or more processors are further configured to: for each block of the plurality of blocks besides the first block, decrease a corresponding rating by an aging rating increment.

15. The computing system of claim 12, wherein the one or more processors are further configured to: in response to determining that the input data has already been inserted in one of the plurality of blocks: decrease a rating corresponding to the other blocks of the plurality of blocks by an aging rating increment.

16. The computing system of claim 12, wherein the input data replaces data already in the first block upon being inserted.

17. The computing system of claim 12, wherein the input data was sent or requested by a cloud computing client being backed up in a virtualization platform, the rating hash table resides with the cloud computing client, and a reduced amount of data is transmitted during a compare operation of the backup process based on the rating hash table.

18. A system for reducing an amount of data
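The claims add two elements beyond the basic insertion path: an aging decrement applied to the other blocks in a bucket when one block is hit, and a decision about which data need not be transmitted, based on hash values with higher ratings. A rough sketch of one way to read those steps on a single bucket is shown here. The function names, the dictionary layout, the increment values, and in particular the fixed `SKIP_THRESHOLD` rule are assumptions made for illustration; the claims do not say how "higher assigned ratings" are turned into a transmit decision.

```python
from typing import Dict, List, Optional, Union

POPULARITY_INCREMENT = 2  # assumed value
AGING_INCREMENT = 1       # assumed value
INITIAL_RATING = 0        # assumed value
SKIP_THRESHOLD = 3        # hypothetical cutoff for "higher assigned ratings"

Entry = Dict[str, Union[Optional[str], int]]


def touch(bucket: List[Entry], block_hash: str) -> None:
    """Apply one lookup of block_hash to a fixed-size bucket of rated entries."""
    hit = next((e for e in bucket if e["hash"] == block_hash), None)
    if hit is not None:
        # Hit: reward the matching block and age every other block.
        hit["rating"] += POPULARITY_INCREMENT
        for other in bucket:
            if other is not hit:
                other["rating"] -= AGING_INCREMENT
        return
    # Miss: overwrite the lowest-rated block and give it the initial rating.
    victim = min(bucket, key=lambda e: e["rating"])
    victim["hash"] = block_hash
    victim["rating"] = INITIAL_RATING


def can_skip_transfer(bucket: List[Entry], block_hash: str) -> bool:
    """Assumed rule: skip sending data whose hash is rated at or above the cutoff."""
    return any(
        e["hash"] == block_hash and e["rating"] >= SKIP_THRESHOLD for e in bucket
    )
```

Under these assumed values, a hash seen three times in a row reaches a rating of 4 and would be treated as not needing transmission, while a hash seen only once stays at the initial rating and would still be sent.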
Management of blocks · CPC title
involving hashing techniques, e.g. inverted page tables · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
for networked environments · CPC title
the solution involving signatures · CPC title