System and method for improved placement of blocks in a deduplication-erasure code environment

US9298386B2 · US · B2

Patent metadata
Publication number: US-9298386-B2
Application number: US-201313975152-A
Country: US
Kind code: B2
Filing date: Aug 23, 2013
Priority date: Aug 23, 2013
Publication date: Mar 29, 2016
Grant date: Mar 29, 2016

Abstract

Official abstract text for this publication.

In one embodiment, a method includes receiving a block set including one or more blocks, generating a hash value for each block in the block set, and determining whether physical blocks stored on computer readable storage media are duplicates of any block in the block set. For each block in the block set having a duplicate, the method includes mapping the block to the duplicate when the duplicate is on one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto, and writing the block to one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto when the duplicate is stored on a computer readable storage media that has another block in the block set written and/or mapped thereto, and map the duplicate to the written block.
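The placement rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Medium` class, the `place_block_set` function, the use of SHA-256, and the "first unused medium" choice are all assumptions made for illustration. The sketch only shows the invariant the abstract describes, namely that no two blocks of one block set end up written or mapped to the same medium.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Medium:
    """Hypothetical storage medium: physical blocks keyed by their hash."""
    name: str
    blocks: dict = field(default_factory=dict)  # hash -> payload


def block_hash(data: bytes) -> str:
    # SHA-256 is an assumption; the abstract only says "a hash value".
    return hashlib.sha256(data).hexdigest()


def place_block_set(block_set, media):
    """Return {block index: (medium name, action)}, one medium per block."""
    used = set()  # media that already hold a block of this set
    placement = {}
    for i, data in enumerate(block_set):
        h = block_hash(data)
        # Media that already store a physical block with a matching hash.
        holders = [m for m in media if h in m.blocks]
        free_holder = next((m for m in holders if m.name not in used), None)
        if free_holder is not None:
            # The duplicate sits on a medium unused by this set: map to it.
            used.add(free_holder.name)
            placement[i] = (free_holder.name, "mapped")
            continue
        # Either no duplicate exists, or every duplicate is on a medium that
        # this set already uses: write a copy to an unused medium instead.
        target = next((m for m in media if m.name not in used), None)
        if target is None:
            raise RuntimeError("not enough media for the block set")
        # Storing the copy under the same hash loosely stands in for
        # "map the duplicate to the written block" (a simplification).
        target.blocks[h] = data
        used.add(target.name)
        placement[i] = (target.name, "written-duplicate" if holders else "written")
    return placement


# Usage: blocks A and B already exist, both on medium d0.
media = [Medium("d0"), Medium("d1"), Medium("d2")]
media[0].blocks[block_hash(b"A")] = b"A"
media[0].blocks[block_hash(b"B")] = b"B"
print(place_block_set([b"A", b"B", b"C"], media))
```

In this example, A is mapped to its existing copy on d0, B is rewritten to d1 because its only duplicate shares d0 with A, and C is written to d2. A real system would pick the target medium more carefully (the claims mention an affinity value for this), which the sketch does not model.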

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving a block set comprising one or more blocks, wherein the block set is referenced by an application block; generating a hash value for each of the blocks in the block set; determining whether physical blocks stored on computer readable storage media are duplicates of any of the blocks in the block set, wherein a duplicate physical block is a physical block that has a hash value matching a generated hash value; and for each block in the block set that has a duplicate thereof: map the block to the duplicate thereof when the duplicate is on one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto, and write the block to one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto when the duplicate of the block is stored on a computer readable storage media that has another block in the block set written and/or mapped thereto, and map the duplicate to the written block.

2. The method of claim 1, further comprising: calculating an affinity value for each block in the block set that does not have a duplicate thereof; and assigning each block in the block set that does not have a duplicate thereof to one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto based on the calculated affinity value.

3. The method of claim 2, further comprising: comparing the calculated affinity value to an affinity value table, wherein the affinity value table comprises one or more affinity values associated with each of the computer readable storage media.

4. The method of claim 1, wherein determining whether the stored physical blocks are duplicates of any of the blocks in the block set comprises referencing a mapping table.

5. The method of claim 4, wherein the mapping table comprises information associated with the stored physical blocks, the information comprising: a status of each physical block, the status indicating whether the physical block is allocated or free, a hash value of each physical block, a description of any application blocks referring to each physical block, a storage medium identifier (ID), and a physical block identifier (ID) for each physical block describing a location on the computer readable storage medium.

6. The method of claim 1, wherein the computer readable storage media comprises hard disk media.

7. The method of claim 1, wherein the blocks in the block set comprise at least one data block and at least one parity block.

8. A system, comprising: a plurality of computer readable storage media; and a deduplication engine, the deduplication engine comprising a processor and logic integrated with and/or executable by the processor, the logic being configured to cause the processor to: receive a block set comprising one or more blocks; generate a hash value for each of the blocks in the block set; determine whether physical blocks stored on computer readable storage media are duplicates of any of the blocks in the block set, wherein a duplicate physical block is a physical block that has a hash value matching a generated hash value; for each block in the block set that does not have a duplicate thereof: write the block to one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto; and for each block in the block set that has a duplicate thereof: map the block to the duplicate thereof when the duplicate is on one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto, and write the block to one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto when the duplicate of the block is stored on a computer readable storage media that has another block in the block set written and/or mapped thereto, and map the duplicate to the written block.

9. The system of claim 8, wherein the logic is further configured to: calculate an affinity value for each block in the block set that does not have a duplicate thereof; and assign the block to the one of the computer readable storage media that does not have any block in the block set written and/or mapped thereto based on the calculated affinity value.

10. The system of claim 8, wherein the logic is further configured to reference a mapping table to compare hash values of stored physical blocks with the generated hash values.

11. The system of claim 10, wherein the mapping table comprises information associated with the stored physical blocks, the information comprising: a status of each physical block, the status indicating whether the physical block is allocated or free, a hash value of each physical block, a description of any application blocks referring to each physical block, a storage medium identifier (ID), and a physical block identifier (ID) for each physical block describing a location on the computer readable storage medium.

12. The system of claim 8, wherein the computer readable storage media comprises hard disk media.

13. The system of claim 8, wherein the blocks in the block set comprise at least one data block and at least one parity block.

14. The system of claim 8, wherein the block set is referenced by an application block.

15. A method comprising: obtaining information about a block set, wherein the information comprises one or more data blocks and one or more parity blocks in the block set; generating a hash value for each of the blocks in the block set; creating a non-duplicate list, a duplicate list, and a mapping list; determining whether physical blocks stored on computer readable storage media are duplicates of any of the blocks in the block set, wherein a duplicate physical block is a physical block that has a hash value matching a generated hash value; for each block in the block set that does not have a duplicate physical block: add the block to the non-duplicate list; for each block in the block set that has a duplicate physical block: add the duplicate physical block and the computer readable storage medium on which the duplicate physical block is stored to the mapping list, determine whether the computer readable storage medium on which the duplicate physical block is stored is in the duplicate list, and in response to determining that the computer readable storage medium on which the duplicate physical block is stored is not in the duplicate list, add the block, the duplicate physical block and the computer readable storage medium on which the duplicate physical block is stored to the duplicate list; and outputting the non-duplicate list and the duplicate list.

16. The method of claim 15, wherein for each block in the non-duplicate list, the method further comprises: calculating an affinity value for the block; comparing the calculated affinity value to an affinity value table, wherein the affinity value table comprises one or more affinity values associated with each of the computer readable storage media; and assigning the block to a free physical block stored on one of the computer readable storage media based on the calculated affinity value, wherein the one of the computer readable storage media that does not have any other block in the block set written and/or mapped thereto.

17. The method of claim 16, wherein for each block in the non-duplicate list, the method further co
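The list-building steps of claim 15, together with the mapping-table fields named in claims 5 and 11, can be sketched roughly as follows. The `PhysicalBlock` class, the `classify` function, the SHA-256 hash, and the one-row-per-hash lookup are assumptions made for this illustration; the claims do not prescribe any of them.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PhysicalBlock:
    """Hypothetical mapping-table row with the fields listed in claims 5 and 11."""
    status: str        # "allocated" or "free"
    hash_value: str    # hash of the stored physical block
    medium_id: str     # storage medium identifier
    block_id: int      # location of the block on that medium
    app_blocks: list = field(default_factory=list)  # referring application blocks


def classify(block_set, mapping_table):
    """Build the non-duplicate, duplicate, and mapping lists of claim 15."""
    # Simplification: keep a single allocated row per hash value.
    by_hash = {r.hash_value: r for r in mapping_table if r.status == "allocated"}
    non_duplicate, duplicate, mapping = [], [], []
    media_in_duplicate = set()  # media already present in the duplicate list
    for data in block_set:
        h = hashlib.sha256(data).hexdigest()  # hash choice is an assumption
        row = by_hash.get(h)
        if row is None:
            # No stored physical block has a matching hash.
            non_duplicate.append(data)
            continue
        # Every duplicate and its medium go into the mapping list.
        mapping.append((row.block_id, row.medium_id))
        # Only the first block per medium enters the duplicate list.
        if row.medium_id not in media_in_duplicate:
            media_in_duplicate.add(row.medium_id)
            duplicate.append((data, row.block_id, row.medium_id))
    return non_duplicate, duplicate, mapping


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# Usage: A and B are both stored on medium d0; C is not stored anywhere.
table = [
    PhysicalBlock("allocated", sha(b"A"), "d0", 0),
    PhysicalBlock("allocated", sha(b"B"), "d0", 1),
    PhysicalBlock("free", "", "d1", 0),
]
non_dup, dup, mapping = classify([b"A", b"B", b"C"], table)
print(non_dup, dup, mapping)
```

Here C lands in the non-duplicate list, both A and B appear in the mapping list, and only A enters the duplicate list because B's duplicate is on a medium that is already listed. The sketch returns all three lists for inspection, whereas claim 15 recites outputting the non-duplicate list and the duplicate list.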

Assignees

Globalfoundries Inc

Inventors

Classifications

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

  • G06F3/0641 (Primary): De-duplication techniques

  • in relation to data integrity, e.g. data losses, bit errors

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9298386B2 cover?
In one embodiment, a method includes receiving a block set including one or more blocks, generating a hash value for each block in the block set, and determining whether physical blocks stored on computer readable storage media are duplicates of any block in the block set. For each block in the block set having a duplicate, the method includes mapping the block to the duplicate when the duplica…
Who is the assignee on this patent?
Globalfoundries Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0641. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 29, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).