Multiple deduplication domains in network storage system

US10169365B2 · US · B2

Patent metadata
Field | Value
Publication number | US-10169365-B2
Application number | US-201615059160-A
Country | US
Kind code | B2
Filing date | Mar 2, 2016
Priority date | Mar 2, 2016
Publication date | Jan 1, 2019
Grant date | Jan 1, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and computer programs are presented for deduplicating data in a storage device. One method includes an operation for identifying multiple deduplication domains for a storage system. A fingerprint index is created for each deduplication domain, where each data block stored in the storage system is associated with one of the plurality of deduplication domains. The method also includes operations for receiving a first data block at the storage system, and for identifying a first deduplication domain from the plurality of deduplication domains corresponding to the first data block. The first data block is deduplicated within the first deduplication domain utilizing a first fingerprint index associated with the first deduplication domain.
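
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes SHA-256 as the fingerprint function, an in-memory dict as each fingerprint index, a list as permanent storage, and a volume-to-domain mapping as the way a block's domain is identified. The names `DedupStore`, `volume_to_domain`, and `write` are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy block store with one fingerprint index per deduplication domain."""

    def __init__(self, volume_to_domain):
        # Assumption: a block's domain is found through the volume it belongs to.
        self.volume_to_domain = dict(volume_to_domain)
        # One fingerprint index per domain: fingerprint -> block location.
        self.indexes = {d: {} for d in set(self.volume_to_domain.values())}
        # Shared "permanent storage"; blocks of all domains are intermingled.
        self.blocks = []

    def write(self, volume, data):
        """Store a block; return (location, was_duplicate)."""
        domain = self.volume_to_domain[volume]
        index = self.indexes[domain]
        # Assumption: SHA-256 stands in for the fingerprint function.
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint in index:
            # Duplicate within this domain: reuse the existing block.
            return index[fingerprint], True
        # New to this domain: store the block and record its fingerprint.
        self.blocks.append(data)
        location = len(self.blocks) - 1
        index[fingerprint] = location
        return location, False


store = DedupStore({"vol1": "A", "vol2": "A", "vol3": "B"})
first = store.write("vol1", b"hello")   # new block in domain A
second = store.write("vol2", b"hello")  # duplicate inside domain A
third = store.write("vol3", b"hello")   # domain B keeps its own copy
```

Because lookups never cross domains, the same data written to volumes in two different domains is stored twice, once per domain.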

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: identifying, by a processor, a plurality of deduplication domains for a storage system; creating, by the processor, a fingerprint index for each deduplication domain, wherein each data block stored in the storage system is associated with one of the plurality of deduplication domains; receiving, by the processor, a first data block at the storage system; identifying, by the processor, a first deduplication domain from the plurality of deduplication domains corresponding to the first data block; deduplicating, by the processor, the first data block within the first deduplication domain based on a first fingerprint index associated with the first deduplication domain; determining, by the processor, whether a size of a global fingerprint buffer exceeds a predetermined threshold, the global fingerprint buffer including space for storing fingerprint buffers of the deduplication domains; and based on a determination that the size of the global fingerprint buffer exceeds the predetermined threshold, selecting, by the processor, a deduplication domain of the plurality of deduplication domains and updating the fingerprint index of the selected deduplication domain with fingerprint mappings stored in the fingerprint buffer of the selected deduplication domain.

2. The method as recited in claim 1, wherein deduplicating the first data block includes: determining a first fingerprint for the first data block; determining whether the first fingerprint is in the first fingerprint index; and storing the first data block in permanent storage based on a determination that the first fingerprint is not in the first fingerprint index.

3. The method as recited in claim 2, wherein deduplicating the first data block further includes: based on a determination that the first fingerprint is not in the first fingerprint index, adding a fingerprint mapping to a first fingerprint buffer of the first deduplication domain kept in a random access memory (RAM), the fingerprint mapping including the first fingerprint and a location of the first data block in the permanent storage.

4. The method as recited in claim 3, wherein selecting the deduplication domain for updating the fingerprint index further comprises selecting the deduplication domain with a highest ratio of a size of the fingerprint buffer in the global fingerprint buffer to a size of the fingerprint index.

5. The method as recited in claim 4, wherein updating the fingerprint index further includes: merging fingerprint mappings in RAM associated with the selected deduplication domain with the fingerprint index to create a new fingerprint index; and freeing from RAM the fingerprint mappings in RAM associated with the selected deduplication domain.

6. The method as recited in claim 1, wherein deduplicating the first data block further includes: determining a first fingerprint for the first data block; based on a determination that the first fingerprint is in the first fingerprint index, identifying a second data block associated with the first fingerprint in the first fingerprint index; and associating the first data block with the first fingerprint and the second data block, wherein the first data block is not added to the permanent storage.

7. The method as recited in claim 1, wherein each deduplication domain is defined to track duplicate data blocks within the deduplication domain to store one copy of the duplicate data blocks, wherein data blocks existing in more than one deduplication domain will have a separate copy stored in permanent storage for each of the deduplication domains where the data blocks exist.

8. The method as recited in claim 1, wherein identifying a first deduplication domain includes: determining a first volume of the storage system that includes the first data block; and determining the first deduplication domain as the deduplication domain associated with the first volume.

9. The method as recited in claim 1, wherein identifying a plurality of deduplication domains for a storage system includes: providing a user interface for receiving input from an administrator of the storage system, the input identifying one of the deduplication domains for each volume in the storage system.

10. The method as recited in claim 1, wherein the data blocks for the plurality of deduplication domains are stored intermingled within a permanent storage of the storage system.

11. A storage system comprising: a permanent storage for storing data blocks, wherein each data block stored in the permanent storage is associated with one of a plurality of deduplication domains; a memory for storing a fingerprint index for each deduplication domain; a processor to receive a first data block and identify a first deduplication domain from the plurality of deduplication domains corresponding to the first data block, wherein the processor is to deduplicate the first data block within the first deduplication domain based on a first fingerprint index associated with the first deduplication domain; a random access memory (RAM) for storing a global fingerprint buffer including a first fingerprint buffer on which is stored first fingerprint mappings, wherein the processor is further to: determine whether a size of the global fingerprint buffer exceeds a predetermined threshold; and based on a determination that the size of the global fingerprint buffer exceeds the predetermined threshold, select, by the processor, a deduplication domain of the plurality of deduplication domains and update the fingerprint index of the selected deduplication domain with fingerprint mappings stored in the fingerprint buffer of the selected deduplication domain.

12. The storage system as recited in claim 11, wherein the processor is to deduplicate the first data block by determining a first fingerprint for the first data block, determining whether the first fingerprint is in the first fingerprint index, and storing the first data block in the permanent storage based on the first fingerprint not being in the first fingerprint index.

13. The storage system as recited in claim 12, wherein based on the first fingerprint not being in the first fingerprint index, the processor is to add a fingerprint mapping to the global fingerprint buffer, the fingerprint mapping including the first fingerprint and a location of the first data block in the permanent storage.

14. The storage system as recited in claim 12, wherein based on the first fingerprint not being in the first fingerprint index, the processor is to add a fingerprint mapping to a buffer kept in the memory, the fingerprint mapping including the first fingerprint and a location of the first data block in the permanent storage.

15. The storage system as recited in claim 11, wherein the fingerprint index includes fingerprints of data blocks stored in the storage system, each fingerprint being mapped to one of the data blocks stored in the storage system.

16. A non-transitory computer-readable storage medium storing a computer program that when executed by a processor, cause the processor to: determine whether a size of a global fingerprint buffer exceeds a predetermined threshold, the global fingerprint buffer including a first fingerprint buffer storing first fingerprint mappings for a first fingerprint index and a second fingerprint buffer storing second fingerprint mappings for a second fingerprint index, the first fingerprint index being associated with a first deduplication domain and the second fingerprint index being associated with a second deduplication domain for
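
The buffer-flush step recited in the claims (a threshold check on the global fingerprint buffer, selection of the domain with the highest buffer-to-index ratio, then merge and free) can be sketched in Python as follows. This is only an illustration under stated assumptions: sizes are measured as entry counts, dicts stand in for both the RAM buffer and the stored index, and an empty index is treated as size 1 to avoid division by zero. The names `Domain`, `select_domain`, and `flush_if_needed` are hypothetical.

```python
class Domain:
    """One deduplication domain with its index and its in-RAM buffer."""

    def __init__(self, name):
        self.name = name
        self.index = {}   # stands in for the stored fingerprint index
        self.buffer = {}  # in-RAM fingerprint buffer: fingerprint -> location


def global_buffer_size(domains):
    # Assumption: size is measured as the number of buffered mappings.
    return sum(len(d.buffer) for d in domains)


def select_domain(domains):
    """Pick the domain with the highest buffer-size to index-size ratio."""
    candidates = [d for d in domains if d.buffer]
    if not candidates:
        return None
    # Assumption: an empty index counts as size 1 to avoid dividing by zero.
    return max(candidates, key=lambda d: len(d.buffer) / max(len(d.index), 1))


def flush_if_needed(domains, threshold):
    """Flush one domain's buffer when the global buffer exceeds the threshold."""
    if global_buffer_size(domains) <= threshold:
        return None
    chosen = select_domain(domains)
    if chosen is None:
        return None
    # Merge buffered mappings with the old index to create a new index.
    chosen.index = {**chosen.index, **chosen.buffer}
    # Free the buffered mappings from RAM.
    chosen.buffer = {}
    return chosen.name


a = Domain("A")
a.index = {i: i for i in range(100)}
a.buffer = {f"a{i}": i for i in range(10)}  # ratio 10 / 100
b = Domain("B")
b.index = {i: i for i in range(10)}
b.buffer = {f"b{i}": i for i in range(5)}   # ratio 5 / 10
flushed = flush_if_needed([a, b], threshold=12)
flushed_again = flush_if_needed([a, b], threshold=12)
```

In this run the global buffer holds 15 mappings against a threshold of 12, so domain B (ratio 0.5 versus 0.1 for A) is flushed. The second call finds only 10 buffered mappings and does nothing.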

Assignees

Inventors

Classifications

  • De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title

  • Indexing; Data structures therefor; Storage structures · CPC title

  • Distributed file systems · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10169365B2 cover?
Methods, systems, and computer programs are presented for deduplicating data in a storage device. One method includes an operation for identifying multiple deduplication domains for a storage system. A fingerprint index is created for each deduplication domain, where each data block stored in the storage system is associated with one of the plurality of deduplication domains. The method also in…
Who is the assignee on this patent?
Hewlett Packard Entpr Dev Lp
What technology area does this patent fall under?
Primary CPC classification G06F16/1748. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 1, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).