High availability distributed deduplicated storage system

US9633033B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9633033-B2
Application number  US-201414152509-A
Country             US
Kind code           B2
Filing date         Jan 10, 2014
Priority date       Jan 11, 2013
Publication date    Apr 25, 2017
Grant date          Apr 25, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A high availability distributed, deduplicated storage system according to certain embodiments is arranged to include multiple deduplication database media agents. The deduplication database media agents store signatures of data blocks stored in secondary storage. In addition, the deduplication database media agents are configured as failover deduplication database media agents in the event that one of the deduplication database media agents becomes unavailable.
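
As a rough illustration of the arrangement the abstract describes, the sketch below spreads block signatures across several deduplication database media agents and gives each agent a designated failover peer. The abstract does not name a signature function, a distribution rule, or a failover layout, so the use of SHA-256, the modulo-based assignment, the ring-shaped failover order, and the names `signature`, `assigned_agent`, `failover_agent`, and `agent_to_query` are all assumptions made for this example.

```python
import hashlib


def signature(block: bytes) -> str:
    """Signature of a data block (SHA-256 is an assumption; the source names no function)."""
    return hashlib.sha256(block).hexdigest()


def assigned_agent(sig: str, agents: list) -> str:
    """Hypothetical distribution rule: signature value modulo the number of agents."""
    return agents[int(sig, 16) % len(agents)]


def failover_agent(agent: str, agents: list) -> str:
    """Hypothetical ring layout: each agent is backed by its predecessor in the list."""
    return agents[(agents.index(agent) - 1) % len(agents)]


def agent_to_query(block: bytes, agents: list, unavailable: set) -> str:
    """Pick the agent that should answer a signature lookup for this block."""
    owner = assigned_agent(signature(block), agents)
    # Fall back to the designated failover peer when the owner is unavailable.
    return failover_agent(owner, agents) if owner in unavailable else owner


agents = ["ddb-1", "ddb-2", "ddb-3"]
owner = assigned_agent(signature(b"example block"), agents)
print(owner, agent_to_query(b"example block", agents, {owner}))
```

With three agents, this layout makes `ddb-1` the failover for `ddb-2`, `ddb-3` the failover for `ddb-1`, and `ddb-2` the failover for `ddb-3`, so no agent backs itself as long as there are at least two agents.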

First claim

Opening claim text (preview).

What is claimed is:

1. A method of performing a storage operation in a distributed, deduplicated storage system, comprising:
    receiving at a first secondary storage computing device of a plurality of secondary storage computing devices a request from a client computing device to backup a file comprising a plurality of data blocks and stored in primary storage, wherein a first deduplication database computing device of a plurality of deduplication database computing devices communicatively coupled to the first secondary storage computing device is configured to store a first subset of signature blocks based at least in part on a data block distribution policy and is designated as a failover deduplication database computing device for a second deduplication database computing device of the plurality of deduplication database computing devices that is configured to store, based at least in part on the data block distribution policy, a second subset of the signature blocks that is different from and does not overlap with the first subset of the signature blocks, wherein the plurality of deduplication database computing devices store the signature blocks corresponding to data blocks stored in secondary storage, wherein the data blocks stored in the secondary storage correspond to data blocks stored in primary storage, at least one signature block of the signature blocks comprising a signature of at least one data block of the plurality of data blocks, location information of the at least one data block in the secondary storage, and a reference count value indicative of a quantity of one or more references in the secondary storage to the at least one data block;
    in response to the request and using one or more processors, calculating a signature of a particular data block of the plurality of data blocks using a signature function;
    identifying the second deduplication database computing device as the deduplication database computing device assigned to store the signature of the particular data block;
    determining that the second deduplication database computing device is unavailable; and
    querying the first deduplication database computing device for the signature of the particular data block,
    the method further comprising at least one of:
        based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block does not reside in the first deduplication database computing device, store the signature in a failover index, cause at least one storage device of the secondary storage to store a copy of the particular data block, and request the first deduplication database computing device to store the signature of the particular data block and a location of the copy of the particular data block, or
        based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block resides in the first deduplication database computing device, cause at least one storage device of the secondary storage to store a reference to a copy of the particular data block that is stored in the secondary storage.

2. The method of claim 1, wherein each deduplication database computing device is assigned to store a different group of unique signatures.

3. The method of claim 1, wherein the first deduplication database computing device is assigned to store a first set of signatures and the second deduplication database computing device is assigned to store a second set of signatures different from the first set of signatures.

4. The method of claim 1, wherein each deduplication database computing device of the plurality of deduplication database computing devices is identified as a failover deduplication database computing device to another one of the plurality of deduplication database computing devices.

5. The method of claim 1, further comprising identifying a third deduplication database computing device as the failover deduplication database computing device for the first deduplication database computing device and identifying the second deduplication database computing device as the failover deduplication database computing device for the third deduplication database computing device.

6. The method of claim 1, further comprising calculating a signature of each of the plurality of data blocks using the signature function.

7. A distributed deduplicated storage system, comprising:
    a plurality of deduplication database computing devices configured to store signature blocks corresponding to a plurality of data blocks stored in one or more storage devices of secondary storage, at least one signature block of the signature blocks comprising a signature of at least one data block of the plurality of data blocks, location information of the at least one data block in the one or more storage devices, and a reference count value indicative of a quantity of one or more references in the secondary storage to the at least one data block, wherein a first deduplication database computing device of the plurality of deduplication database computing devices is configured to store a first subset of the signature blocks based at least in part on a data block distribution policy and is designated as a failover deduplication database computing device for a second deduplication database computing device of the plurality of deduplication database computing devices that is configured to store, based at least in part on the data block distribution policy, a second subset of the signature blocks that is different from and does not overlap with the first subset of the signature blocks; and
    a plurality of secondary storage computing devices communicatively coupled to the plurality of deduplication database computing devices, each of the plurality of secondary storage computing devices comprising one or more processors and storage, wherein at least one secondary storage computing device of the plurality of secondary storage computing devices further comprises a failover index and is configured to:
        receive a request to backup a file comprising a plurality of data blocks and stored in primary storage,
        calculate a signature of a particular data block of the plurality of data blocks using a signature function,
        identify the second deduplication database computing device as the deduplication database computing device assigned to store the signature of the particular data block,
        determine that the second deduplication database computing device is unavailable, and
        query the first deduplication database computing device for the signature of the particular data block,
    wherein the at least one secondary storage computing device is further configured to at least one of:
        based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block does not reside in the first deduplication database computing device, store the signature in the failover index, cause at least one storage device of the one or more storage devices of the secondary storage to store a copy of the particular data block, and request the first deduplication database computing device to store the signature of the particular data block and a location of the copy of the particular data block, or
        based at least in part on an indication from the first deduplication database computing device that the signature of the particular data block resides in the first deduplication database computing device, cause at least one storage device of the one or more storage devices of the secondary storage to store a reference to a copy of the particular data block that is stored in the secondary storage.

8. The system of claim 7, wherein each deduplic
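
The failover path recited in claims 1 and 7 can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: the class names `SignatureBlock`, `DedupDatabase`, and `SecondaryStorage`, the function `backup_block`, the use of SHA-256 as the signature function, the list-index "location", and the reference-count increment on a hit are all assumptions. The claims describe only the case where the assigned device is unavailable; the branch for an available device is added here as an assumption so the function is complete.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SignatureBlock:
    signature: str      # signature of the data block
    location: int       # where the block copy sits in secondary storage
    ref_count: int = 1  # number of references to that copy


@dataclass
class DedupDatabase:
    name: str
    available: bool = True
    blocks: dict = field(default_factory=dict)  # signature -> SignatureBlock


@dataclass
class SecondaryStorage:
    blocks: list = field(default_factory=list)      # stored block copies
    references: list = field(default_factory=list)  # locations referenced instead of re-stored


def backup_block(block, assigned, failover, storage, failover_index):
    """Back up one data block, using the failover database if the assigned one is down."""
    sig = hashlib.sha256(block).hexdigest()  # assumed signature function
    using_failover = not assigned.available
    target = failover if using_failover else assigned

    entry = target.blocks.get(sig)
    if entry is None:
        # Signature not found: store a copy and register its signature and location.
        if using_failover:
            failover_index.add(sig)
        storage.blocks.append(block)
        location = len(storage.blocks) - 1
        target.blocks[sig] = SignatureBlock(sig, location)
        return ("copy", location)

    # Signature found: store only a reference to the existing copy.
    storage.references.append(entry.location)
    entry.ref_count += 1  # assumption: the count tracks each new reference
    return ("reference", entry.location)


second = DedupDatabase("ddb-2", available=False)  # assigned device, unavailable
first = DedupDatabase("ddb-1")                    # designated failover device
storage, failover_index = SecondaryStorage(), set()
print(backup_block(b"payload", second, first, storage, failover_index))
print(backup_block(b"payload", second, first, storage, failover_index))
```

In this model the first call stores a copy and records the signature in the failover index, while the second call for the same block stores only a reference.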

Assignees

Commvault Systems Inc

Inventors

Classifications

  • Physics · mapped topic

  • using de-duplication of the data · CPC title

  • De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title

  • Using snapshots, i.e. a logical point-in-time copy of the data · CPC title

  • Database-specific techniques · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9633033B2 cover?
A high availability distributed, deduplicated storage system according to certain embodiments is arranged to include multiple deduplication database media agents. The deduplication database media agents store signatures of data blocks stored in secondary storage. In addition, the deduplication database media agents are configured as failover deduplication database media agents in the event that…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30156. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 25, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).