High availability distributed deduplicated storage system

US9665591B2 · US · B2

Patent metadata
Publication number: US-9665591-B2
Application number: US-201414152549-A
Country: US
Kind code: B2
Filing date: Jan 10, 2014
Priority date: Jan 11, 2013
Publication date: May 30, 2017
Grant date: May 30, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A high availability distributed, deduplicated storage system according to certain embodiments is arranged to include multiple deduplication database media agents. The deduplication database media agents store signatures of data blocks stored in secondary storage. In addition, the deduplication database media agents are configured as failover deduplication database media agents in the event that one of the deduplication database media agents becomes unavailable.
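The arrangement described in the abstract can be illustrated with a minimal sketch. It assumes details the abstract does not state: SHA-256 as the signature function, a modulo-based distribution policy, and a ring in which each media agent's failover partner is the next agent in the list. All names here (`MediaAgent`, `assigned_position`, `failover_position`, `pick_agent`) are hypothetical and are not taken from the patent.

```python
import hashlib
from dataclasses import dataclass, field


def signature(block: bytes) -> str:
    """Assumed signature function: SHA-256 hex digest of the data block."""
    return hashlib.sha256(block).hexdigest()


@dataclass
class MediaAgent:
    """Hypothetical deduplication database media agent."""
    name: str
    available: bool = True
    signatures: dict = field(default_factory=dict)  # signature -> block location


def assigned_position(sig: str, n_agents: int) -> int:
    """Assumed distribution policy: signature value modulo the agent count."""
    return int(sig, 16) % n_agents


def failover_position(position: int, n_agents: int) -> int:
    """Assumed ring layout: the next agent in the list covers for this one."""
    return (position + 1) % n_agents


def pick_agent(sig: str, agents: list) -> MediaAgent:
    """Return the assigned agent, or its failover partner if it is down."""
    position = assigned_position(sig, len(agents))
    if agents[position].available:
        return agents[position]
    partner = agents[failover_position(position, len(agents))]
    if not partner.available:
        raise RuntimeError("assigned agent and its failover partner are both down")
    return partner


if __name__ == "__main__":
    agents = [MediaAgent("ma1"), MediaAgent("ma2"), MediaAgent("ma3")]
    sig = signature(b"example block")
    print("normally handled by:", pick_agent(sig, agents).name)
    agents[assigned_position(sig, len(agents))].available = False
    print("during the outage handled by:", pick_agent(sig, agents).name)
```

A ring is only one way to pair agents. It is used here because it gives every agent exactly one failover partner with no extra configuration.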

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of performing a storage operation in a distributed, deduplicated storage system, the method comprising:
during a period of availability of a first deduplication database computing device of a plurality of deduplication database computing devices and a second deduplication database computing device of the plurality of deduplication database computing devices, the plurality of deduplication database computing devices storing signature blocks corresponding to a plurality of data blocks stored in one or more secondary storage devices of secondary storage, the plurality of data blocks corresponding to data blocks received from primary storage, at least one signature block of the signature blocks comprising a signature of at least one data block of the plurality of data blocks, location information of the at least one data block in the one or more secondary storage devices, and a reference count value indicative of a quantity of one or more references in the secondary storage to the at least one data block, the first deduplication database computing device configured to store a first subset of the signature blocks based at least in part on a data block distribution policy and designated as a failover deduplication database computing device for the second deduplication database computing device that is configured to store, based at least in part on the data block distribution policy, a second subset of the signature blocks that is different from and does not overlap with the first subset of the signature blocks;
receiving at a secondary storage computing device comprising a failover index and communicatively coupled to the plurality of deduplication database computing devices, a first set of one or more signatures corresponding to one or more data blocks stored in primary storage;
identifying, based at least in part on the data block distribution policy, the second deduplication database computing device as the deduplication database computing device assigned to store the first set of one or more signatures;
determining, based at least in part on a query of the failover index, that at least one signature of the first set of one or more signatures matches at least one signature of a second set of one or more signatures that was stored in the first deduplication database computing device during a previous period of unavailability of the second deduplication database computing device;
querying the first deduplication database computing device for the at least one signature of the first set of one or more signatures;
receiving from the first deduplication database computing device a location of a copy of a data block corresponding to the at least one signature of the first set of one or more signatures; and
storing in the secondary storage the location of the copy of the data block corresponding to the at least one signature of the first set of one or more signatures.

2. The method of claim 1, wherein following the period of unavailability, the method further comprises: determining that the second deduplication database computing device is available; and in response to said determining that the second deduplication database computing device is available, copying the second set of one or more signatures from the first deduplication database computing device to the second deduplication database computing device.

3. The method of claim 1, wherein following the period of unavailability, the method further comprises: determining that the second deduplication database computing device is available; and in response to determining that the second deduplication database computing device is available, retaining the first set of one or more signatures at the first deduplication database computing device.

4. The method of claim 1, wherein each deduplication database computing device of the plurality of deduplication database computing devices is identified as a failover deduplication database computing device to at least another one of the plurality of deduplication database computing devices.

5. The method of claim 1, further comprising identifying a third deduplication database computing device as the failover deduplication database computing device for the first deduplication database computing device and identifying the second deduplication database computing device as the failover deduplication database computing device for the third deduplication database computing device.

6. A distributed deduplicated storage system, comprising:
a plurality of deduplication database computing devices configured to store signature blocks corresponding to a plurality of data blocks stored in one or more secondary storage devices of secondary storage, the plurality of data blocks corresponding to data blocks received from primary storage, at least one signature block of the signature blocks comprising a signature of at least one data block of the plurality of data blocks, location information of the at least one data block in the one or more secondary storage devices, and a reference count value indicative of a quantity of one or more references in the secondary storage to the at least one data block, wherein a first deduplication database computing device of the plurality of deduplication database computing devices is configured to store a first subset of the signature blocks based at least in part on a data block distribution policy and is identified as a failover deduplication database computing device for a second deduplication database computing device of the plurality of deduplication database computing devices that is configured to store, based at least in part on the data block distribution policy, a second subset of the signature blocks that is different from and does not overlap with the first subset of the signature blocks; and
a plurality of secondary storage computing devices communicatively coupled to the plurality of deduplication database computing devices, each of the plurality of secondary storage computing devices comprising one or more processors and storage, wherein following a period of unavailability of the second deduplication database computing device, at least one secondary storage computing device of the plurality of secondary storage computing devices further comprising a failover index is configured to:
receive a first set of one or more signatures corresponding to one or more data blocks stored in the primary storage,
identify, based at least in part on the data block distribution policy, the second deduplication database computing device as being assigned to store the first set of one or more signatures,
determine, based at least in part on a query of the failover index, that at least one signature of the first set of one or more signatures matches at least one signature of a second set of one or more signatures that was stored in the first deduplication database computing device during the period of unavailability of the second deduplication database computing device,
query the first deduplication database computing device for the at least one signature of the first set of one or more signatures,
receive from the first deduplication database computing device a location of a copy of a data block corresponding to the at least one signature of the first set of one or more signatures, and
store in the secondary storage the location of the copy of the data block corresponding to the at least one signature of the first set of one or more signatures.

7. The system of claim 6, wherein following the period of unavailability, the at least one secondary storage computing device is further configured to: determine that the second deduplication database computing device is available; and copy the second set of one or
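The lookup flow recited in claim 1, and the copy-back step of claim 2, might be sketched roughly as follows. This is an illustrative model only, not the patented implementation. The class and method names, the modulo-based distribution policy, the in-memory set used as the failover index, and the reference-count increment on a hit are all assumptions made for the example.

```python
from dataclasses import dataclass, field, replace


@dataclass
class SignatureBlock:
    signature: str
    location: str       # where the data block lives in secondary storage
    ref_count: int = 1  # number of references to the data block


@dataclass
class DedupDatabase:
    """Hypothetical deduplication database computing device."""
    name: str
    available: bool = True
    blocks: dict = field(default_factory=dict)  # signature -> SignatureBlock

    def query(self, sig):
        return self.blocks.get(sig)


@dataclass
class SecondaryStorageDevice:
    """Hypothetical secondary storage computing device with a failover index."""
    ddbs: list         # all deduplication databases
    failover_of: dict  # assigned database name -> its failover database
    failover_index: set = field(default_factory=set)
    stored_locations: list = field(default_factory=list)

    def assigned(self, sig):
        # Assumed distribution policy: hex signature modulo database count.
        return self.ddbs[int(sig, 16) % len(self.ddbs)]

    def record(self, sig, location):
        """Store a new signature; use the failover database during an outage."""
        target = self.assigned(sig)
        if not target.available:
            target = self.failover_of[target.name]
            self.failover_index.add(sig)  # remember where the signature went
        target.blocks[sig] = SignatureBlock(sig, location)

    def lookup(self, sig):
        """Find an existing copy, consulting the failover index first."""
        target = self.assigned(sig)
        if sig in self.failover_index:
            target = self.failover_of[target.name]
        hit = target.query(sig)
        if hit is None:
            return None
        hit.ref_count += 1  # assumption: a hit adds one reference
        self.stored_locations.append(hit.location)  # store location, not data
        return hit.location

    def copy_back(self, recovered):
        """Copy failover-held signatures to the recovered database."""
        source = self.failover_of[recovered.name]
        for sig in list(self.failover_index):
            if self.assigned(sig) is recovered:
                recovered.blocks[sig] = replace(source.blocks[sig])
                self.failover_index.discard(sig)
```

In this sketch `copy_back` leaves the original entry on the failover database and only clears the failover index, so later lookups go to the assigned database again. Whether the failover copy is kept or pruned is a design choice the claims shown here do not settle.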

Assignees

Commvault Systems Inc

Classifications

  • Physics · mapped topic

  • Database-specific techniques · CPC title

  • De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title

  • using de-duplication of the data · CPC title

  • by selection of backup contents · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9665591B2 cover?
A high availability distributed, deduplicated storage system according to certain embodiments is arranged to include multiple deduplication database media agents. The deduplication database media agents store signatures of data blocks stored in secondary storage. In addition, the deduplication database media agents are configured as failover deduplication database media agents in the event that…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30156. Mapped technology areas include Physics.
When was this patent published?
Publication date May 30, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).