Efficient deduplication database validation

US9639274B2 · US · B2

Patent metadata
Publication number: US-9639274-B2
Application number: US-201514686038-A
Country: US
Kind code: B2
Filing date: Apr 14, 2015
Priority date: Apr 14, 2015
Publication date: May 2, 2017
Grant date: May 2, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to certain aspects, a method can include receiving an indication that a restoration of a deduplication database using a secondary copy of a file associated with a secondary copy job is complete; retrieving a first data fingerprint from a data storage database, wherein the first data fingerprint is associated with the secondary copy job used to restore the deduplication database; retrieving a second data fingerprint from a deduplication database media agent, wherein the second data fingerprint is associated with the secondary copy job used to restore the deduplication database; comparing the first data fingerprint with the second data fingerprint to determine whether the first data fingerprint and the second data fingerprint match; and transmitting an instruction to the deduplication database media agent to rebuild the restored deduplication database in response to a determination that the first data fingerprint and the second data fingerprint do not match.
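The post-restore check described in the abstract can be pictured as a short comparison routine. The following is a minimal sketch, not the patented implementation: the function name, the in-memory mapping standing in for the data storage database, and the two callables standing in for the deduplication database media agent are all hypothetical names introduced for illustration.

```python
from typing import Callable, Hashable, Mapping


def validate_restored_ddb(
    job_id: str,
    storage_db: Mapping[str, Hashable],
    fingerprint_from_media_agent: Callable[[str], Hashable],
    request_rebuild: Callable[[str], None],
) -> bool:
    """Compare the two fingerprints for one secondary copy job.

    Returns True when they match. Otherwise asks the media agent
    (here: a hypothetical callback) to rebuild, and returns False.
    """
    # First fingerprint: recorded in the data storage database.
    first = storage_db[job_id]
    # Second fingerprint: obtained from the media agent for the restored database.
    second = fingerprint_from_media_agent(job_id)
    if first == second:
        return True
    # Mismatch: instruct a rebuild of the restored deduplication database.
    request_rebuild(job_id)
    return False
```

A caller would invoke this once it has been told that the restoration is complete; a `False` result means a rebuild request was issued.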

First claim

Opening claim text (preview).

What is claimed is:

1. A networked information management system configured to validate a deduplication database, the networked information management system comprising: a deduplication database including information about a set of deduplication data blocks, wherein the deduplication data blocks are used to create deduplicated secondary copies on one or more secondary storage devices in the information management system; a data storage database associated with a storage management computer, wherein the data storage database comprises a first data fingerprint corresponding to a secondary copy job, wherein the first data fingerprint comprises an indication of a count of unique data blocks stored in a first secondary copy of a first file and an indication of a count of references stored in the first secondary copy of the first file that reference data blocks stored in other secondary copies, wherein execution of the secondary copy job resulted in generation of the first secondary copy of the first file, the first secondary copy of the first file and a secondary copy of the deduplication database residing in the one or more secondary storage devices; and one or more computing devices each having one or more hardware processors and configured to: if the deduplication database is determined to be invalid, restore the deduplication database using the secondary copy of the deduplication database; retrieve the first data fingerprint from the data storage database; generate a second data fingerprint using the restored version of the deduplication database; compare the first data fingerprint with the second data fingerprint; and rebuild the restored version of the deduplication database using the first secondary copy of the first file in response to a determination that the first data fingerprint and the second data fingerprint do not match.

2. The networked information management system of claim 1, wherein the first data fingerprint further comprises at least one of an identification of a total size of the unique data blocks or an identification of a total size of the references.

3. The networked information management system of claim 2, wherein the one or more computing devices are further configured to: generate, at a time that the secondary copy of the deduplication database is created, the first data fingerprint; and transmit the first data fingerprint to the storage management computer.

4. The networked information management system of claim 2, wherein the second data fingerprint comprises an indication of a second count of unique data blocks stored in the first secondary copy of the first file, and wherein the one or more computing devices are further configured to: compare the count of the unique data blocks stored in the first secondary copy of the first file with the second count of the unique data blocks stored in the first secondary copy of the first file; and rebuild the restored version of the deduplication database in response to a determination that the count of the unique data blocks stored in the first secondary copy of the first file and the second count of the unique data blocks stored in the first secondary copy of the first file do not match.

5. The networked information management system of claim 1, wherein the secondary copy of the deduplication database is a backup copy of the deduplication database.

6. A computer-implemented method for validating a deduplication database, the computer-implemented method comprising: retrieving, in response to a restoration of the deduplication database using a secondary copy of the deduplication database, a first data fingerprint from a data storage database, wherein the first data fingerprint corresponds with a secondary copy job, wherein the first data fingerprint comprises an indication of a count of unique data blocks stored in a first secondary copy of a first file and an indication of a count of references stored in the first secondary copy of the first file that reference data blocks stored in other secondary copies, wherein execution of the secondary copy job resulted in generation of the first secondary copy of the first file; generating a second data fingerprint using the restored version of the deduplication database, wherein the second data fingerprint corresponds with the secondary copy of the deduplication database; comparing the first data fingerprint with the second data fingerprint to determine whether the first data fingerprint and the second data fingerprint match; and rebuilding the restored version of the deduplication database using the first secondary copy of the first file in response to a determination that the first data fingerprint and the second data fingerprint do not match.

7. The computer-implemented method of claim 6, wherein the first data fingerprint further comprises at least one of an identification of a total size of the unique data blocks or an identification of a total size of the references.

8. The computer-implemented method of claim 7, further comprising: generating, at a time that the secondary copy of the deduplication database is created, the first data fingerprint; and transmitting the first data fingerprint to a data storage computer for storage in the data storage database.

9. The computer-implemented method of claim 7, wherein the second data fingerprint comprises an indication of a second count of unique data blocks stored in the first secondary copy of the first file.

10. The computer-implemented method of claim 9, wherein comparing the first data fingerprint with the second data fingerprint further comprises: comparing the count of the unique data blocks stored in the first secondary copy of the first file with the second count of the unique data blocks stored in the first secondary copy of the first file; and rebuilding the restored version of the deduplication database in response to a determination that the count of the unique data blocks stored in the first secondary copy of the first file and the second count of the unique data blocks stored in the first secondary copy of the first file do not match.

11. The computer-implemented method of claim 6, wherein the secondary copy of the deduplication database is a backup copy of the deduplication database.

12. The computer-implemented method of claim 6, wherein the deduplication database comprises information about a set of deduplication data blocks, and wherein the deduplication data blocks are used to create deduplicated secondary copies on one or more secondary storage devices in an information management system.

13. The computer-implemented method of claim 6, further comprising restoring, if the deduplication database is determined to be invalid, the deduplication database using the secondary copy of the deduplication database.

14. A networked information management system configured to validate a deduplication database, the networked information management system comprising: a storage manager comprising a storage manager database, wherein the storage manager database comprises a plurality of data fingerprints; a deduplication database media agent comprising an electronically stored deduplication database; and a media agent comprising computer hardware configured to: receive an indication that a restoration of the deduplication database using a secondary copy of the deduplication database is complete; retrieve a first data fingerprint in the plurality of data fingerprints from the storage manager database, wherein the first data fingerprint is associated with a secondary copy job used to restore the deduplication database, wherein the first data fingerp
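Claims 1, 2, 6 and 7 describe the fingerprint as counts of unique data blocks and of references, optionally with total sizes. The sketch below shows one way such a record could be derived from the entries of a secondary copy. It is an illustration under assumptions: the `Entry` and `Fingerprint` types and the `make_fingerprint` helper are hypothetical, and "total size of the references" is assumed here to mean the summed size recorded on each reference entry.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Entry:
    size: int           # bytes recorded for this entry
    is_reference: bool  # True if it points at a block stored in another secondary copy


@dataclass(frozen=True)
class Fingerprint:
    unique_count: int
    reference_count: int
    unique_bytes: int
    reference_bytes: int


def make_fingerprint(entries: Iterable[Entry]) -> Fingerprint:
    """Tally unique blocks and references, plus their total sizes."""
    unique_count = reference_count = unique_bytes = reference_bytes = 0
    for entry in entries:
        if entry.is_reference:
            reference_count += 1
            reference_bytes += entry.size
        else:
            unique_count += 1
            unique_bytes += entry.size
    return Fingerprint(unique_count, reference_count, unique_bytes, reference_bytes)
```

Because `Fingerprint` is a frozen dataclass, two fingerprints compare equal only when every field matches, so a difference in the unique-block count alone (the case singled out in claims 4 and 10) is enough to make the comparison fail.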

Assignees

Commvault Systems Inc

Inventors

Classifications

  • Database-specific techniques

  • Backup restoration techniques

  • Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06)

  • the solution involving signatures

  • using de-duplication of the data

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9639274B2 cover?
According to certain aspects, a method can include receiving an indication that a restoration of a deduplication database using a secondary copy of a file associated with a secondary copy job is complete; retrieving a first data fingerprint from a data storage database, wherein the first data fingerprint is associated with the secondary copy job used to restore the deduplication database; retri…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1453. Mapped technology areas include Physics.
When was this patent published?
Publication date May 2, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).