Fast deduplication data verification
US-2016306820-A1 · Oct 20, 2016 · US
US9639274B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9639274-B2 |
| Application number | US-201514686038-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 14, 2015 |
| Priority date | Apr 14, 2015 |
| Publication date | May 2, 2017 |
| Grant date | May 2, 2017 |
Official abstract text for this publication.
According to certain aspects, a method can include receiving an indication that a restoration of a deduplication database using a secondary copy of a file associated with a secondary copy job is complete; retrieving a first data fingerprint from a data storage database, wherein the first data fingerprint is associated with the secondary copy job used to restore the deduplication database; retrieving a second data fingerprint from a deduplication database media agent, wherein the second data fingerprint is associated with the secondary copy job used to restore the deduplication database; comparing the first data fingerprint with the second data fingerprint to determine whether the first data fingerprint and the second data fingerprint match; and transmitting an instruction to the deduplication database media agent to rebuild the restored deduplication database in response to a determination that the first data fingerprint and the second data fingerprint do not match.
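The compare-then-rebuild decision described in the abstract can be sketched as below. This is a minimal illustration, not the patented implementation: the `Fingerprint` class, the `verify_restored_ddb` function, and the `rebuild` callback are hypothetical names, and the two count fields are borrowed from the claim language rather than from the abstract itself.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical fingerprint for one secondary copy job."""
    job_id: str
    unique_blocks: int   # count of unique data blocks in the secondary copy
    references: int      # count of references to blocks held in other copies


def verify_restored_ddb(
    stored: Fingerprint,
    regenerated: Fingerprint,
    rebuild: Callable[[str], None],
) -> bool:
    """Compare the stored fingerprint with the one from the restored database.

    Returns True when they match. On a mismatch, calls `rebuild` with the
    job id (standing in for the instruction sent to the media agent) and
    returns False.
    """
    if stored == regenerated:
        return True
    rebuild(stored.job_id)
    return False


# Usage: the callback here only records which jobs would trigger a rebuild.
rebuild_requests = []
first = Fingerprint("job-1", unique_blocks=10, references=4)
second = Fingerprint("job-1", unique_blocks=9, references=4)
matched = verify_restored_ddb(first, second, rebuild_requests.append)
```

Comparing a few stored counters is what makes the check fast relative to re-reading every deduplicated block; the rebuild path is taken only when the counters disagree.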
Opening claim text (preview).
What is claimed is:
1. A networked information management system configured to validate a deduplication database, the networked information management system comprising: a deduplication database including information about a set of deduplication data blocks, wherein the deduplication data blocks are used to create deduplicated secondary copies on one or more secondary storage devices in the information management system; a data storage database associated with a storage management computer, wherein the data storage database comprises a first data fingerprint corresponding to a secondary copy job, wherein the first data fingerprint comprises an indication of a count of unique data blocks stored in a first secondary copy of a first file and an indication of a count of references stored in the first secondary copy of the first file that reference data blocks stored in other secondary copies, wherein execution of the secondary copy job resulted in generation of the first secondary copy of the first file, the first secondary copy of the first file and a secondary copy of the deduplication database residing in the one or more secondary storage devices; and one or more computing devices each having one or more hardware processors and configured to: if the deduplication database is determined to be invalid, restore the deduplication database using the secondary copy of the deduplication database; retrieve the first data fingerprint from the data storage database; generate a second data fingerprint using the restored version of the deduplication database; compare the first data fingerprint with the second data fingerprint; and rebuild the restored version of the deduplication database using the first secondary copy of the first file in response to a determination that the first data fingerprint and the second data fingerprint do not match.
2. The networked information management system of claim 1, wherein the first data fingerprint further comprises at least one of an identification of a total size of the unique data blocks or an identification of a total size of the references.
3. The networked information management system of claim 2, wherein the one or more computing devices are further configured to: generate, at a time that the secondary copy of the deduplication database is created, the first data fingerprint; and transmit the first data fingerprint to the storage management computer.
4. The networked information management system of claim 2, wherein the second data fingerprint comprises an indication of a second count of unique data blocks stored in the first secondary copy of the first file, and wherein the one or more computing devices are further configured to: compare the count of the unique data blocks stored in the first secondary copy of the first file with the second count of the unique data blocks stored in the first secondary copy of the first file; and rebuild the restored version of the deduplication database in response to a determination that the count of the unique data blocks stored in the first secondary copy of the first file and the second count of the unique data blocks stored in the first secondary copy of the first file do not match.
5. The networked information management system of claim 1, wherein the secondary copy of the deduplication database is a backup copy of the deduplication database.
6. A computer-implemented method for validating a deduplication database, the computer-implemented method comprising: retrieving, in response to a restoration of the deduplication database using a secondary copy of the deduplication database, a first data fingerprint from a data storage database, wherein the first data fingerprint corresponds with a secondary copy job, wherein the first data fingerprint comprises an indication of a count of unique data blocks stored in a first secondary copy of a first file and an indication of a count of references stored in the first secondary copy of the first file that reference data blocks stored in other secondary copies, wherein execution of the secondary copy job resulted in generation of the first secondary copy of the first file; generating a second data fingerprint using the restored version of the deduplication database, wherein the second data fingerprint corresponds with the secondary copy of the deduplication database; comparing the first data fingerprint with the second data fingerprint to determine whether the first data fingerprint and the second data fingerprint match; and rebuilding the restored version of the deduplication database using the first secondary copy of the first file in response to a determination that the first data fingerprint and the second data fingerprint do not match.
7. The computer-implemented method of claim 6, wherein the first data fingerprint further comprises at least one of an identification of a total size of the unique data blocks or an identification of a total size of the references.
8. The computer-implemented method of claim 7, further comprising: generating, at a time that the secondary copy of the deduplication database is created, the first data fingerprint; and transmitting the first data fingerprint to a data storage computer for storage in the data storage database.
9. The computer-implemented method of claim 7, wherein the second data fingerprint comprises an indication of a second count of unique data blocks stored in the first secondary copy of the first file.
10. The computer-implemented method of claim 9, wherein comparing the first data fingerprint with the second data fingerprint further comprises: comparing the count of the unique data blocks stored in the first secondary copy of the first file with the second count of the unique data blocks stored in the first secondary copy of the first file; and rebuilding the restored version of the deduplication database in response to a determination that the count of the unique data blocks stored in the first secondary copy of the first file and the second count of the unique data blocks stored in the first secondary copy of the first file do not match.
11. The computer-implemented method of claim 6, wherein the secondary copy of the deduplication database is a backup copy of the deduplication database.
12. The computer-implemented method of claim 6, wherein the deduplication database comprises information about a set of deduplication data blocks, and wherein the deduplication data blocks are used to create deduplicated secondary copies on one or more secondary storage devices in an information management system.
13. The computer-implemented method of claim 6, further comprising restoring, if the deduplication database is determined to be invalid, the deduplication database using the secondary copy of the deduplication database.
14. A networked information management system configured to validate a deduplication database, the networked information management system comprising: a storage manager comprising a storage manager database, wherein the storage manager database comprises a plurality of data fingerprints; a deduplication database media agent comprising an electronically stored deduplication database; and a media agent comprising computer hardware configured to: receive an indication that a restoration of the deduplication database using a secondary copy of the deduplication database is complete; retrieve a first data fingerprint in the plurality of data fingerprints from the storage manager database, wherein the first data fingerprint is associated with a secondary copy job used to restore the deduplication database, wherein the first data fingerp
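The fingerprint contents recited in claims 1, 2, 6, and 7 (a count of unique blocks, a count of references to blocks in other secondary copies, and optional total sizes) could be derived from database records roughly as follows. This is a sketch under stated assumptions: the `BlockRecord` schema, the `JobFingerprint` class, and `fingerprint_for_job` are hypothetical, and the "total size of the references" is assumed here to mean the size of the referenced blocks, which the claim text does not settle.

```python
from dataclasses import dataclass
from typing import Iterable, NamedTuple


class BlockRecord(NamedTuple):
    """Hypothetical row in a deduplication database."""
    job_id: str
    block_hash: str
    size: int            # bytes; for references, size of the referenced block (assumption)
    is_reference: bool   # True if the entry points at a block stored by another copy


@dataclass(frozen=True)
class JobFingerprint:
    unique_count: int
    reference_count: int
    unique_bytes: int
    reference_bytes: int


def fingerprint_for_job(records: Iterable[BlockRecord], job_id: str) -> JobFingerprint:
    """Tally unique blocks and references recorded for one secondary copy job."""
    unique_count = reference_count = unique_bytes = reference_bytes = 0
    for rec in records:
        if rec.job_id != job_id:
            continue  # ignore entries that belong to other jobs
        if rec.is_reference:
            reference_count += 1
            reference_bytes += rec.size
        else:
            unique_count += 1
            unique_bytes += rec.size
    return JobFingerprint(unique_count, reference_count, unique_bytes, reference_bytes)


# Usage: generate the "second" fingerprint from a restored database.
restored_ddb = [
    BlockRecord("job-1", "a1", 4096, False),
    BlockRecord("job-2", "b7", 4096, False),
    BlockRecord("job-2", "c3", 2048, False),
    BlockRecord("job-2", "a1", 4096, True),   # reference to a block stored by job-1
]
second = fingerprint_for_job(restored_ddb, "job-2")
```

Running the same tally at backup time and again after restoration yields two values that can be compared field by field, as in claims 4 and 10.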
Database-specific techniques · CPC title
Backup restoration techniques · CPC title
Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06) · CPC title
the solution involving signatures · CPC title
using de-duplication of the data · CPC title