Systems and methods for retaining and using data block signatures in data protection operations

US9239687B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9239687-B2
Application number: US-201314040247-A
Country: US
Kind code: B2
Filing date: Sep 27, 2013
Priority date: Sep 30, 2010
Publication date: Jan 19, 2016
Grant date: Jan 19, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system according to certain embodiments associates a signature value corresponding to a data block with one or more data blocks and a reference to the data block to form a signature/data word corresponding to the data block. The system further logically organizes the signature/data words into a plurality of files each comprising at least one signature/data word such that the signature values are embedded in the respective file. The system according to certain embodiments reads a previously stored signature value corresponding to a respective data block for sending from a backup storage system having at least one memory device to a secondary storage system. Based on an indication as to whether the data block is already stored on the secondary storage system, the system reads the data block from the at least one memory device for sending to the secondary storage system if the data block does not exist on the secondary storage system, wherein the signature value and not the data block is read from the at least one memory device if the data block exists on the secondary storage system.
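
The abstract's idea of signature/data words embedded in a file can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the SHA-256 hash, the 37-byte header layout (signature, kind, payload length), and the function names `write_word`, `iter_signatures`, and `read_payload`. The point it shows is that a reader can walk the file and collect every stored signature while skipping over the data blocks.

```python
import hashlib
import io
import struct

SIG_LEN = 32                      # SHA-256 digest size (hash choice is an assumption)
KIND_DATA, KIND_REF = 0, 1        # word holds a block copy, or a reference to one
HEADER = struct.Struct(f">{SIG_LEN}sBI")  # signature, kind, payload length


def write_word(f, block=None, ref=None, signature=None):
    """Append one signature/data word: signature + block, or signature + reference."""
    if block is not None:
        sig = hashlib.sha256(block).digest()
        kind, payload = KIND_DATA, block
    else:
        sig = signature
        kind, payload = KIND_REF, struct.pack(">Q", ref)
    f.write(HEADER.pack(sig, kind, len(payload)))
    f.write(payload)
    return sig


def iter_signatures(f):
    """Yield (signature, kind, payload_offset, payload_length) without reading payloads."""
    f.seek(0)
    while True:
        header = f.read(HEADER.size)
        if not header:
            break
        sig, kind, length = HEADER.unpack(header)
        offset = f.tell()
        yield sig, kind, offset, length
        f.seek(offset + length)   # skip the payload; only the header was read


def read_payload(f, offset, length):
    """Read a payload only when the destination reports the block is missing."""
    f.seek(offset)
    return f.read(length)


# Hypothetical container file with one real block and one duplicate stored as a reference.
container = io.BytesIO()
first_sig = write_word(container, block=b"A" * 4096)
write_word(container, ref=0, signature=first_sig)
words = list(iter_signatures(container))
```

In this layout the signature travels with its block (or reference), so a later copy operation could read the 37-byte header alone and decide afterwards whether the 4096-byte payload needs to be read at all.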

First claim

Opening claim text (preview).

What is claimed is:

1. A method of performing a copy operation, the method comprising: accessing with computer hardware a deduplication signature table containing a plurality of signatures corresponding to data blocks stored in at least one first storage device; using the deduplication signature table, performing, with computer hardware, a first deduplicated copy operation of a plurality of files from at least one second storage device to the first storage device thereby creating a deduplicated copy of the plurality of files in the first storage device; and with computer hardware and as part of a second deduplicated copy operation in which at least a subset of the plurality of data blocks which form the deduplicated copy are copied from the first storage device to at least one third storage device, for each respective data block in the subset: accessing from the first storage device a previously stored signature corresponding to the respective data block and which is stored separately from the deduplication signature table, the previously stored signature included in a first signature/data word of a plurality of signature/data words which is associated with the respective data block and is embedded in a file that includes others of the plurality of signature/data words associated with other data blocks in the subset, wherein first signature/data words of the plurality of signature/data words each include a respective signature and a respective actual data block copy stored physically or logically contiguously with respect to one another, and wherein second signature/data words of the plurality of signature/data words each include a respective signature and a respective reference to an actual data block copy stored physically or logically contiguously with respect to one another; transmitting the previously stored signature to the third storage device, wherein the previously stored signature is transmitted to the third storage device without re-generating the value of the previously stored signature using the respective data block; receiving a message indicating whether a copy of the respective data block is already stored on the third storage device; and if the message indicates that the respective data block is not already stored on the third storage device: accessing the respective data block from the first storage device; and transmitting the respective data block to the third storage device.

2. The method of claim 1, wherein the respective data block itself is not read from the first storage device as part of the second deduplicated storage operation if the message indicates that the respective data already block exists on the third storage device.

3. The method of claim 1, wherein each of the signature/data words includes one or more of the data block associated with the signature/data word or a reference to the data block associated with the signature/data word.

4. The method of claim 3, further comprising logically organizing the signature/data words into files each including one or more signature/data words such that the signatures are embedded in the files.

5. The method of claim 1, wherein the second deduplicated copy operation is completed using previously stored signatures that were created and stored on the first storage device prior to the second deduplicated copy operation.

6. The method of claim 1, wherein the first deduplicated copy operation is from primary storage to secondary storage, and the second deduplicated copy operation is an auxiliary copy operation within secondary storage.

7. The method of claim 1, wherein the deduplication signature table resides in a secondary storage system which includes the first storage device.

8. The method of claim 1, wherein the size of the data blocks that form the deduplicated copy and that are stored on the first storage device is at least about 512 times larger than the size of corresponding signatures that are also stored on the first storage device.

9. The method of claim 1, wherein the plurality of files are originally generated by at least one client computing device.

10. A system for performing a copy operation, the system comprising: a signature table containing a plurality of signatures corresponding to data blocks stored in at least one first storage device; and computer hardware in communication with the first storage device and also in communication with at least one second storage device and at least one third storage device, the computer hardware implementing a copy manager configured to: access the signature table and, using the signature table, perform a first deduplicated copy operation on a plurality of files from the second storage device to the first storage device, thereby creating a deduplicated copy of the plurality of files in the first storage device; as part of a second deduplicated copy operation in which at least a subset of the plurality of data blocks which form the copy are copied from the first storage device to the third storage device, for each respective data block in the subset: access from the first storage device a stored signature corresponding to the respective data block and which is stored separately from the deduplication signature table, the stored signature included in a first signature/data word of a plurality of signature/data words which is associated with the respective data block and is embedded in a file that includes others of the plurality of signature/data words associated with other data blocks in the subset, wherein first signature/data words of the plurality of signature/data words each include a respective signature and a respective actual data block copy stored physically or logically contiguously with respect to one another, and wherein second signature/data words of the plurality of signature/data words each include a respective signature and a respective reference to an actual data block copy stored physically or logically contiguously with respect to one another; transmit the stored signature to the third storage device, wherein the stored signature is transmitted to the third storage device without re-generating the value of the stored signature using the respective data block; receive a message indicating whether a copy of the respective data block is already stored on the third storage device; and if the message indicates that the respective data block is not already stored on the third storage device: access the respective data block from the first storage device, and transmit the respective data block to the third storage device.

11. The system of claim 10, wherein the respective data block itself is not read from the first storage device as part of the second deduplicated storage operation if the message indicates that the respective data already block exists on the third storage device.

12. The system of claim 10, wherein each of the signature/data words includes one or more of the data block associated with the signature/data word or a reference to the respective data block associated with the signature/data word.

13. The system of claim 12, the copy manager further configured to logically organize the signature/data words into files each including one or more signature/data words such that the signatures are embedded in the files.

14. The system of claim 10, wherein the second deduplicated copy operation is completed using previously stored signatures that were created and stored on the first storage device prior to the second deduplicated copy operation.

15. The system of claim 10, wherein the first deduplicated copy operation is from primary storage to secondar
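
The per-block exchange recited in the independent claims (send the stored signature, receive a message, send the block only if it is missing) can be sketched as a small simulation. This is an illustrative model only, not the patented implementation: the classes `FirstStorage` and `ThirdStorage`, the `auxiliary_copy` function, the in-memory dictionaries, and the use of SHA-256 are all hypothetical stand-ins.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ThirdStorage:
    """Hypothetical destination that keeps blocks keyed by signature."""
    blocks: dict = field(default_factory=dict)

    def has_block(self, signature: bytes) -> bool:
        # Stands in for the message saying whether a copy is already stored.
        return signature in self.blocks

    def store(self, signature: bytes, block: bytes) -> None:
        self.blocks[signature] = block


@dataclass
class FirstStorage:
    """Hypothetical source holding each signature next to its block."""
    words: list = field(default_factory=list)   # (signature, block) pairs
    block_reads: int = 0

    def add(self, block: bytes) -> None:
        # The signature is computed once, when the deduplicated copy is created.
        self.words.append((hashlib.sha256(block).digest(), block))

    def read_signature(self, index: int) -> bytes:
        return self.words[index][0]

    def read_block(self, index: int) -> bytes:
        self.block_reads += 1
        return self.words[index][1]


def auxiliary_copy(src: FirstStorage, dst: ThirdStorage) -> int:
    """Copy blocks from src to dst, returning how many blocks were transmitted."""
    sent = 0
    for index in range(len(src.words)):
        signature = src.read_signature(index)    # reused as stored, not re-hashed
        if not dst.has_block(signature):
            dst.store(signature, src.read_block(index))
            sent += 1
    return sent


src = FirstStorage()
for payload in (b"a" * 1024, b"b" * 1024, b"c" * 1024):
    src.add(payload)

dst = ThirdStorage()
dst.store(hashlib.sha256(b"a" * 1024).digest(), b"a" * 1024)  # already present

sent = auxiliary_copy(src, dst)
```

Here one of the three blocks already exists at the destination, so only two blocks are read from the source and transmitted. As a worked figure for the size ratio in claim 8, and assuming a 32-byte signature (a size not stated in the claims), a block at least 512 times larger would be at least 32 × 512 = 16,384 bytes, so each skipped block read saves roughly three orders of magnitude more I/O than the signature read costs.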

Assignees

Commvault Systems Inc

Inventors

Classifications

  • Replication mechanisms

  • in relation to data integrity, e.g. data losses, bit errors

  • Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket

  • Improving I/O performance

  • using de-duplication of the data

Frequently asked questions

What does patent US9239687B2 cover?
It covers storing a signature value for each data block together with the block, or with a reference to it, as a signature/data word embedded in files on a backup storage system. When data is later copied to a secondary storage system, the previously stored signature is sent first, and the data block itself is read and sent only if the secondary storage system does not already hold it.
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1453. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 19, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).