Generating backup copies through interoperability between components of a data storage management system and appliances for data storage and deduplication

US11010258B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11010258-B2
Application number: US-201816201897-A
Country: US
Kind code: B2
Filing date: Nov 27, 2018
Priority date: Nov 27, 2018
Publication date: May 18, 2021
Grant date: May 18, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Illustrative storage manager and media agent are enhanced to interoperate with deduplication appliances. Advantages are realized when making secondary and tertiary copies and also when restoring from a deduplication appliance. Tiered indexing minimizes how much data is retained and stored at media agents. Tiered indexing enables media agents to efficiently extract needed information from deduplication appliances to make tertiary copies and to restore backed up copies. Interoperability techniques include media agents generating separate data streams to the deduplication appliance. Each data stream carries a different kind of data, e.g., payload data, metadata content, or high-level index information. On initial backup, the media agent instructs the deduplication appliance to deduplicate the payload data stream but not the other data streams, thus intelligently applying resources to data most likely to benefit from deduplication. For tertiary copies (copies of pre-existing copies at the deduplication appliance), the media agent avoids handling payload data altogether.
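
The stream separation described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the `Stream` class, the `build_streams` function, the JSON chunk layout, and the fixed `chunk_size` are all hypothetical names and choices introduced here. The sketch only shows the idea that payload, metadata, and index chunks travel in separate streams, that each tier points into the tier below by offset, and that only the payload stream is flagged for deduplication by the appliance.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class Stream:
    """Hypothetical append-only stream plus the dedup instruction sent with it."""
    name: str
    deduplicate: bool  # what the media agent asks the appliance to do
    data: bytearray = field(default_factory=bytearray)

    def append(self, chunk: bytes) -> int:
        """Append a chunk and return the offset at which it starts."""
        offset = len(self.data)
        self.data += chunk
        return offset


def build_streams(files: dict[str, bytes], chunk_size: int = 4):
    """Split primary data into payload, metadata, and index streams."""
    payload = Stream("payload", deduplicate=True)     # appliance dedups this one
    metadata = Stream("metadata", deduplicate=False)  # stored as-is
    index = Stream("index", deduplicate=False)        # stored as-is

    for name, content in files.items():
        # First tier: payload chunks go only into the payload stream.
        refs = []
        for i in range(0, len(content), chunk_size):
            piece = content[i:i + chunk_size]
            refs.append((payload.append(piece), len(piece)))

        # Second tier: a metadata chunk points to its payload chunks by offset.
        meta_chunk = json.dumps({"name": name, "payload": refs}).encode()
        meta_offset = metadata.append(meta_chunk)

        # Third tier: an index chunk points to the metadata chunk by offset.
        entry = {"name": name, "meta_offset": meta_offset, "meta_len": len(meta_chunk)}
        index.append(json.dumps(entry).encode() + b"\n")

    return payload, metadata, index
```

In this sketch the media agent never deduplicates anything itself; it only labels the streams, which mirrors the abstract's point that deduplication effort is spent on the payload data, where it is most likely to pay off.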

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: by a media agent in communication with a deduplication appliance, performing a backup job for primary data that results in one or more secondary copies to be stored at the deduplication appliance, wherein the media agent executes on a first computing device comprising one or more processors and computer memory, wherein the deduplication appliance comprises one or more data storage devices and is capable of self-managed deduplication, and wherein performing the backup job comprises: by the media agent, generating a first data stream transmitted to the deduplication appliance, wherein the first data stream comprises first data chunks and does not comprise second data chunks and third data chunks, and wherein each first data chunk comprises payload data from the primary data being backed up in the backup job; by the media agent, generating a second data stream transmitted to the deduplication appliance, wherein the second data stream comprises the second data chunks and does not comprise the first data chunks and the third data chunks, wherein each second data chunk comprises metadata for the primary data being backed up and further wherein each second data chunk points to one or more first data chunks in the first data stream; by the media agent, generating a third data stream transmitted to the deduplication appliance, wherein the third data stream comprises the third data chunks and does not comprise the first data chunks and the second data chunks, wherein each third data chunk comprises index information and points to a corresponding second data chunk in the second data stream; by the media agent, instructing the deduplication appliance to apply deduplication to the first data chunks in the first data stream and to store deduplicated first data chunks at the deduplication appliance; by the media agent, instructing the deduplication appliance to store the second data chunks in the second data stream and the third data chunks in the third data stream at the deduplication appliance without deduplication; and wherein, for transmission to the deduplication appliance, the media agent is configured not to deduplicate data chunks in the first data stream, the second data stream, and the third data stream.

2. The method of claim 1 wherein payload data in the one or more secondary copies are stored at the deduplication appliance in deduplicated form based on the self-managed deduplication.

3. The method of claim 1 further comprising: by the media agent, storing to an associated index at the first computing device: contents of the third data chunks which point to the corresponding second data chunks in the second data stream, and the metadata for the primary data from the second data chunks.

4. The method of claim 1, wherein based on determining that the deduplication appliance is capable of deduplication, the media agent generates the first data stream, the second data stream, and the third data stream, and instructs the deduplication appliance to apply deduplication to the first data chunks, and further instructs the deduplication appliance to store without deduplication the second data chunks and the third data chunks.

5. The method of claim 1, wherein a given second data chunk includes an offset of a corresponding first data chunk within the first data stream.

6. The method of claim 1, wherein a given third data chunk includes an offset of the corresponding second data chunk within the second data stream.

7. The method of claim 1, wherein a storage manager instructs the media agent to process the primary data being backed up in the backup job for further processing by and storage at the deduplication appliance as the one or more secondary copies.

8. The method of claim 1, wherein a storage manager indicates to the media agent that the deduplication appliance is capable of deduplication and storage of the one or more secondary copies.

9. The method of claim 1, wherein the media agent co-resides on the first computing device with a first data agent that accesses the primary data being backed up in the backup job; and wherein the first data agent in conjunction with the media agent process the primary data being backed up in the backup job for deduplication by and storage at the deduplication appliance as the one or more secondary copies.

10. The method of claim 1 further comprising: by the media agent, receiving instructions from a storage manager to restore a first secondary copy from the deduplication appliance in communication with the media agent, wherein the deduplication appliance deduplicated payload data in the first secondary copy when storing it; by the media agent, extracting from an associated index at the first computing device information about the first secondary copy, wherein the information comprises pointers to a second data chunk stored without deduplication at the deduplication appliance, wherein the second data chunk comprises metadata for the first secondary copy and further comprises pointers to one or more first data chunks stored with deduplication at the deduplication appliance, and wherein a given first data chunk comprises payload data of the first secondary copy; by the media agent, causing a fourth data stream from the deduplication appliance to transmit to the media agent, wherein the fourth data stream comprises the second data chunk; by the media agent, causing a fifth data stream from the deduplication appliance to transmit to the media agent, wherein the fifth data stream comprises the one or more first data chunks pointed to by the pointers in the second data chunk, wherein the deduplication appliance rehydrates at least some of the first data chunks before transmitting the fifth data stream to the media agent; and by the media agent processing the first data chunks and the metadata in the second data chunk to generate a sixth data stream transmitted to a data agent for restoring the secondary copy into primary data.

11. A system for data storage management comprising: a media agent that executes on a first computing device comprising one or more processors and computer memory, wherein the media agent is in communication with a deduplication appliance and is configured to: during a backup job for primary data that results in one or more secondary copies, generate a plurality of data chunks, wherein the media agent configures each data chunk to comprise: (i) payload data, or (ii) metadata, or (iii) index information; generate a first data stream transmitted to the deduplication appliance, wherein the first data stream comprises first data chunks and does not comprise second data chunks and third data chunks, and wherein each first data chunk comprises payload data based on the primary data being backed up in the backup job; generate a second data stream transmitted to the deduplication appliance, wherein the second data stream comprises the second data chunks and does not comprise the first data chunks and the third data chunks, wherein each second data chunk comprises metadata for the primary data being backed up and further wherein each second data chunk points to one or more first data chunks in the first data stream; generate a third data stream transmitted to the deduplication appliance, wherein the third data stream comprises the third data chunks and does not comprise the first data chunks and the second data chunks, and wherein each third data chunk comprises index information and points to a corresponding second data chunk in the second data stream; instruct the deduplication appliance to apply deduplication to the first data chunks in the first data stream and to s
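
The restore path recited in claim 10 can be walked through with a small Python sketch. Everything named here is an assumption made for illustration: `FakeAppliance`, its `read` method, the `restore` function, and the JSON metadata layout are hypothetical and do not come from the patent or from any real appliance API. The sketch assumes the media agent's local index maps a copy name to the offset and length of its metadata chunk, and that the appliance returns rehydrated bytes on read.

```python
from __future__ import annotations

import json


class FakeAppliance:
    """Hypothetical stand-in for a deduplication appliance.

    Reads are assumed to return already-rehydrated bytes.
    """

    def __init__(self, streams: dict[str, bytes]):
        self._streams = streams

    def read(self, stream: str, offset: int, length: int) -> bytes:
        return self._streams[stream][offset:offset + length]


def restore(name: str, local_index: dict[str, tuple[int, int]],
            appliance: FakeAppliance) -> bytes:
    """Follow index -> metadata chunk -> payload chunks to rebuild the data."""
    # Local index lookup: where the metadata chunk lives at the appliance.
    meta_offset, meta_len = local_index[name]

    # "Fourth data stream": fetch the metadata chunk (stored without dedup).
    meta = json.loads(appliance.read("metadata", meta_offset, meta_len))

    # "Fifth data stream": fetch only the payload chunks the metadata points to.
    pieces = [appliance.read("payload", off, ln) for off, ln in meta["payload"]]

    # "Sixth data stream": reassembled data handed on to the data agent.
    return b"".join(pieces)


# Usage: one copy whose payload was split into three chunks.
meta_chunk = json.dumps({"name": "a.txt", "payload": [[0, 4], [4, 4], [8, 3]]}).encode()
appliance = FakeAppliance({"payload": b"hello world", "metadata": meta_chunk})
restored = restore("a.txt", {"a.txt": (0, len(meta_chunk))}, appliance)
```

Because the index and metadata tiers hold offsets, the media agent in this sketch requests only the chunks it needs and does not scan the payload stream.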

Assignees

Commvault Systems Inc

Inventors

Classifications

  • Management of the backup or restore process

  • using de-duplication of the data

  • Redundancy elimination performed by the file system (error detection or correction of the data by redundancy in operations G06F11/14)

  • Using snapshots, i.e. a logical point-in-time copy of the data

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11010258B2 cover?
Illustrative storage manager and media agent are enhanced to interoperate with deduplication appliances. Advantages are realized when making secondary and tertiary copies and also when restoring from a deduplication appliance. Tiered indexing minimizes how much data is retained and stored at media agents. Tiered indexing enables media agents to efficiently extract needed information from dedupl…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1453. Mapped technology areas include Physics.
When was this patent published?
Publication date May 18, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).