Using multiple streams with network data management protocol to improve performance and granularity of backup and restore operations from/to a file server

US11470152B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11470152-B2
Application number: US-202117216383-A
Country: US
Kind code: B2
Filing date: Mar 29, 2021
Priority date: Mar 10, 2020
Publication date: Oct 11, 2022
Grant date: Oct 11, 2022

Abstract

Official abstract text for this publication.

Multiple substantially concurrent data streams with NDMP protocol improve robustness, performance, and granularity of backup and restore operations from/to a filer. NDMP data streams are initially allocated based on inventorying the root level of each filer volume. A best effort to balance the multiple NDMP data streams allocates them based on data amounts used in each volume. Orphaned files are also collected and backed up. Subsequent full backup jobs leverage a proprietary index generated in preceding full backup jobs to obtain better performance and to better balance the NDMP data streams by creating substantially co-equal groupings of source data. The index comprises granular information which is not available from querying the filer. The size of each individual backup copy from a preceding full backup job and/or the size of subtending subdirectories or individual backed up files therein is used by later backup jobs to fine tune NDMP data stream allocation.
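The abstract's "best effort" balancing by the amount of data used in each volume could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `allocate_streams`, the volume names, and the largest-remainder rounding rule are hypothetical and are not taken from the patent, which does not specify how fractional shares are rounded.

```python
def allocate_streams(used_bytes, total_streams):
    """Split total_streams across volumes in proportion to used bytes.

    Every volume gets one stream first; the spare streams are shared out
    with the largest-remainder method (an assumption, not from the patent).
    """
    if total_streams < len(used_bytes):
        raise ValueError("need at least one stream per volume")

    # Guarantee one stream per volume (one snapshot per volume).
    allocation = {volume: 1 for volume in used_bytes}
    spare = total_streams - len(used_bytes)
    total_used = sum(used_bytes.values())
    if spare == 0 or total_used == 0:
        return allocation

    # Hand out the whole part of each proportional share, keep the remainder.
    remainders = {}
    for volume, used in used_bytes.items():
        whole, remainders[volume] = divmod(spare * used, total_used)
        allocation[volume] += whole

    # Streams lost to rounding go to the volumes with the largest remainders.
    leftover = total_streams - sum(allocation.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for volume in by_remainder[:leftover]:
        allocation[volume] += 1
    return allocation


# Hypothetical volumes; sizes are in arbitrary units of used storage.
print(allocate_streams({"vol1": 700, "vol2": 200, "vol3": 100}, 7))
```

With these made-up sizes and a budget of 7 streams, vol1 receives 4 streams, vol2 receives 2, and vol3 receives 1. In practice the total budget would presumably come from a limit such as the intake capacity of the backup storage, but that value is an input here rather than something the sketch computes.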

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: by a first computing device comprising one or more hardware processors and computer memory for executing at least a backup process: using network data management protocol (NDMP) to communicate with a network-attached storage, which is distinct from the first computing device; for a first full backup job of first data stored on data storage volumes of the network-attached storage, obtaining, from the network-attached storage, information about an amount of data storage used in each of the data storage volumes, and information about directories at a root level of each of the data storage volumes; instructing the network-attached storage to take and store a snapshot corresponding to each of the data storage volumes; allocating a first number of one or more NDMP data streams to run concurrently between each snapshot at the network-attached storage and the backup process at the first computing device, wherein each allocated first number is based on the amount of data storage used in each snapshot's corresponding data storage volume as obtained from the network-attached storage; in the first full backup job of the first data, generating a plurality of individual first backup copies, wherein each individual first backup copy among the plurality of individual first backup copies is generated from a respective directory among the directories using one or more of the allocated first number of one or more NDMP data streams; and populating an index stored at the first computing device, wherein the index comprises, for each individual first backup copy, a size thereof and a location thereof on one or more storage resources that are distinct from the network-attached storage.

2. The method of claim 1 further comprising: for a second full backup job of the first data, obtaining from the index a size of each individual first backup copy generated in the first full backup job; instructing the network-attached storage to take and store a second snapshot corresponding to each of the data storage volumes; allocating a second number of one or more NDMP data streams to run concurrently between each second snapshot and the backup process, wherein each allocated second number is based on the size of each individual first backup copy generated in the first full backup job as obtained from the index; and in the second full backup job, generating a plurality of individual second backup copies, each individual second backup copy corresponding to a respective directory among the directories, wherein each individual second backup copy is generated by one or more of the allocated second number of one or more NDMP data streams.

3. The method of claim 2, wherein for at least one individual second backup copy, the allocated second number of one or more NDMP data streams differs from an allocated first number of one or more NDMP data streams used to generate an individual first backup copy corresponding to a same directory.

4. The method of claim 2, wherein at least one allocated second number of NDMP data streams used in the second full backup job differs from a corresponding allocated first number of NDMP data streams used in the first full backup job.

5. The method of claim 2, wherein at least one individual second backup copy generated in the second full backup job is based on a grouping of one or more individual data files present at the root level of one of the data storage volumes; and further comprising: obtaining from the index a data size for each of the one or more individual data files in the grouping.

6. The method of claim 1, wherein at least one individual first backup copy is generated in the first full backup job is based on a grouping of one or more individual data files present at the root level of one of the data storage volumes.

7. The method of claim 1, wherein at least one NDMP data stream is allocated to each snapshot up to a threshold maximum number of NDMP data streams.

8. The method of claim 1, wherein the index comprises an association between the plurality of individual first backup copies and the first full backup job.

9. The method of claim 1, wherein the first computing device allocates more NDMP data streams to a first snapshot corresponding to a first data storage volume with a larger amount of used data storage than to a second snapshot corresponding to an other data storage volume with a lower amount of used data storage.

10. The method of claim 1, wherein the first number of one or more NDMP data streams allocated to a given snapshot is based on an amount of data stored in a data storage volume corresponding to the given snapshot as a proportion of a total amount of the first data being backed up.

11. The method of claim 1, wherein each data storage node in a cluster configuration of the network-attached storage comprises some of the first data, and wherein each data storage node is allocated at least one NDMP data stream to run concurrently with other NDMP data streams in the first full backup job.

12. The method of claim 1, wherein each snapshot is allocated at least one NDMP data stream to run concurrently with other NDMP data streams in the first full backup job.

13. The method of claim 1, wherein a total number of NDMP data streams used by the first full backup job is determined by an intake capacity at the one or more storage resources.

14. The method of claim 1, further comprising: using the index to locate, among the plurality of individual first backup copies, one of the individual first backup copies at the one or more storage resources; and restoring the one of the individual first backup copies from the one or more storage resources to the network-attached storage.

15. A system comprising: a first computing device comprising one or more hardware processors and computer memory; wherein, while executing a backup process, the first computing device is configured to: communicate with network-attached storage using network data management protocol (NDMP); for a first full backup job of first data stored on data storage volumes of the network-attached storage, obtain, from the network-attached storage, information about an amount of data storage used in each of the data storage volumes, and information about directories at a root level of each of the data storage volumes; instruct the network-attached storage to take and store a snapshot corresponding to each of the data storage volumes; allocate a first number of one or more NDMP data streams to run concurrently between each snapshot at the network-attached storage and the backup process at the first computing device, wherein each allocated first number is based on the amount of data storage used in each snapshot's corresponding data storage volume as obtained from the network-attached storage; in the first full backup job of the first data, generate a plurality of individual first backup copies, wherein each individual first backup copy among the plurality of individual first backup copies is generated from a respective directory among the directories using one or more of the allocated first number of one or more NDMP data streams; and populate an index stored at the first computing device, wherein the index comprises, for each individual first backup copy, a size thereof and a location thereof on one or more storage resources that are distinct from the network-attached storage.

16. The system of claim 15, wherein, while executing the backup process, the first computing device is further configured to: for a second full backup job
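Claim 2 and the abstract describe later full backups using the sizes recorded in the index to form "substantially co-equal groupings" of source data. One way this might look is sketched below. The function name `group_by_size`, the directory names, and the greedy largest-first heuristic are all assumptions made for illustration; the patent text shown here does not name a specific grouping algorithm.

```python
import heapq


def group_by_size(indexed_sizes, num_streams):
    """Assign directories to streams so the per-stream totals stay close.

    indexed_sizes maps a directory to the size of its previous backup copy,
    as a later job might read it from the index. Uses a greedy largest-first
    heuristic (an assumption, not the patent's stated method).
    """
    # Min-heap of (bytes assigned so far, stream number).
    heap = [(0, stream) for stream in range(num_streams)]
    groups = [[] for _ in range(num_streams)]

    # Place the largest directories first, each on the lightest stream.
    largest_first = sorted(indexed_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for directory, size in largest_first:
        load, stream = heapq.heappop(heap)
        groups[stream].append(directory)
        heapq.heappush(heap, (load + size, stream))
    return groups


# Hypothetical sizes taken from a previous full backup's index entries.
previous = {"/home": 50, "/logs": 30, "/db": 40, "/tmp": 20, "/etc": 10}
print(group_by_size(previous, 2))
```

For these made-up sizes, the two streams end up carrying 80 and 70 units respectively. A heuristic of this kind only works because the index holds per-directory sizes, which the abstract notes are not available from querying the filer.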

Assignees

Commvault Systems Inc

Inventors

Classifications

  • by selection of backup contents · CPC title

  • Using snapshots, i.e. a logical point-in-time copy of the data · CPC title

  • for networked environments · CPC title

  • Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11470152B2 cover?
Multiple substantially concurrent data streams with NDMP protocol improve robustness, performance, and granularity of backup and restore operations from/to a filer. NDMP data streams are initially allocated based on inventorying the root level of each filer volume. A best effort to balance the multiple NDMP data streams allocates them based on data amounts used in each volume. Orphaned files ar…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification H04L67/1095. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Oct 11, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).