Dedicated client-side signature generator in a networked storage system

US9218375B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9218375-B2
Application number  US-201313916458-A
Country             US
Kind code           B2
Filing date         Jun 12, 2013
Priority date       Jun 13, 2012
Publication date    Dec 22, 2015
Grant date          Dec 22, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A storage system according to certain embodiments includes a client-side signature repository that includes information representative of a set of data blocks stored in primary storage. During storage operations of a client, the system can generate signatures corresponding to data blocks that are being stored in primary storage. The system can store the generated signatures in the client-side signature repository along with information regarding the location of the corresponding data block within primary storage. As additional instances of the data block are stored in primary storage, the system can store the location of the additional instances in the client-side signature repository.
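
The repository behavior summarized in the abstract can be sketched as a small in-memory structure. This is a minimal illustration, not the patented implementation: the abstract does not name a hash function, so SHA-256 is an assumption, and the names `SignatureRepository`, `DataUnitEntry`, `record`, and `locations`, along with the client IDs and location strings, are hypothetical.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUnitEntry:
    """One stored instance of a data block (hypothetical layout)."""
    client_id: str  # assumed identifier of the client holding the block
    location: str   # assumed location string within that client's primary storage


class SignatureRepository:
    """Maps each unique signature to every known location of a matching block."""

    def __init__(self) -> None:
        # signature -> list of entries, one per stored instance
        self._blocks: dict[str, list[DataUnitEntry]] = defaultdict(list)

    @staticmethod
    def sign(data: bytes) -> str:
        # SHA-256 is an assumption; the source only says "signature".
        return hashlib.sha256(data).hexdigest()

    def record(self, data: bytes, client_id: str, location: str) -> str:
        """Generate the signature for a block and store its location."""
        signature = self.sign(data)
        entry = DataUnitEntry(client_id, location)
        if entry not in self._blocks[signature]:
            self._blocks[signature].append(entry)
        return signature

    def locations(self, signature: str) -> list[DataUnitEntry]:
        """Return all recorded instances for a signature (empty if unknown)."""
        return list(self._blocks.get(signature, []))


# Two clients store identical content; both locations land under one signature.
repo = SignatureRepository()
sig_a = repo.record(b"quarterly report", "client-1", "/docs/q1.doc@0")
sig_b = repo.record(b"quarterly report", "client-2", "/mail/att.bin@4096")
print(sig_a == sig_b, len(repo.locations(sig_a)))
```

Keeping one list of entries per signature mirrors the abstract's statement that the locations of additional instances of a block are added to the repository as they are stored.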

First claim

Opening claim text (preview).

What is claimed is:

1. A method of maintaining a signature repository accessible by multiple client computing devices in a data storage system, the method comprising: tracking storage of a plurality of data units in a primary storage subsystem, the plurality of tracked data units corresponding to primary data generated by one or more applications executing on a plurality of client computing devices that form the primary storage subsystem, each data unit of the plurality of tracked data units forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and is configured to maintain secondary copies of at least some of the primary data; generating, by a signature agent executing on one or more processors in the primary storage subsystem, signatures corresponding to the plurality of tracked data units; and maintaining a signature repository including a signature block for at least each unique signature of the generated signatures, where each signature block comprises: the unique signature; and one or more data unit entries, each entry corresponding to a distinct data unit of the plurality of tracked data units and associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit and that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature, wherein the first data unit forms at least a portion of a first file stored in the first primary data store and the second data unit forms at least a portion of a second file stored in the second primary data store.

2. The method of claim 1, wherein the first entry further includes location information identifying a location of the first data unit in the first primary data store and wherein the second entry further includes location information identifying a location of the data unit in the second primary data store.

3. The method of claim 1, further comprising: receiving a query including a plurality of signatures; comparing the plurality of signatures included in the query with signature blocks in the signature repository to identify a first set of signatures received in the query that correspond to data units that reside in a primary data store of at least one client computing device of the plurality of client computing devices; and for at least some of the signatures in the first set of signatures, accessing the corresponding data units from the primary data store of the at least one client computing device.

4. The method of claim 3, further comprising, for the signatures included in the query that are not included in the first set of signatures, accessing the corresponding data units from the secondary storage subsystem.

5. The method of claim 4, wherein the plurality of signatures included in the query correspond to a set of data units which represent a backed up version of a set of the primary data that is to be restored to the first primary data store, and wherein at least some of the data units corresponding to the signatures in the first set of signatures are restored from the second client computing device.

6. The method of claim 1, further comprising: in response to receipt of instructions to backup at least a subset of the primary data of the first primary data store, comparing a set of signatures corresponding to data units in the subset of the primary data with entries in the signature repository, the data units in the subset of the primary data comprising at least the first data unit; based at least in part on the comparing, identifying a set of matching data units that match the data units in the subset of the primary data and that reside in at least one other primary data store other than the first primary data store, the set of matching data units comprising at least the second data unit; and accessing the set of matching data units from the at least one other primary data store for retrieval as part of a backup set of data units.

7. The method of claim 6, further comprising: based at least in part on the comparing, identifying a set of data units of the data units in the subset of the primary data that do not have a corresponding matching data unit; accessing the set of matching data units from the first primary data store; associating the set of matching data units accessed from the at least one other primary data store with the set of data units accessed from the first primary data store to generate the backup set of data units corresponding to the data units in the subset of the primary data; and communicating the backup set to the secondary storage subsystem.

8. The method of claim 1, wherein the secondary storage subsystem comprises deduplicated data.

9. The method of claim 1, wherein the primary data store of at least one the plurality of client computing devices comprises deduplicated data.

10. A storage system, comprising: a signature repository agent executing on one or more processors in a primary storage subsystem, the primary storage subsystem comprising: a plurality of client computing devices; and a plurality of data agents executing on the plurality of client computing devices, the plurality of data agents configured to track storage of a plurality of data units in the primary storage subsystem, the plurality of data units corresponding to primary data generated by one or more applications executing on the plurality of client computing devices, each data unit forming at least a portion of at least on file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, and the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and that is configured to maintain secondary copies of at least some of the primary data, and wherein the signature repository agent is configured to maintain a signature repository including a signature block for at least each unique signature generated by one or more signature agents, each signature block comprising: the unique signature; and one or more data unit entries, each entry corresponding to a distinct data unit associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a f
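
The query flow in claims 3 and 4 can be illustrated with a short sketch: signatures found in the repository are resolved from a client's primary data store, and the rest fall back to the secondary storage subsystem. This is an assumption-laden illustration rather than the claimed method itself. The function name `plan_restore`, the plain-dict repository shape, and the choice to take the first listed entry for each signature are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical entry shape: (client_id, location within that client's primary data store)
Entry = Tuple[str, str]


def plan_restore(
    query_signatures: Iterable[str],
    repository: Dict[str, List[Entry]],
) -> Tuple[Dict[str, Entry], List[str]]:
    """Split queried signatures by where their data units can be fetched.

    Returns (from_primary, from_secondary):
      from_primary   maps a signature to one primary-store entry that holds it
      from_secondary lists signatures with no primary-store entry
    """
    from_primary: Dict[str, Entry] = {}
    from_secondary: List[str] = []
    for signature in query_signatures:
        entries = repository.get(signature, [])
        if entries:
            # Assumption: any listed instance is acceptable, so take the first.
            from_primary[signature] = entries[0]
        else:
            from_secondary.append(signature)
    return from_primary, from_secondary


# Example with made-up signatures and locations.
example_repository = {
    "aa11": [("client-2", "/data/report.doc@0"), ("client-3", "/tmp/copy.doc@0")],
    "bb22": [],
}
primary_plan, secondary_plan = plan_restore(["aa11", "bb22", "cc33"], example_repository)
print(primary_plan)
print(secondary_plan)
```

In a restore such as the one in claim 5, the entries in `primary_plan` would point at another client's primary data store, while `secondary_plan` lists the data units that must come from the secondary copies.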

Assignees

Commvault Systems Inc

Inventors

Classifications

  • Updates performed during online database operations; commit processing

  • Indexing; Data structures therefor; Storage structures

  • using management policies (point-in-time backing up or restoration of persistent data G06F11/1446; file migration policies for HSM systems G06F16/185)

  • Management of the backup or restore process

  • File or folder operations, e.g. details of user interfaces specifically adapted to file systems

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9218375B2 cover?
A storage system according to certain embodiments includes a client-side signature repository that includes information representative of a set of data blocks stored in primary storage. During storage operations of a client, the system can generate signatures corresponding to data blocks that are being stored in primary storage. The system can store the generated signatures in the client-side s…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/1458. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 22, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).