Synchronized data duplication
US-2015154220-A1 · Jun 4, 2015 · US
US9218375B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9218375-B2 |
| Application number | US-201313916458-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 12, 2013 |
| Priority date | Jun 13, 2012 |
| Publication date | Dec 22, 2015 |
| Grant date | Dec 22, 2015 |
Official abstract text for this publication.
A storage system according to certain embodiments includes a client-side signature repository that includes information representative of a set of data blocks stored in primary storage. During storage operations of a client, the system can generate signatures corresponding to data blocks that are being stored in primary storage. The system can store the generated signatures in the client-side signature repository along with information regarding the location of the corresponding data block within primary storage. As additional instances of the data block are stored in primary storage, the system can store the location of the additional instances in the client-side signature repository.
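The repository described in the abstract can be pictured as a map from each signature to the list of places where a matching block is stored. The following is a minimal sketch under stated assumptions, not the patented implementation: the names `SignatureRepository`, `DataUnitEntry`, `record`, and `entries` are hypothetical, and SHA-256 is assumed as the signature function because the abstract does not name one.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUnitEntry:
    """One stored instance of a data block (field names are illustrative)."""
    client_id: str  # client whose primary data store holds the block
    location: str   # where the block sits inside that store


class SignatureRepository:
    """Maps each unique signature to every known location of that block."""

    def __init__(self):
        self._blocks = defaultdict(list)

    @staticmethod
    def signature_of(data: bytes) -> str:
        # Assumption: a SHA-256 digest stands in for the signature.
        return hashlib.sha256(data).hexdigest()

    def record(self, data: bytes, client_id: str, location: str) -> str:
        """Generate the block's signature and remember where it is stored."""
        sig = self.signature_of(data)
        entry = DataUnitEntry(client_id, location)
        if entry not in self._blocks[sig]:
            self._blocks[sig].append(entry)
        return sig

    def entries(self, sig: str) -> list:
        """Return a copy of all known locations for a signature."""
        return list(self._blocks.get(sig, []))


repo = SignatureRepository()
sig_a = repo.record(b"quarterly report", "client-1", "/docs/q1.bin@0")
sig_b = repo.record(b"quarterly report", "client-2", "/mail/att.bin@4096")
```

Storing the same bytes from two clients yields a single signature with two entries, which is the "additional instances" case the abstract describes.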
Opening claim text (preview).
What is claimed is:
1. A method of maintaining a signature repository accessible by multiple client computing devices in a data storage system, the method comprising: tracking storage of a plurality of data units in a primary storage subsystem, the plurality of tracked data units corresponding to primary data generated by one or more applications executing on a plurality of client computing devices that form the primary storage subsystem, each data unit of the plurality of tracked data units forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and is configured to maintain secondary copies of at least some of the primary data; generating, by a signature agent executing on one or more processors in the primary storage subsystem, signatures corresponding to the plurality of tracked data units; and maintaining a signature repository including a signature block for at least each unique signature of the generated signatures, where each signature block comprises: the unique signature; and one or more data unit entries, each entry corresponding to a distinct data unit of the plurality of tracked data units and associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit and that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature, wherein the first data unit forms at least a portion of a first file stored in the first primary data store and the second data unit forms at least a portion of a second file stored in the second primary data store.
2. The method of claim 1, wherein the first entry further includes location information identifying a location of the first data unit in the first primary data store and wherein the second entry further includes location information identifying a location of the data unit in the second primary data store.
3. The method of claim 1, further comprising: receiving a query including a plurality of signatures; comparing the plurality of signatures included in the query with signature blocks in the signature repository to identify a first set of signatures received in the query that correspond to data units that reside in a primary data store of at least one client computing device of the plurality of client computing devices; and for at least some of the signatures in the first set of signatures, accessing the corresponding data units from the primary data store of the at least one client computing device.
4. The method of claim 3, further comprising, for the signatures included in the query that are not included in the first set of signatures, accessing the corresponding data units from the secondary storage subsystem.
5. The method of claim 4, wherein the plurality of signatures included in the query correspond to a set of data units which represent a backed up version of a set of the primary data that is to be restored to the first primary data store, and wherein at least some of the data units corresponding to the signatures in the first set of signatures are restored from the second client computing device.
6. The method of claim 1, further comprising: in response to receipt of instructions to backup at least a subset of the primary data of the first primary data store, comparing a set of signatures corresponding to data units in the subset of the primary data with entries in the signature repository, the data units in the subset of the primary data comprising at least the first data unit; based at least in part on the comparing, identifying a set of matching data units that match the data units in the subset of the primary data and that reside in at least one other primary data store other than the first primary data store, the set of matching data units comprising at least the second data unit; and accessing the set of matching data units from the at least one other primary data store for retrieval as part of a backup set of data units.
7. The method of claim 6, further comprising: based at least in part on the comparing, identifying a set of data units of the data units in the subset of the primary data that do not have a corresponding matching data unit; accessing the set of matching data units from the first primary data store; associating the set of matching data units accessed from the at least one other primary data store with the set of data units accessed from the first primary data store to generate the backup set of data units corresponding to the data units in the subset of the primary data; and communicating the backup set to the secondary storage subsystem.
8. The method of claim 1, wherein the secondary storage subsystem comprises deduplicated data.
9. The method of claim 1, wherein the primary data store of at least one of the plurality of client computing devices comprises deduplicated data.
10. A storage system, comprising: a signature repository agent executing on one or more processors in a primary storage subsystem, the primary storage subsystem comprising: a plurality of client computing devices; and a plurality of data agents executing on the plurality of client computing devices, the plurality of data agents configured to track storage of a plurality of data units in the primary storage subsystem, the plurality of data units corresponding to primary data generated by one or more applications executing on the plurality of client computing devices, each data unit forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, and the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and that is configured to maintain secondary copies of at least some of the primary data, and wherein the signature repository agent is configured to maintain a signature repository including a signature block for at least each unique signature generated by one or more signature agents, each signature block comprising: the unique signature; and one or more data unit entries, each entry corresponding to a distinct data unit associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a f
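Claims 3 through 5 describe answering a query of signatures by separating those whose data units reside in some client's primary data store from those that must be fetched from the secondary storage subsystem. A minimal sketch of that partitioning step follows, assuming the repository is a plain dictionary from signature to a list of `(client_id, location)` pairs. The function name `split_query` and the `exclude_client` parameter (used here to skip the client being restored) are hypothetical and do not appear in the claims.

```python
def split_query(signatures, repository, exclude_client=None):
    """Partition queried signatures by where their data units can be read.

    repository: dict mapping signature -> list of (client_id, location).
    Returns (from_primary, from_secondary):
      from_primary   maps signature -> one (client_id, location) to read from
      from_secondary lists signatures with no usable primary copy
    """
    from_primary = {}
    from_secondary = []
    for sig in signatures:
        # Keep only copies held by clients other than the excluded one.
        candidates = [
            entry for entry in repository.get(sig, [])
            if entry[0] != exclude_client
        ]
        if candidates:
            from_primary[sig] = candidates[0]
        else:
            from_secondary.append(sig)
    return from_primary, from_secondary


# Illustrative data only; signatures and paths are made up.
repository = {
    "sig-a": [("client-1", "/data/a"), ("client-2", "/data/a-copy")],
    "sig-b": [("client-1", "/data/b")],
}
primary, secondary = split_query(
    ["sig-a", "sig-b", "sig-c"], repository, exclude_client="client-1"
)
```

In this example a restore to `client-1` would read `sig-a` from `client-2` and fall back to secondary storage for `sig-b` and `sig-c`. Taking the first candidate is an arbitrary choice; a real system might weigh network proximity or client load instead.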
Updates performed during online database operations; commit processing · CPC title
Indexing; Data structures therefor; Storage structures · CPC title
using management policies (point-in-time backing up or restoration of persistent data G06F11/1446; file migration policies for HSM systems G06F16/185) · CPC title
Management of the backup or restore process · CPC title
File or folder operations, e.g. details of user interfaces specifically adapted to file systems · CPC title