Push-based piggyback system for source-driven logical replication in a storage environment
US-8930311-B1 · Jan 6, 2015 · US
US9916100B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9916100-B2 |
| Application number | US-201414587419-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 31, 2014 |
| Priority date | Dec 14, 2012 |
| Publication date | Mar 13, 2018 |
| Grant date | Mar 13, 2018 |
Official abstract text for this publication.
The disclosed techniques enable push-based piggybacking of a source-driven logical replication system. Logical replication of a data set (e.g., a snapshot) from a source node to a destination node can be achieved from a source-driven system while preserving the effects of storage efficiency operations (deduplication) applied at the source node. However, if missing data extents are detected at the destination, the destination has an extent pulling problem as the destination may not have knowledge of the physical layout on the source-side and/or mechanisms for requesting extents. The techniques overcome the extent pulling problem in a source-driven replication system by introducing specific protocols for obtaining missing extents within an existing replication environment by piggybacking data pushes from the source.
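The exchange described in the abstract can be illustrated with a minimal sketch. It assumes an in-memory source and destination; the class names, method names, and the `piggyback` field are hypothetical and do not come from the patent. It is meant only to show the order of events: the destination records a missing extent, answers the source's push inquiry with the extents it has not yet requested, and the source pushes those extents back in a stream marked with a piggyback flag.

```python
from dataclasses import dataclass, field


@dataclass
class Extent:
    extent_id: int
    data: bytes
    # Hypothetical flag: True when the extent is pushed because the
    # destination asked for it in a push inquiry response.
    piggyback: bool = False


@dataclass
class Source:
    extents: dict  # extent_id -> bytes held on the source side

    def push(self, requested_ids):
        """Push the requested extents, flagged as piggybacked."""
        return [Extent(i, self.extents[i], piggyback=True) for i in requested_ids]


@dataclass
class Destination:
    received: dict = field(default_factory=dict)
    # extent_id -> has this extent already been requested?
    missing: dict = field(default_factory=dict)

    def note_missing(self, extent_id):
        self.missing.setdefault(extent_id, False)

    def answer_push_inquiry(self):
        """Return only the missing extents that were not requested before."""
        to_request = [i for i, requested in self.missing.items() if not requested]
        for i in to_request:
            self.missing[i] = True
        return to_request

    def receive(self, stream):
        for ext in stream:
            self.received[ext.extent_id] = ext.data
            if ext.piggyback:
                self.missing.pop(ext.extent_id, None)


def run_round(source, dest):
    """One inquiry/response/push round; the source's inquiry is implicit."""
    response = dest.answer_push_inquiry()
    stream = source.push(response)
    dest.receive(stream)
    return stream
```

In this sketch the destination never needs to know the source's physical layout: it names extents by identifier and waits for the source to push them.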
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer readable storage medium having instructions stored thereon, which when executed by one or more processors of a machine, causes the machine to: identify, at a first node, a missing extent of a plurality of extents associated with a replicated data set of an original data set at a second node, wherein the replicated data set is reconstructed at the first node; maintain a table comprising entries mapping missing extent identifiers of missing extents to request indications as to whether the missing extents have already been requested by push inquiry responses sent by the first node to the second node; sort the entries mapping missing extent identifiers to request indications based upon virtual volume block numbers associated with the missing extent identifiers; receive a push inquiry from the second node; send, from the first node for delivery to the second node, a push inquiry response requesting the missing extent, the push inquiry response for causing the second node to initiate a push of data of the missing extent based upon the table indicating that the missing extent has not yet been requested; and receive, at the first node, a data stream pushed from the second node, the data stream comprising both data of the missing extent requested by the first node and a piggyback flag indicating that the missing extent was requested by the first node through the push inquiry response.
2. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: create an entry mapping a missing extent identifier of the missing extent to a request indication that the missing extent has not yet been requested.
3. The non-transitory computer readable storage medium of claim 2, wherein the instructions, when executed by the one or more processors, further cause the machine to update the request indication within the entry to indicate that the missing extent was requested through the push inquiry response based upon sending the push inquiry response.
4. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: determine, at the first node, if the missing extent has already been requested via a prior push inquiry response by looking up the missing extent in the table.
5. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: query the table to identify a second entry for a second missing extent; determine that the second entry comprises a second request indication that the second missing extent has already been requested; and determine that the second missing extent is not to be requested through the push inquiry response based upon the second missing extent already being request.
6. The non-transitory computer readable storage medium of claim 4, wherein the missing extent identifiers are sorted in the table for searching in O(log N).
7. The non-transitory computer readable storage medium of claim 1, wherein the replicated data set is deduplicated at the second node, and when the replicated data set is reconstructed at the first node, the replicated data set maintains the deduplication.
8. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: receive, at the first node, a request for the missing block from a destination module.
9. The non-transitory computer readable storage medium of claim 1, wherein the data set comprises a point-in-time image.
10. A method comprising: receiving, at a first node of a source-driven replication system, a first data stream including a plurality of extents associated with a data set; identifying, at the first node, a missing extent of the plurality of extents associated with the data set; maintaining a table comprising entries mapping missing extent identifiers of missing extents to request indications as to whether the missing extents have already been requested by push inquiry responses sent by the first node to the second node; sorting the entries mapping missing extent identifiers to request indications based upon virtual volume block numbers associated with the missing extent identifiers; receiving a push inquiry from a second node; sending, from the first node for delivery to the second node, a push inquiry response requesting the missing extent, the push inquiry response for causing the second node to initiate a push of data of the missing extent based upon the table indicating that the missing extent has not yet been requested; and receiving, at the first node, a second data stream including both data of the missing extent requested by the first node and a piggyback flag indicating that the missing extent was requested by the first node through the push inquiry response.
11. The method of claim 10, wherein the first data stream does not include the piggyback flag and the second data stream includes the piggyback flag indicating that the missing extent was indicated as missing in the push inquiry response.
12. The method of claim 10, further comprising: querying the table to identify a second entry for a second missing extent; determining that the second entry comprises a second request indication that the second missing extent has already been requested; and determining that the second missing extent is not to be requested through the push inquiry response based upon the second missing extent already being request.
13. The method of claim 10, further comprising: determining, at the first node, if the missing extent has already been requested via a prior push inquiry response by looking up the missing extent in the table.
14. A first node, comprising: a memory having stored thereon instructions for performing a method; and a processor coupled to the memory, the processor configured to execute the instructions to cause the processor to: receive, from a second node, a plurality of extents associated with a replicated data set, wherein the replicated data set is reconstructed at the first node; identify a missing extent of the plurality of extents associated with the replicated data set; maintain a table comprising entries mapping missing extent identifiers of missing extents to request indications as to whether the missing extents have already been requested by push inquiry responses sent by the first node to the second node; sort the entries mapping missing extent identifiers to request indications based upon virtual volume block numbers associated with the missing extent identifiers; receive a push inquiry from the second node; send, for delivery to the second node, a push inquiry response requesting the missing extent, wherein the push inquiry response is for causing the second node to initiate a push of data of the missing extent based upon the table indicating that the missing extent has not yet been requested; and receive a data stream pushed from the second node, the data stream comprising both data of the missing extent requested by the first node and a piggyback flag indicating that the missing extent was requested by the first node through the push inquiry response.
15. The first node of claim 14, wherein the instructions to cause the processor to create an entry mapping a missing extent identifier of the missing extent to a request indication that the missing extent has not
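The table recited in the claims (missing extent identifiers mapped to request indications, sorted by virtual volume block number, searchable in O(log N)) might be sketched as below. This is an illustration under stated assumptions, not the patented implementation: the class name `MissingExtentTable` and its methods are hypothetical, and for simplicity the virtual volume block number (VVBN) itself serves as the extent identifier.

```python
import bisect


class MissingExtentTable:
    """Hypothetical table of missing extents, kept sorted by VVBN."""

    def __init__(self):
        self._vvbns = []      # sorted VVBNs of missing extents
        self._requested = []  # parallel list of request indications

    def add(self, vvbn):
        """Create an entry marked 'not yet requested' if none exists."""
        i = bisect.bisect_left(self._vvbns, vvbn)
        if i == len(self._vvbns) or self._vvbns[i] != vvbn:
            self._vvbns.insert(i, vvbn)
            self._requested.insert(i, False)

    def _find(self, vvbn):
        """Binary search; returns the entry index or -1 if absent."""
        i = bisect.bisect_left(self._vvbns, vvbn)
        if i < len(self._vvbns) and self._vvbns[i] == vvbn:
            return i
        return -1

    def already_requested(self, vvbn):
        i = self._find(vvbn)
        return i >= 0 and self._requested[i]

    def mark_requested(self, vvbn):
        """Update the request indication after a push inquiry response."""
        i = self._find(vvbn)
        if i < 0:
            raise KeyError(vvbn)
        self._requested[i] = True

    def pending(self):
        """Extents to name in the next push inquiry response, in VVBN order."""
        return [v for v, r in zip(self._vvbns, self._requested) if not r]
```

Lookups use binary search over the sorted keys, which gives the logarithmic search time mentioned in claim 6. Insertion into a Python list is linear in the worst case, so a balanced tree would be a closer fit if insert cost mattered.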
Replication mechanisms · CPC title
in relation to data integrity, e.g. data losses, bit errors · CPC title
implemented as replicated file system · CPC title
De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title
Techniques for file synchronisation in file systems · CPC title