Push-based piggyback system for source-driven logical replication in a storage environment
US-8930311-B1 · Jan 6, 2015 · US
US10620852B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10620852-B2 |
| Application number | US-201715851895-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2017 |
| Priority date | Dec 14, 2012 |
| Publication date | Apr 14, 2020 |
| Grant date | Apr 14, 2020 |
Official abstract text for this publication.
The disclosed techniques enable push-based piggybacking of a source-driven logical replication system. Logical replication of a data set (e.g., a snapshot) from a source node to a destination node can be achieved from a source-driven system while preserving the effects of storage efficiency operations (deduplication) applied at the source node. However, if missing data extents are detected at the destination, the destination has an extent pulling problem as the destination may not have knowledge of the physical layout on the source-side and/or mechanisms for requesting extents. The techniques overcome the extent pulling problem in a source-driven replication system by introducing specific protocols for obtaining missing extents within an existing replication environment by piggybacking data pushes from the source.
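The round trip described in the abstract can be illustrated with a minimal sketch. It assumes a simplified in-memory model: the `Source`, `Destination`, and `Push` classes, their method names, and the integer extent identifiers are all hypothetical and do not come from the patent text. The destination never asks for extents by physical location. It only reports what is still missing in its response to a push, and the source attaches those extents to its next push together with a piggyback marker.

```python
from dataclasses import dataclass, field


@dataclass
class Push:
    """Hypothetical data stream sent from source to destination."""
    extents: dict                                 # extent id -> data
    piggyback: set = field(default_factory=set)   # ids resent on request


class Source:
    def __init__(self, store):
        # Only the source knows where extents live; modeled here as a dict.
        self.store = store

    def push(self, scheduled, requested=()):
        # Regular extents plus any the destination reported as missing.
        ids = list(scheduled) + [e for e in requested if e not in scheduled]
        data = {e: self.store[e] for e in ids}
        return Push(extents=data, piggyback=set(requested))


class Destination:
    def __init__(self):
        self.extents = {}

    def receive(self, push, needed):
        # Apply the push, then answer with the extents still missing
        # (the "push inquiry response" in this simplified model).
        self.extents.update(push.extents)
        return [e for e in needed if e not in self.extents]


source = Source({1: b"a", 2: b"b", 3: b"c"})
dest = Destination()
needed = [1, 2, 3]

# The first push omits extent 3, so the destination reports it as missing.
missing = dest.receive(source.push([1, 2]), needed)

# The source piggybacks the missing extent on its next push.
follow_up = source.push([], requested=missing)
still_missing = dest.receive(follow_up, needed)
```

In this model the destination needs no knowledge of the source-side layout: the only information flowing back to the source is a list of extent identifiers.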
Opening claim text (preview).
The invention claimed is:
1. A non-transitory computer readable storage medium having instructions stored thereon, which when executed by one or more processors of a machine, causes the machine to: identify, at a first node, a missing extent associated with a replicated data set, of an original data set at a second node, to reconstructed at the first node; send, from the first node to the second node, a push inquiry response to cause the second node to send data of the missing extent based upon a determination that an extent dictionary, maintained by the first node to track previously requested missing extents such that the first node refrains from re-requesting missing extents tracked within the extent dictionary, lacks an entry for the missing extent; add a new entry into the extent dictionary to indicate that the missing extent has been requested through the push inquiry response; and receive, from the second node at the first node, a data stream comprising data of the missing extent and a piggyback flag indicating that the first node requested the missing extent through the push inquiry response, wherein the new entry marked as requested in the extent dictionary based upon receipt of the piggyback flag.
2. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: reconstruct, at the first node, the replicated data set from a plurality of extents in response to reception of the missing extent.
3. The non-transitory computer readable storage medium of claim 1, wherein the data stream further includes the piggyback flag indicating that the missing extent was indicated as missing in the push inquiry response.
4. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: determine, at the first node, if the missing extent has already been requested via the push inquiry response by looking up the missing extent in the extent dictionary.
5. The non-transitory computer readable storage medium of claim 4 wherein the instructions, when executed by the one or more processors, further cause the machine to: refrain from subsequently re-requesting the missing extent based upon the extent dictionary comprising the new entry for the missing extent.
6. The non-transitory computer readable storage medium of claim 4, wherein the first node searches the extent dictionary in O(log N).
7. The non-transitory computer readable storage medium of claim 1, wherein the replicated data set is deduplicated at the second node, and when the replicated data set is reconstructed at the first node, the replicated data set maintains the deduplication.
8. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: receive, at the first node, a request for the missing extent from a destination module.
9. The non-transitory computer readable storage medium of claim 1, wherein the data set comprises a point-in-time image.
10. A method comprising: identifying, at a first node, a missing extent associated with a replicated data set, of an original data set at a second node, to reconstruct at the first node; sending, from the first node to the second node, a push inquiry response to cause the second node to send data of the missing extent based upon a determination that an extent dictionary, maintained by the first node to track previously requested missing extents such that the first node refrains from re-requesting missing extents tracked within the extent dictionary, lacks an entry for the missing extent; adding a new entry into the extent dictionary to indicate that the missing extent has been requested through the push inquiry response; and receiving, from the second node at the first node, a data stream comprising data of the missing extent and a piggyback flag indicating that the first node requested the missing extent through the push inquiry response, wherein the new entry marked as requested in from the extent dictionary based upon receipt of the piggyback flag.
11. The method of claim 10, comprising: reconstructing, at the first node, the replicated data set from a plurality of extents in response to reception of the missing extent.
12. The method of claim 10, wherein the data stream further includes the piggyback flag indicating that the missing extent was indicated as missing in the push inquiry response.
13. The method of claim 10, further comprising: determining, at the first node, if the missing extent has already been requested via the push inquiry response by determining whether the extent dictionary comprises an entry for the missing extent.
14. A first node, comprising: a memory having stored thereon instructions for performing a method; and a processor coupled to the memory, the processor configured to execute the instructions to cause the processor to: identify, at the first node, a missing extent associated with a replicated data set, of an original data set at a second node, to reconstruct at the first node; send, from the first node to the second node, a push inquiry response to cause the second node to send data of the missing extent based upon a determination that an extent dictionary, maintained by the first node to track previously requested missing extents such that the first node refrains from re-requesting missing extents tracked within the extent dictionary, lacks an entry for the missing extent; add a new entry into the extent dictionary to indicate that the missing extent has been requested through the push inquiry response; and receive, from the second node at the first node, a data stream comprising data of the missing extent and a piggyback flag indicating that the first node requested the missing extent through the push inquiry response, wherein the new entry marked as requested in from the extent dictionary based upon receipt of the piggyback flag.
15. The first node of claim 14, wherein the instructions to cause the processor to reconstruct the replicated data set from a plurality of extents on reception of the missing extent.
16. The first node of claim 14, wherein the instructions to cause the processor to evaluate the extent dictionary to determine whether missing extents have already been requested via the push inquiry responses.
17. The first node of claim 16, wherein the instructions to cause the processor to sort the extent dictionary based upon virtual volume block numbers.
18. The first node of claim 16, wherein entries within the extent dictionary for missing extent identifiers are sorted in O(N*log N) in the extent dictionary for searching in O(log N).
19. The first node of claim 16, wherein the instructions to cause the processor to receive an indication of the missing extent block from a destination customer.
20. The first node of claim 16, wherein the instructions to cause the processor to determine, at the first node, if the missing extent has already been requested via the push inquiry response by looking up the missing extent in the extent dictionary.
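The extent dictionary recited in the claims, with its sort in O(N*log N) and lookup in O(log N), could look roughly like the sketch below. This is an illustration under stated assumptions, not the patented implementation: the `ExtentDictionary` class, the `build_push_inquiry_response` helper, and the use of plain integers as stand-ins for virtual volume block numbers are all hypothetical. A sorted Python list with the standard `bisect` module is one simple way to get the stated complexities.

```python
import bisect


class ExtentDictionary:
    """Hypothetical record of missing extents that were already requested."""

    def __init__(self):
        self._keys = []  # requested extent identifiers, kept sorted

    def contains(self, extent_id):
        # Binary search over the sorted keys: O(log N).
        i = bisect.bisect_left(self._keys, extent_id)
        return i < len(self._keys) and self._keys[i] == extent_id

    def add_requested(self, extent_ids):
        # Merge a batch of new identifiers, then re-sort: O(N log N).
        self._keys = sorted(set(self._keys).union(extent_ids))


def build_push_inquiry_response(missing, dictionary):
    """Return only the missing extents that have not been requested before."""
    to_request = sorted({e for e in missing if not dictionary.contains(e)})
    # Record the new requests so later calls refrain from re-requesting them.
    dictionary.add_requested(to_request)
    return to_request


d = ExtentDictionary()
first = build_push_inquiry_response([42, 7, 42], d)
second = build_push_inquiry_response([7, 99], d)
print(first)   # [7, 42]
print(second)  # [99]
```

The second call leaves out extent 7 because the dictionary already holds an entry for it, which mirrors the "refrains from re-requesting" language of claims 1 and 5.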
De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
implemented as replicated file system · CPC title
Techniques for file synchronisation in file systems · CPC title
in relation to data integrity, e.g. data losses, bit errors · CPC title