Push-based piggyback system for source-driven logical replication in a storage environment

US9916100B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9916100-B2
Application number  US-201414587419-A
Country             US
Kind code           B2
Filing date         Dec 31, 2014
Priority date       Dec 14, 2012
Publication date    Mar 13, 2018
Grant date          Mar 13, 2018

Abstract

Official abstract text for this publication.

The disclosed techniques enable push-based piggybacking of a source-driven logical replication system. Logical replication of a data set (e.g., a snapshot) from a source node to a destination node can be achieved from a source-driven system while preserving the effects of storage efficiency operations (deduplication) applied at the source node. However, if missing data extents are detected at the destination, the destination has an extent pulling problem as the destination may not have knowledge of the physical layout on the source-side and/or mechanisms for requesting extents. The techniques overcome the extent pulling problem in a source-driven replication system by introducing specific protocols for obtaining missing extents within an existing replication environment by piggybacking data pushes from the source.
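The exchange described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation; the class names, method names, and the tuple layout of a pushed record are assumptions made for illustration only. The point it shows is that the destination never pulls data: it only names missing extents when answering an inquiry from the source, and the source then pushes those extents with a piggyback flag set.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical source node that owns the original extents."""
    extents: dict  # extent id -> extent data

    def push(self, extent_ids, piggyback=False):
        # Each pushed record is (extent id, data, piggyback flag).
        return [(eid, self.extents[eid], piggyback) for eid in extent_ids]


@dataclass
class Destination:
    """Hypothetical destination node that reconstructs the data set."""
    expected: set                      # extent ids the replica refers to
    received: dict = field(default_factory=dict)

    def receive(self, stream):
        for eid, data, _piggyback in stream:
            self.received[eid] = data

    def answer_push_inquiry(self):
        # The destination cannot pull; it only lists what is missing
        # in its response to an inquiry initiated by the source.
        return sorted(self.expected - set(self.received))


source = Source(extents={1: b"a", 2: b"b", 3: b"c"})
dest = Destination(expected={1, 2, 3})

dest.receive(source.push([1, 3]))            # ordinary replication push
missing = dest.answer_push_inquiry()         # response to the source's inquiry
second_stream = source.push(missing, piggyback=True)
dest.receive(second_stream)
```

After the second push, the destination's next inquiry response in this sketch is empty, since every expected extent has arrived.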

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer readable storage medium having instructions stored thereon, which when executed by one or more processors of a machine, causes the machine to: identify, at a first node, a missing extent of a plurality of extents associated with a replicated data set of an original data set at a second node, wherein the replicated data set is reconstructed at the first node; maintain a table comprising entries mapping missing extent identifiers of missing extents to request indications as to whether the missing extents have already been requested by push inquiry responses sent by the first node to the second node; sort the entries mapping missing extent identifiers to request indications based upon virtual volume block numbers associated with the missing extent identifiers; receive a push inquiry from the second node; send, from the first node for delivery to the second node, a push inquiry response requesting the missing extent, the push inquiry response for causing the second node to initiate a push of data of the missing extent based upon the table indicating that the missing extent has not yet been requested; and receive, at the first node, a data stream pushed from the second node, the data stream comprising both data of the missing extent requested by the first node and a piggyback flag indicating that the missing extent was requested by the first node through the push inquiry response.

2. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: create an entry mapping a missing extent identifier of the missing extent to a request indication that the missing extent has not yet been requested.

3. The non-transitory computer readable storage medium of claim 2, wherein the instructions, when executed by the one or more processors, further cause the machine to update the request indication within the entry to indicate that the missing extent was requested through the push inquiry response based upon sending the push inquiry response.

4. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: determine, at the first node, if the missing extent has already been requested via a prior push inquiry response by looking up the missing extent in the table.

5. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: query the table to identify a second entry for a second missing extent; determine that the second entry comprises a second request indication that the second missing extent has already been requested; and determine that the second missing extent is not to be requested through the push inquiry response based upon the second missing extent already being request.

6. The non-transitory computer readable storage medium of claim 4, wherein the missing extent identifiers are sorted in the table for searching in O(log N).

7. The non-transitory computer readable storage medium of claim 1, wherein the replicated data set is deduplicated at the second node, and when the replicated data set is reconstructed at the first node, the replicated data set maintains the deduplication.

8. The non-transitory computer readable storage medium of claim 1 wherein the instructions, when executed by the one or more processors, further cause the machine to: receive, at the first node, a request for the missing block from a destination module.

9. The non-transitory computer readable storage medium of claim 1, wherein the data set comprises a point-in-time image.

10. A method comprising: receiving, at a first node of a source-driven replication system, a first data stream including a plurality of extents associated with a data set; identifying, at the first node, a missing extent of the plurality of extents associated with the data set; maintaining a table comprising entries mapping missing extent identifiers of missing extents to request indications as to whether the missing extents have already been requested by push inquiry responses sent by the first node to the second node; sorting the entries mapping missing extent identifiers to request indications based upon virtual volume block numbers associated with the missing extent identifiers; receiving a push inquiry from a second node; sending, from the first node for delivery to the second node, a push inquiry response requesting the missing extent, the push inquiry response for causing the second node to initiate a push of data of the missing extent based upon the table indicating that the missing extent has not yet been requested; and receiving, at the first node, a second data stream including both data of the missing extent requested by the first node and a piggyback flag indicating that the missing extent was requested by the first node through the push inquiry response.

11. The method of claim 10, wherein the first data stream does not include the piggyback flag and the second data stream includes the piggyback flag indicating that the missing extent was indicated as missing in the push inquiry response.

12. The method of claim 10, further comprising: querying the table to identify a second entry for a second missing extent; determining that the second entry comprises a second request indication that the second missing extent has already been requested; and determining that the second missing extent is not to be requested through the push inquiry response based upon the second missing extent already being request.

13. The method of claim 10, further comprising: determining, at the first node, if the missing extent has already been requested via a prior push inquiry response by looking up the missing extent in the table.

14. A first node, comprising: a memory having stored thereon instructions for performing a method; and a processor coupled to the memory, the processor configured to execute the instructions to cause the processor to: receive, from a second node, a plurality of extents associated with a replicated data set, wherein the replicated data set is reconstructed at the first node; identify a missing extent of the plurality of extents associated with the replicated data set; maintain a table comprising entries mapping missing extent identifiers of missing extents to request indications as to whether the missing extents have already been requested by push inquiry responses sent by the first node to the second node; sort the entries mapping missing extent identifiers to request indications based upon virtual volume block numbers associated with the missing extent identifiers; receive a push inquiry from the second node; send, for delivery to the second node, a push inquiry response requesting the missing extent, wherein the push inquiry response is for causing the second node to initiate a push of data of the missing extent based upon the table indicating that the missing extent has not yet been requested; and receive a data stream pushed from the second node, the data stream comprising both data of the missing extent requested by the first node and a piggyback flag indicating that the missing extent was requested by the first node through the push inquiry response.

15. The first node of claim 14, wherein the instructions to cause the processor to create an entry mapping a missing extent identifier of the missing extent to a request indication that the missing extent has not
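The table recited in the claims (entries mapping missing extents to a "requested" indication, kept sorted by virtual volume block number so a lookup takes O(log N)) could be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the class and method names are hypothetical, and for simplicity the virtual volume block number (VVBN) itself stands in for the missing extent identifier.

```python
import bisect


class MissingExtentTable:
    """Hypothetical destination-side table of missing extents."""

    def __init__(self):
        self._vvbns = []      # VVBNs of missing extents, kept sorted
        self._requested = []  # parallel list: has this extent been requested?

    def _find(self, vvbn):
        # Binary search over the sorted keys: O(log N).
        i = bisect.bisect_left(self._vvbns, vvbn)
        if i < len(self._vvbns) and self._vvbns[i] == vvbn:
            return i
        return None

    def add_missing(self, vvbn):
        # New entries start as "not yet requested".
        if self._find(vvbn) is None:
            i = bisect.bisect_left(self._vvbns, vvbn)
            self._vvbns.insert(i, vvbn)
            self._requested.insert(i, False)

    def already_requested(self, vvbn):
        i = self._find(vvbn)
        return i is not None and self._requested[i]

    def build_push_inquiry_response(self):
        # Request only extents not asked for before, then mark them requested.
        to_request = [v for v, r in zip(self._vvbns, self._requested) if not r]
        self._requested = [True] * len(self._requested)
        return to_request

    def on_extent_received(self, vvbn, piggyback):
        # A piggyback-flagged extent answers an earlier request: drop its entry.
        i = self._find(vvbn)
        if piggyback and i is not None:
            del self._vvbns[i]
            del self._requested[i]


table = MissingExtentTable()
for vvbn in (30, 10, 20):
    table.add_missing(vvbn)
first_response = table.build_push_inquiry_response()
second_response = table.build_push_inquiry_response()
```

Marking entries as requested when the response is built is what keeps the same extent from being requested again on the next push inquiry. Note that inserting into a Python list is O(N); only the lookup is logarithmic in this sketch.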

Assignees

Netapp Inc

Inventors

Classifications

  • Replication mechanisms · CPC title

  • G06F3/0619 (Primary)

    in relation to data integrity, e.g. data losses, bit errors · CPC title

  • implemented as replicated file system · CPC title

  • De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title

  • G06F16/178 (Primary)

    Techniques for file synchronisation in file systems · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Netapp Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/0619. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).