System and method for data replication using a single master failover protocol

US2020228393A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2020228393-A1
Application numberUS-202016833334-A
CountryUS
Kind codeA1
Filing dateMar 27, 2020
Priority dateJan 17, 2012
Publication dateJul 16, 2020
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system that implements a data storage service may store data on behalf of storage service clients. The system may maintain data in multiple replicas of various partitions that are stored on respective computing nodes in the system. The system may employ a single master failover protocol, usable when a replica attempts to become the master replica for a replica group of which it is a member. Attempting to become the master replica may include acquiring a lock associated with the replica group, and gathering state information from the other replicas in the group. The state information may indicate whether another replica supports the attempt (in which case it is included in a failover quorum) or stores more recent data or metadata than the replica attempting to become the master (in which case synchronization may be required). If the failover quorum includes enough replicas, the replica may become the master.

First claim

Opening claim text (preview).

1 .- 35 . (canceled) 36 . A system, comprising: a plurality of replicas that collectively form a replica group, wherein the plurality of replicas store data on respective computing nodes of a plurality of computing nodes that collectively implement a data store, wherein at most one of the plurality of replicas can perform a first role for the replica group, and wherein, to attempt to assume the first role for the replica group, a candidate replica of the plurality of replicas is configured to: acquire a lock from an external lock manager, and responsive to acquisition of the lock: include, in a failover quorum, ones of the plurality of replicas other than the candidate replica identified to support the attempt of the candidate replica to assume the first role; and assume the first role, based on a determination that a number of replicas included in the failover quorum meets or exceeds a pre-determined number of replicas. 37 . The system of claim 36 , wherein: the replica group maintains an indicator of membership version, and to identify that a replica supports the attempt of the candidate replica to assume the first role, the candidate replica is configured to: determine that the replica has not observed a more recent membership version than a most recent membership version observed by the candidate replica, wherein the most recent membership version is incremented each time a membership change is made in the replica group. 38 . The system of claim 36 , wherein to identify that a replica supports the attempt of the candidate replica to assume the first role, the candidate replica is configured to: determine that the replica has not seen a more recent value for the lock than a most recent lock value acquired by the candidate replica, wherein the most recent lock value is incremented each time the lock is acquired by a different replica. 39 . The system of claim 36 , the candidate replica further configured to: gather state information from at least some of the plurality of replicas other than the candidate replica until: the state information has been gathered from all of the plurality of replicas other than the candidate replica, it is determined that there are not enough replicas supporting the attempt to be able to add the pre-determined number of replicas to the failover quorum, or a pre-determined time limit is reached. 40 . The system of claim 39 , wherein to identify that a replica supports the attempt of the candidate replica to assume the first role, the candidate replica is configured to: determine that the replica is hosted on a computing node from which state information for the replica is gathered. 41 . The system of claim 36 , wherein the attempt to assume the first role for the replica group is performed in response to: a failure of a current replica performing the first role, a failure of a computing node on which the current replica performing the first role is hosted, a communication failure between the current replica performing the first role and one or more other components of the data store, or a membership change in the replica group. 42 . The system of claim 36 , wherein the pre-determined number of replicas is expressed in terms of a number of replicas stored on computing nodes in each of a particular number of different locations. 43 . A computer-implemented method, comprising: attempting, by a candidate replica of a plurality of replicas that collectively form a replica group, to assume a first role for the replica group, wherein the plurality of replicas store data on respective computing nodes of a plurality of computing nodes that collectively implement a data store, wherein at most one of the plurality of replicas can perform the first role for the replica group, and wherein attempting to assume the first role comprises: acquiring a lock from an external lock manager, and responsive to acquiring the lock: including, in a failover quorum, ones of the plurality of replicas other than the candidate replica identified to support the attempt of the candidate replica to assume the first role; and assuming the first role in response to determining, based on a number of replicas included in the failover quorum meeting or exceeding a pre-determined number of replicas, that the candidate replica can assume the first role. 44 . The computer-implemented method of claim 43 , wherein the replica group maintains an indicator of membership version, wherein identifying that a replica supports the attempt of the candidate replica to assume the first role comprises: determining that the replica has not observed a more recent membership version than a most recent membership version observed by the candidate replica, wherein the most recent membership version is incremented each time a membership change is made in the replica group. 45 . The computer-implemented method of claim 43 , wherein identifying that a replica supports the attempt of the candidate replica to assume the first role comprises: determining that the replica has not seen a more recent value for the lock than a most recent lock value acquired by the candidate replica, wherein the most recent lock value is incremented each time the lock is acquired by a different replica. 46 . The computer-implemented method of claim 43 , further comprising: gathering state information from at least some of the plurality of replicas other than the candidate replica until: the state information has been gathered from all of the plurality of replicas other than the candidate replica, it is determined that there are not enough replicas supporting the attempt to be able to add the pre-determined number of replicas to the failover quorum, or a pre-determined time limit is reached. 47 . The computer-implemented method of claim 46 , wherein identifying that a replica supports the attempt of the candidate replica to assume the first role comprises: determining that the replica is hosted on a computing node from which state information for the replica is gathered. 48 . The computer-implemented method of claim 43 , wherein the attempt to assume the first role for the replica group is performed in response to: a failure of a current replica performing the first role, a failure of a computing node on which the current replica performing the first role is hosted, a communication failure between the current replica performing the first role and one or more other components of the data store, or a membership change in the replica group. 49 . The computer-implemented method of claim 43 , wherein the pre-determined number of replicas is expressed in terms of a number of replicas stored on computing nodes in each of a particular number of different locations. 50 . One or more non-transitory, computer-readable storage media storing program instructions that when executed on or across one or more processors cause the one or more processors to perform: attempting, by a candidate replica of a plurality of replicas that collectively form a replica group, to assume a first role for the replica group, wherein the plurality of replicas store data on respective computing nodes of a plurality of computing nodes that collectively implement a data store, wherein at most one of the plurality of replicas can perform the first role for the replica group, and wherein attempting to assume the first role comprises: acquiring a lock from an external lock manager, and responsive to acquiring the lock: including, in a failover quorum, ones of the plurality of replicas other than the candidate replica identif

Assignees

Inventors

Classifications

  • maintaining the standby controller/processing unit updated (initialisation or re-synchronisation thereof G06F11/1658 and subgroups) · CPC title

  • Discovery or management thereof, e.g. service location protocol [SLP] or web services · CPC title

  • where the redundant components share neither address space nor persistent storage · CPC title

  • with more than one idle spare processing component · CPC title

  • Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020228393A1 cover?
A system that implements a data storage service may store data on behalf of storage service clients. The system may maintain data in multiple replicas of various partitions that are stored on respective computing nodes in the system. The system may employ a single master failover protocol, usable when a replica attempts to become the master replica for a replica group of which it is a member. A…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/2097. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Jul 16 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).