Locality based quorums
US-10127123-B2 · Nov 13, 2018 · US
US11507480B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11507480-B2 |
| Application number | US-201816185423-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 9, 2018 |
| Priority date | Dec 14, 2010 |
| Publication date | Nov 22, 2022 |
| Grant date | Nov 22, 2022 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Disclosed are various embodiments for distributing data items within a plurality of nodes. A data item that is subject to a data item update request is updated from a master node to a plurality of slave notes. The update of the data item is determined to be locality-based durable based at least in part on acknowledgements received from the slave nodes. Upon detection that the master node has failed, a new master candidate is determined via an election among the plurality of slave nodes.
Opening claim text (preview).
Therefore, the following is claimed: 1. A method, comprising: replicating by a master node for data stored in a distributed data store, a write to the data received by the master node to a plurality of other nodes of the distributed data store that replicate the data, wherein individual ones of the plurality of other nodes are in different respective data centers, and wherein performance of replication of the write to the data is based at least in part on a number of the plurality of other nodes that acknowledge writing to the data at the different respective data centers; detecting, by at least one of the plurality of other nodes, that the master node has failed based at least in part on a number of the plurality of other nodes that have acknowledged writing to the data at the different respective data centers; determining, among the plurality of other nodes, which one of the plurality of other nodes is to transition to being a new master node for the data; and performing a second write to the data using the new master node. 2. The method of claim 1 , wherein the write to the data is received from a client of the distributed data store. 3. The method of claim 1 , further comprising: after replicating the write, confirming, by the master node, to a client of the distributed data store that the write to the data is durable. 4. The method of claim 1 , wherein determining, among the plurality of other nodes, which one of the plurality of other nodes is to transition to being the new master node for the data comprises: determining a new master candidate through an election among the plurality of other nodes in response to detecting that the master node has failed; and transitioning the new master candidate to become the new master node when a number of the plurality of other nodes that select the new master candidate meets a failover quorum based at least in part upon a respective location of the plurality of other nodes that select the new master candidate. 5. The method of claim 4 , wherein N is a total number of the different respective data centers, wherein K is a durability requirement, and wherein the method further comprises waiting, by the new master candidate, to perform recovery from the failover quorum, wherein the failover quorum is met when recovery is performed by the plurality of other nodes that reside within N-K+ 1 of the different respective data centers. 6. The method of claim 4 , wherein the election employs a Paxos election scheme. 7. The method of claim 1 , wherein detecting that the master node has failed is based on a message received from the master node at the at least one node. 8. A method, comprising: determining, via at least one of a plurality of nodes that replicate data stored in a distributed data store, a new master candidate through an election among the plurality of nodes in response to detecting, by at least one of the plurality of nodes, that a master node for the data has failed based at least in part on a number of the plurality of nodes that have acknowledged writing to the data subject to a prior write request replicated by the master node to the plurality of nodes, wherein individual ones of the plurality of nodes are in respective data centers; and transitioning, via at least one of the plurality of nodes, the new master candidate to become a new master node for the data after waiting to recover data from a number of the plurality of nodes that meets a failover quorum based at least in part on a respective location of the plurality of nodes from which the data is recovered. 9. The method of claim 8 , further comprising writing, via at least one of the plurality of nodes, to the data that is subject to the prior write request from the master node to the plurality of nodes. 10. The method of claim 8 , wherein the election employs a Paxos election scheme. 11. A system, comprising: a plurality of computing devices that implement a distributed data store comprising a master node for data stored in the distributed data store and a plurality of other nodes that replicate the data, wherein individual ones of the plurality of other nodes are in different respective data centers; wherein the distributed data store is configured to: replicate a write to the data received by the master node to the plurality of other nodes, wherein performance of replication of the write to the data is based on a number of the plurality of other nodes that acknowledge writing to the data at the different respective data centers; detect, by at least one of the plurality of other nodes, that the master node has failed based at least in part on a number of the plurality of other nodes that have acknowledged writing to the data subject to write request at the different respective data centers; determine, among the plurality of other nodes, which one of the plurality of other nodes is to transition to being a new master node for the data; and perform a second write to the data using the new master node. 12. The system of claim 11 , wherein to determine, among the plurality of other nodes, which one of the plurality of other nodes is to transition to being the new master node for the data, the distributed data store is configured to: determine a new master candidate through an election among the plurality of other nodes in response to detecting that the master node has failed; and transition the new master candidate to become the new master node when a number of the plurality of other nodes that select the new master candidate meets a failover quorum based at least in part upon a respective location of the plurality of other nodes that select the new master candidate. 13. The system of claim 12 , wherein the election employs a Paxos election scheme. 14. The system of claim 12 , wherein to replicate the write, the distributed data store is configured to: determine whether the write to the data is locality-based durable based at least in part on a respective location of the plurality of other nodes that have acknowledged replicating the write to the master node. 15. The system of claim 14 , wherein the write to the data is determined to be locality-based durable when at least a subset of the plurality of other nodes residing in K data centers out of N data centers have acknowledged that the data item has been updated, wherein N is a total number of the different data centers. 16. The system of claim 14 , wherein the master node is detected as failed after the write to the data is determined not to be locality-based durable. 17. The system of claim 14 , wherein the master node: receives the write to the data from a client of the distributed data store; and confirms to the client that the write to the data is durable in response to determining that the write request to the data is locality-based durable. 18. The system of claim 14 , wherein the master node sends an error code to a client of the distributed data store when the write to the data is not determined to be locality-based durable after a predefined amount of time. 19. The system of claim 14 , wherein N is a total number of the different respective data centers, wherein K is a durability requirement, and wherein the distributed data store is further configured to: wait, by the new master candidate, to perform recovery from the failover quorum, wherein the failover quorum is met when recovery is performed by the plurality of other nodes that reside within N−K+1 of the different respective data centers.
using centralised failover control functionality · CPC title
with more than one idle spare processing component · CPC title
where the redundant components share persistent storage (G06F11/2043 takes precedence) · CPC title
Error detection; Error correction; Monitoring (error detection, correction or monitoring in information storage based on relative movement between record carrier and transducer G11B20/18; monitoring, i.e. supervising the progress of recording or reproducing G11B27/36; in static stores G11C29/00) · CPC title
eliminating a faulty processor or activating a spare · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.