Locality based quorums

US10127123B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10127123-B2
Application numberUS-201715413764-A
CountryUS
Kind codeB2
Filing dateJan 24, 2017
Priority dateDec 14, 2010
Publication dateNov 13, 2018
Grant dateNov 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are various embodiments for distributing data items within a plurality of nodes. A data item that is subject to a data item update request is updated from a master node to a plurality of slave notes. The update of the data item is determined to be locality-based durable based at least in part on acknowledgements received from the slave nodes. Upon detection that the master node has failed, a new master candidate is determined via an election among the plurality of slave nodes.

First claim

Opening claim text (preview).

Therefore, the following is claimed: 1. A method comprising: transitioning from a current master node of a distributed data store to a newly selected master node responsive to a locality-based failover quorum requirement being satisfied, the locality-based failover quorum requirement requiring at least a threshold number of nodes in different fault domains to have participated in a data recovery for the distributed data store with the newly selected master node, wherein the distributed data store only transitions to the newly selected master node after the locality-based failover quorum requirement is satisfied. 2. The method of claim 1 , further comprising: performing an election among a set of slave nodes to the current master node to select one of the slave nodes as the newly selected master node. 3. The method of claim 2 , wherein the election is performed in accordance with a Paxos election scheme. 4. The method of claim 1 , wherein the threshold number of nodes in different fault domains comprises at least two nodes in at least two different physical locations. 5. The method of claim 4 , wherein the at least two different physical locations are in at least two different availability zones; or wherein the at least two different physical locations are in at least two different data centers. 6. The method of claim 1 , further comprising: receiving, by the new master node, a data item update request; and acknowledging the data item update request as being committed in the distributed data store in response to a locality-based quorum requirement being satisfied, the locality-based quorum requirement requiring nodes in at least a threshold number of different fault domains to have acknowledged committing the data item update. 7. The method of claim 6 , wherein the locality-based quorum requirement requires nodes in a durability requirement number of data centers (K) out of a total number of data centers (N) to have acknowledged committing the data item update. 8. The method of claim 7 , wherein the locality-based failover quorum requirement requires nodes in K−N+1 data centers to have participated in the data recovery. 9. A distributed data storage system, comprising: a plurality of nodes located in at least two or more different physical locations, wherein at least one of the nodes: acknowledges, to a client, a data item update as being committed in the distributed data storage system upon a locality-based quorum requirement being satisfied, the locality-based quorum requirement requiring nodes in at least a threshold number of geographically-different physical locations to have committed the data item update, wherein two or more of the plurality of nodes are located in a same location of the geographically-different physical locations such that committing the data item update at the threshold number of any of the nodes is not guaranteed to satisfy the locality-based quorum requirement. 10. The distributed data storage system of claim 9 , wherein the at least two or more different physical locations are located in different availability zones of the distributed data storage system, and wherein the locality-based quorum requirement requires nodes in at least two different availability zones to have acknowledged committing the data item update. 11. The distributed data storage system of claim 9 , wherein at least some of the nodes of the distributed data storage system are located in two or more different data centers, wherein the locality-based quorum requirement requires nodes in a durability requirement number of data centers (K) out of a total number of the two or more data centers (N) to have acknowledged committing the data item update. 12. The distributed data storage system of claim 11 , wherein other ones of the nodes perform an election to select a new master node in response to a failure of the at least one of the nodes, and wherein upon a locality-based failover quorum requirement being satisfied that requires at least a threshold number of the nodes in different fault domains to have participated in a data recovery with the newly selected master node, the newly selected master node transitions to function as a master node for the distributed data storage system. 13. The distributed data storage system of claim 12 , wherein the locality-based failover quorum requirement requires nodes in K−N+1 data centers to have participated in the data recovery with the newly selected master node. 14. The distributed data storage system of claim 9 , wherein the distributed data storage system operates according to a single-master replication model. 15. A non-transitory, computer-readable storage medium, storing program instructions that when executed by one or more computing devices cause the one or more computing devices to: determine whether nodes in at least a threshold number of geographically-different physical locations of a distributed data store have acknowledged a data item update as being committed, wherein two or more of the nodes are located in a same location of the geographically-different physical locations such that committing the data item update at the threshold number of any of the nodes does not guarantee that nodes in at least a threshold number of the geographically-different physical locations have committed the data item update; and acknowledge, to a client, commitment of the data item update, responsive to the data item update being committed in at least the threshold number of geographically-different physical locations. 16. The non-transitory, computer readable storage medium of claim 15 , wherein different physical locations comprise different availability zones. 17. The non-transitory, computer readable storage medium of claim 16 , wherein the data item update being committed in at least the threshold number of different physical locations comprises a locality-based quorum requirement being satisfied, wherein the locality-based quorum requirement requires nodes in a durability requirement number of data centers (K) out of a total number of data centers (N) to have acknowledged the data item update as being committed. 18. The non-transitory, computer readable storage medium of claim 17 , wherein the program instructions when executed by the one or more computing devices further cause the one or more computing devices to: perform an election in response to a failure of a current master node; and upon a locality-based failover quorum requirement being satisfied that requires at least a threshold number of the nodes in different data centers to have participated in data recovery with the newly selected master node, transition the newly selected master node to function as a master node for the distributed data store. 19. The non-transitory, computer readable storage medium of claim 18 , wherein the locality-based failover quorum requirement requires nodes in K−N+1 data centers to have participated in data recovery with the newly selected master node. 20. The non-transitory, computer readable storage medium of claim 15 , wherein the program instructions when executed by the one or more computing devices further cause the one or more computing devices to: send an error code in response to the data item update not being committed in at least the threshold number of different physical locations within a predetermined amount of time.

Assignees

Inventors

Classifications

  • where the redundant components share persistent storage (G06F11/2043 takes precedence) · CPC title

  • with more than one idle spare processing component · CPC title

  • eliminating a faulty processor or activating a spare · CPC title

  • Error detection; Error correction; Monitoring (error detection, correction or monitoring in information storage based on relative movement between record carrier and transducer G11B20/18; monitoring, i.e. supervising the progress of recording or reproducing G11B27/36; in static stores G11C29/00) · CPC title

  • maintaining the standby controller/processing unit updated (initialisation or re-synchronisation thereof G06F11/1658 and subgroups) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10127123B2 cover?
Disclosed are various embodiments for distributing data items within a plurality of nodes. A data item that is subject to a data item update request is updated from a master node to a plurality of slave notes. The update of the data item is determined to be locality-based durable based at least in part on acknowledgements received from the slave nodes. Upon detection that the master node has fa…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/2025. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Nov 13 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).