System and method for resolving master node failures within node clusters

US11082288B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11082288-B2
Application number  US-201916356072-A
Country             US
Kind code           B2
Filing date         Mar 18, 2019
Priority date       Oct 24, 2016
Publication date    Aug 3, 2021
Grant date          Aug 3, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Fault tolerance techniques for a plurality of nodes executing application thread groups include executing at least a portion of a first application thread group based on a delegation by a first node, wherein the first node delegates an execution of the first application thread group amongst the plurality of nodes and has a highest priority indicated by an ordered priority of the plurality of nodes. A failure of the first node can be identified based on the first node failing to respond to a message sent to it. A second node can then be identified as having a next highest priority indicated by the ordered priority such that the second node can delegate an execution of a second application thread group amongst the plurality of nodes.
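The failover rule in the abstract can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the patented implementation: the `Node` type, the `ping` callback, and the convention that a lower `priority` number means a higher priority are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional, Set


@dataclass(frozen=True)
class Node:
    name: str
    priority: int  # assumption: lower number = higher priority in the ordered list


def next_master(nodes: Iterable[Node], failed: Set[str]) -> Optional[Node]:
    """Return the highest-priority node that is not known to have failed."""
    candidates = [n for n in nodes if n.name not in failed]
    return min(candidates, key=lambda n: n.priority) if candidates else None


def check_master(master: Node, nodes: Iterable[Node],
                 ping: Callable[[Node], bool]) -> Optional[Node]:
    """Send a message to the master; if it does not respond, treat it as
    failed and select the node with the next highest priority."""
    if ping(master):
        return master
    return next_master(nodes, failed={master.name})


# Hypothetical three-node cluster in which the current master "a" is down.
nodes = [Node("a", 0), Node("b", 1), Node("c", 2)]
alive = {"a": False, "b": True, "c": True}
new_master = check_master(nodes[0], nodes, lambda n: alive[n.name])
print(new_master.name)  # b
```

In this sketch the node that detects the failure computes the successor locally from the ordered priority; the claims instead describe looking the replacement up in a shared database.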

First claim

Opening claim text (preview).

What is claimed is:

1. A fault tolerance system, comprising: a first data center comprising a first plurality of processors configured to host a first plurality of nodes configured to process requests received from a global traffic manager; and a second data center comprising a second plurality of processors configured to host a second plurality of nodes that each correspond to one of the first plurality of nodes; and wherein at least one of the first plurality of processors or the second plurality of processors is configured to host at least one database that is accessible to each of the first plurality of nodes and the second plurality of nodes; and wherein the first plurality of nodes comprises a first node having a highest priority and designated as a master node in the at least one database, wherein the master node is configured to delegate the requests among the first plurality of nodes, and wherein the second plurality of nodes comprises a second node that, in response to the second node receiving the requests from the global traffic manager, is configured to perform actions comprising: registering the second node with the at least one database; querying the at least one database for an identifier of a replacement node for the master node; determining that the second node is the replacement node based on the identifier of the replacement node; updating a status of the first node in the at least one database to indicate that the first node is no longer the master node; and updating a status of the second node in the at least one database to designate the second node as the master node, wherein the master node is configured to delegate the requests among the second plurality of nodes.

2. The system of claim 1, wherein the first data center is an active data center and the second data center is a passive data center, and wherein each of the second plurality of nodes is a backup node that corresponds to one of the first plurality of nodes.

3. The system of claim 1, wherein the first plurality of nodes and the second plurality of nodes comprise application nodes, database nodes, or a combination thereof.

4. The system of claim 1, wherein each of the first plurality of nodes and each of the second plurality of nodes comprises a virtual machine instance configured to execute at least a portion of an application thread group.

5. The system of claim 1, wherein, to register the second node with the at least one database, the second node is configured to: update the at least one database to indicate a registration time at which the second node registers with the at least one database.

6. The system of claim 1, wherein a master node replacement priority of the first plurality of nodes and the second plurality of nodes is based on respective registration times of the first plurality of nodes and the second plurality of nodes in the at least one database.

7. The system of claim 1, wherein the at least one database comprises a first database of the first data center and a second database of the second data center, wherein the first database and the second database are communicatively coupled and synchronize data between the first database and the second database.

8. The system of claim 1, wherein the system comprises the global traffic manager and the global traffic manager comprises a processor configured to perform actions comprising: routing the requests to the second plurality of nodes; and determining that the first data center has been restored, and in response, routing the requests to the first plurality of nodes of the first data center instead of the second plurality of nodes of the second data center.

9. The system of claim 8, wherein, in response to the processor of the global traffic manager determining that the first data center has been restored and the first node of the first plurality of nodes receiving the requests from the global traffic manager, the first node is configured to perform actions comprising: determining that the first node has the highest priority in the at least one database; updating the status of the second node in the at least one database to indicate that the second node is no longer the master node; and updating the status of the first node in the at least one database to designate the first node as the master node, wherein the master node is configured to delegate the requests among the first plurality of nodes.

10. The system of claim 1, wherein the second plurality of nodes comprises a third node that, in response to receiving the requests from the global traffic manager, is configured to perform actions comprising: registering the third node with the at least one database; querying the at least one database for the identifier of the replacement node for the master node; determining that the second node is the replacement node based on the identifier of the replacement node; and waiting to be delegated a portion of the requests by the second node.

11. The system of claim 1, wherein the second plurality of nodes comprises a third node configured to perform actions comprising: registering the third node with the at least one database; performing a portion of the requests based on a delegation of the portion by the second node; sending a message to the second node; identifying a failure of the second node based on the second node failing to respond to the message; querying the at least one database for the identifier of the replacement node for the master node; determining that the third node is the replacement node based on the identifier of the replacement node; updating the status of the second node in the at least one database to indicate that the second node is no longer the master node; and updating a status of the third node in the at least one database to designate the third node as the master node, wherein the master node is configured to delegate the requests among the second plurality of nodes.

12. A method of operating a fault tolerant system, wherein the fault tolerant system comprises a first data center comprising a first plurality of nodes configured to process requests received from a global traffic manager; and a second data center comprising a second plurality of nodes that each correspond to one of the first plurality of nodes; and at least one database that is accessible to each of the first plurality of nodes and the second plurality of nodes, wherein the first plurality of nodes comprises a first node having a highest priority and designated as a master node in the at least one database, wherein the master node delegates the requests among the first plurality of nodes, and wherein the second plurality of nodes comprises a second node that performs the method, comprising: receiving the requests from the global traffic manager at the second data center; registering the second node with the at least one database; sending a message to the first node; identifying a failure of the first node based on the first node failing to respond to the message; querying the at least one database for an identifier of a replacement node for the master node; determining that the second node is the replacement node based on the identifier of the replacement node; updating a status of the first node in the at least one database to indicate that the first node is no longer the master node; and updating a status of the second node in the at least one database to designate the second node as the master node, wherein the master node delegates the requests among the second plurality of nodes.

13. The method of claim 12, wherein the at least one database comprises a first datab
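The takeover steps recited in claim 12 (register, message the master, query for the replacement, update both statuses) can be sketched in Python as follows. This is only an illustration under stated assumptions: the in-memory `Registry` stands in for the "at least one database", its schema and method names are hypothetical, and reading claim 6 as "the earliest-registered healthy node has the highest replacement priority" is an assumption, since the claim says only that priority is based on registration times.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Record:
    registered_at: int       # logical registration time (cf. claim 5)
    is_master: bool = False
    healthy: bool = True


class Registry:
    """In-memory stand-in for the shared database (hypothetical schema)."""

    def __init__(self) -> None:
        self.records: Dict[str, Record] = {}
        self._clock = itertools.count()  # monotonically increasing timestamps

    def register(self, node_id: str) -> None:
        # A node keeps its original registration time if it registers again.
        if node_id not in self.records:
            self.records[node_id] = Record(registered_at=next(self._clock))
        self.records[node_id].healthy = True

    def master_id(self) -> Optional[str]:
        return next((nid for nid, rec in self.records.items() if rec.is_master), None)

    def replacement_id(self) -> Optional[str]:
        # Assumption: earliest-registered healthy non-master node wins.
        current = self.master_id()
        candidates = [(rec.registered_at, nid) for nid, rec in self.records.items()
                      if rec.healthy and nid != current]
        return min(candidates)[1] if candidates else None

    def promote(self, old: Optional[str], new: str) -> None:
        if old is not None:
            self.records[old].is_master = False
        self.records[new].is_master = True


def try_takeover(registry: Registry, node_id: str,
                 master_responds: Callable[[str], bool]) -> bool:
    """Return True if node_id became the master, False otherwise."""
    registry.register(node_id)
    old = registry.master_id()
    if old == node_id or (old is not None and master_responds(old)):
        return False                      # master is fine; nothing to do
    if old is not None:
        registry.records[old].healthy = False
    if registry.replacement_id() != node_id:
        return False                      # another node is the replacement
    registry.promote(old, node_id)
    return True


registry = Registry()
registry.register("dc1-n1")
registry.promote(None, "dc1-n1")                            # initial master
print(try_takeover(registry, "dc2-n2", lambda _id: False))  # True
print(try_takeover(registry, "dc2-n3", lambda _id: True))   # False
print(registry.master_id())                                 # dc2-n2
```

A node that is not the replacement simply returns `False` here, which corresponds to the third node in claim 10 waiting to be delegated work by the new master. A real deployment would need the status updates to be atomic in the database so that two nodes cannot both promote themselves; the sketch does not address that.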

Assignees

  • Servicenow Inc

Inventors

Classifications

  • Active monitoring, e.g. heartbeat, ping or trace-route · CPC title

  • comprising hierarchical management structures · CPC title

  • Performing the actions predefined by failover planning, e.g. switching to standby network elements · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11082288B2 cover?
Fault tolerance techniques for a plurality of nodes executing application thread groups include executing at least a portion of a first application thread group based on a delegation by a first node, wherein the first node delegates an execution of the first application thread group amongst the plurality of nodes and has a highest priority indicated by an ordered priority of the plurality of no…
Who is the assignee on this patent?
Servicenow Inc
What technology area does this patent fall under?
Primary CPC classification H04L41/0663. Mapped technology areas include Electricity.
When was this patent published?
Publication date Aug 3, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).