High availability cluster management of computing nodes

US11061719B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11061719-B2
Application numberUS-201916269511-A
CountryUS
Kind codeB2
Filing dateFeb 6, 2019
Priority dateDec 27, 2018
Publication dateJul 13, 2021
Grant dateJul 13, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques and solutions are described for providing high-availability computing resources to service client requests. Groups of computing nodes are organized into loops, a given loop being configured to execute a particular subset of tasks, such as tasks with a hash value in a particular ranged serviced by a loop. Computing nodes within a loop can evaluate a task request to determine whether the task request conflicts with another task currently assigned to a node. If a computing node which sent out a task request determines that no conflict was identified, it can execute the task request. Communications within a loop can occur unidirectionally, such that a node which initiated a communication will receive the communication from the last loop node. Loops can be connected to form a ribbon, the ribbon providing a namespace for task execution, where hash ranges for the namespace are uniquely assigned to loops of the ribbon.

First claim

Opening claim text (preview).

What is claimed is: 1. A method, implemented in a computing system, the method comprising: connecting a plurality of computing nodes in a plurality of loops, the connecting comprising connecting a first plurality of computing nodes selected from the plurality of computing nodes in a first loop and connecting a second plurality of computing nodes selected from the plurality of computing nodes in a second loop, where computing nodes comprise one or more processors and at least one memory coupled to the one or more processors, the first plurality of computing nodes of the first loop configured to operate at least one service configured to process tasks from one or more clients, wherein: (1) a given computing node of the first plurality of computing nodes of the first loop is connected to a left neighbor computing node and a right neighbor computing node of the first plurality of computing nodes, and wherein the first plurality of computing nodes of the first loop pass communications unidirectionally about the first loop; (2) the first loop comprises at least a first connect node and at least a second connect node of the first plurality of computing nodes, wherein the at least first and second connect nodes pass messages to, or receive messages from, one or more loops of the plurality of loops, wherein the second loop is in communication with at least the at least a first connect node, the second loop comprising at least one computing node that is not a member of the first loop; (3) the first loop comprises at least one normal node of the first plurality of nodes, wherein the at least one normal node is connected to the at least the first connect node and is not a member of a loop other than the first loop; receiving a first task request from a first client device at a first computing node of the plurality of computing nodes; determining that the first task request does not conflict with a task assigned to the first computing node; sending the first task request to a first neighbor computing node of the first computing node in a direction of loop communication message passing, the first neighbor computing node being the left neighbor computing node or the right neighbor computing node of the first computing node; determining if a node of the plurality of computing nodes other than the first computing node has a conflict with the first task request or does not have a conflict with the first task request, the determining whether a node has a conflict comprises: (1) the first computing node receives the first task request from a second neighbor node, the second neighbor node being a member of at least one loop of the plurality of loops, the at least one loop of the plurality of loops being a loop of the plurality of loops other than the first loop, and the receiving the first task request comprises an indication of whether a conflict was identified between the first task request and a computing node of the plurality of computing nodes other than the first computing node; or (2) the first task request is passed between multiple loops of the plurality of loops in the direction of loop communication message passing and the first computing node determines that no conflict exists with another computing node of the plurality of computing nodes if a notification is not received within a determined period of time; and executing the first task request. 2. The method of claim 1 , further comprising: prior to executing the first task request, determining from metadata associated with the first task request that another computing node of the first plurality of computing nodes of the first loop does not have a conflict with the task. 3. The method of claim 1 , further comprising: at a second computing node of the plurality of computing nodes, receiving a second task request from a first neighbor computing node of the second computing node, the first neighbor computing node being the left neighbor computing node or the right neighbor computing node of the second computing node; determining that the second task request conflicts with the first task request; and not forwarding the second task request to a second neighbor computing node of the second computing node, wherein when the first neighbor computing node is the left neighbor computing node the second neighbor computing node is the right neighbor computing node and when the first neighbor computing node is the right neighbor computing node the second neighbor computing node is the left neighbor computing node. 4. The method of claim 1 , further comprising: at a second computing node of the plurality of computing nodes, receiving a second task request from a first neighbor computing node of the second computing node, the first neighbor computing node being the left neighbor computing node or the right neighbor computing node of the second computing node; determining that the second task request has a conflict with the first task request; indicating the conflict in metadata associated with the second task request; and passing the second task request to a second neighbor computing node of the second computing node, wherein when the first neighbor computing node is the left neighbor computing node the second neighbor computing node is the right neighbor computing node and when the first neighbor computing node is the right neighbor computing node the second neighbor computing node is the left neighbor computing node. 5. The method of claim 1 , further comprising: determining that a number of computing nodes exceeds a threshold based on loop performance parameters; and causing at least one computing node to be removed from the first loop. 6. The method of claim 1 , further comprising: for a second computing node of the plurality of computing nodes, determining that a first neighbor node of the second computing node is not reachable, wherein the first neighbor node of the second computing node has the second computing node as a first neighbor node and a third computing node as a second neighbor node; and changing the first neighbor node of the second computing node in the direction of loop communication message passing to the third computing node. 7. The method of claim 1 , wherein computing nodes in the first loop communicate with only two other computing nodes in the first loop for purposes of passing task requests. 8. The method of claim 1 , wherein a direction of loop communication message passing in the second loop is a reverse of the direction of loop communication message passing in the first loop. 9. The method of claim 1 , wherein the first and second loops at least a part of a ribbon structure that optionally comprises one or more additional loops, wherein loops communicate through connect nodes unidirectionally in a direction of message passing for the ribbon structure, wherein a message sent by the first loop can traverse loops of the ribbon structure in the direction of message passing for the ribbon structure such that the first loop can receive the message after the message has passed through the other loops of the ribbon structure. 10. The method of claim 9 , wherein each loop of the ribbon structure is associated with a unique hash range for hash values associated with a namespace for task requests. 11. The method of claim 9 , the method further comprising: determining, by the first computing node, that a hash range associated with the first loop should be altered; updating the hash range associated with the first loop; and communicating a hash range update to a computing node of the second loop, wherein the second loop updates the hash range of the second loop to conform with the updated hash range of the first loop.

Assignees

Inventors

Classifications

  • considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title

  • resumption being on a different machine, e.g. task migration, virtual machine migration (G06F9/5088 takes precedence) · CPC title

  • Program initiating; Program switching, e.g. by interrupt · CPC title

  • to service a request · CPC title

  • Ring-type networks with a headend · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11061719B2 cover?
Techniques and solutions are described for providing high-availability computing resources to service client requests. Groups of computing nodes are organized into loops, a given loop being configured to execute a particular subset of tasks, such as tasks with a hash value in a particular ranged serviced by a loop. Computing nodes within a loop can evaluate a task request to determine whether t…
Who is the assignee on this patent?
Sap Se
What technology area does this patent fall under?
Primary CPC classification G06F9/5077. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jul 13 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).