Dynamic and adaptive approach for failure detection of node in a cluster

US10795745B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10795745-B2
Application number  US-201715833414-A
Country             US
Kind code           B2
Filing date         Dec 6, 2017
Priority date       Oct 27, 2014
Publication date    Oct 6, 2020
Grant date          Oct 6, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure discloses a method and a network device for failure detection of nodes in a cluster. Specifically, a network device transmits data to another device at a first time. The network device then receives an acknowledgment of the data from the second device at a second time. Next, the network device determines a Round Trip Time (RTT) for the first device and the second device based on the first time and the second time. Based on the RTT, the network device determines a first frequency for transmitting a heartbeat protocol message between the first device and the second device, and transmits a heartbeat protocol message between the first device and the second device at the first frequency.
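The flow in the abstract (time a data/acknowledgment exchange, derive an RTT, then pick a heartbeat rate from it) can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names, the `multiplier` of 4, and the `floor` of 10 ms are all hypothetical choices, since the text shown on this page gives no formula. The sketch assumes the heartbeat interval is kept at a fixed multiple of the RTT, so the interval always exceeds the RTT and the heartbeat frequency rises when the RTT falls.

```python
def measure_rtt(first_time: float, second_time: float) -> float:
    """RTT is the gap between sending data and receiving its acknowledgment."""
    if second_time < first_time:
        raise ValueError("acknowledgment cannot precede transmission")
    return second_time - first_time


def heartbeat_interval(rtt: float, multiplier: float = 4.0, floor: float = 0.01) -> float:
    """Seconds between heartbeats.

    Hypothetical policy (not taken from the patent): a fixed multiple of the
    RTT, never shorter than `floor`, so a near-zero RTT cannot flood the link.
    """
    if multiplier <= 1.0:
        raise ValueError("multiplier must exceed 1 so the interval exceeds the RTT")
    return max(rtt * multiplier, floor)


def heartbeat_frequency(rtt: float, multiplier: float = 4.0, floor: float = 0.01) -> float:
    """Heartbeats per second; the inverse of the interval."""
    return 1.0 / heartbeat_interval(rtt, multiplier, floor)
```

Re-running `heartbeat_interval` whenever a fresh RTT sample arrives gives the adaptive behaviour the abstract describes: a shorter RTT yields a shorter interval (higher frequency), and a longer RTT yields a longer interval (lower frequency).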

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: transmitting, by a first network device, data to a second network device over a communication link at a first time; receiving, by the first network device, an acknowledgment of receipt of the data over the communication link from the second network device at a second time; determining, by the first network device, a communication Round Trip Time (RTT) between the first network device and the second network device based on the first time and the second time; determining a first frequency that is larger than the RTT; transmitting a heartbeat protocol message between the first network device and the second network device at the first frequency; re-determining the RTT based on current network conditions; and dynamically updating the first frequency for transmitting the heartbeat protocol message by increasing the first frequency of transmitting the heartbeat protocol message in response to determining a reduction in the RTT or by decreasing the first frequency of transmitting the heartbeat protocol message in response to determining an increase in the RTT, wherein the heartbeat messages represent an active status of the first network device and the acknowledgment represents an active status of the second network device.

2. The method of claim 1, wherein transmitting the heartbeat protocol message between the first network device and the second network device at the first frequency comprises the first network device transmitting the heartbeat protocol message to the second network device at the first frequency.

3. The method of claim 2, wherein the operations further comprise the second network device transmitting the heartbeat protocol message to the first network device at a second frequency different than the first frequency.

4. The method of claim 1, wherein transmitting the heartbeat protocol message between the first network device and the second network device at the first frequency comprises the second network device transmitting the heartbeat protocol message to the first network device at the first frequency.

5. The method of claim 1, further comprising: subsequent to transmitting the heartbeat protocol message between the first network device and the second network device at the first frequency, determining that no response is received in response to a particular heartbeat protocol message; and retrying transmission of the particular heartbeat protocol message in a first retry message, wherein a time period between transmitting the particular heartbeat protocol message and the first retry message is based on the RTT.

6. The method of claim 1, further comprising: subsequent to transmitting the heartbeat protocol message between the first network device and the second device at the first frequency, determining that no response is received in response to a particular heartbeat protocol message; retrying transmission of the particular heartbeat protocol message via a first retry message; determining that no response is received in response to the first retry message; retrying transmission of the particular heartbeat protocol message via a second retry message, wherein (a) a first time difference between transmission of the particular heartbeat protocol message and the first retry message is greater than (b) a second time difference between transmission of the first retry message and the second retry message.

7. The method of claim 1, wherein (a) the first frequency for transmitting the heartbeat protocol message between the first network device and the second network device is different than (b) a second frequency for transmitting the heartbeat protocol message between the first network device and a third network device.

8. A network device comprising: a memory; a processor coupled to the memory, wherein the processor executes a plurality of instructions stored in the memory to: transmit data to another network device at a first time; receive an acknowledgment of receipt of the data from the another network device at a second time; determine a Round Trip Time (RTT) based on the first time and the second time, the Round Trip Time representing a time period from the data transmission to receipt acknowledgment; determine a first frequency that is larger than the RTT; transmit a heartbeat protocol message to the another network device at the first frequency; re-determine the RTT based on current network conditions; and update the first frequency for transmitting the heartbeat protocol message by increasing the first frequency of transmitting the heartbeat protocol message in response to determining a reduction in the RTT or by decreasing the first frequency of transmitting the heartbeat protocol message in response to determining an increase in the RTT, wherein the heartbeat messages represent an active status of the network device and the acknowledgment represents an active status of the another network device.

9. The network device of claim 8, wherein the processor further executes the plurality of instructions stored in the memory to: transmit the heartbeat protocol message to the another network device at the first frequency.

10. The network device of claim 9, wherein the another network device transmits the heartbeat protocol message to the network device at a second frequency different than the first frequency.

11. The network device of claim 8, wherein the another network device transmits the heartbeat protocol message to the network device at the first frequency.

12. The network device of claim 8, wherein subsequent to transmitting the heartbeat protocol message between the network device and the another network device at the first frequency, the processor further executes the plurality of instructions stored in the memory to: determine that no response is received in response to a particular heartbeat protocol message; retry transmission of the particular heartbeat protocol message in a first retry message, wherein a time period between transmitting the particular heartbeat protocol message and the first retry message is based on the RTT; determine that no response is received in response to a particular heartbeat protocol message; retry transmission of the particular heartbeat protocol message via a first retry message; determine that no response is received in response to the first retry message; and retry transmission of the particular heartbeat protocol message via a second retry message, wherein (a) a first time difference between transmission of the particular heartbeat protocol message and the first retry message is greater than (b) a second time difference between transmission of the first retry message and the second retry message.

13. The network device of claim 8, wherein (a) the first frequency for transmitting the heartbeat protocol message between the network device and the another network device is different than (b) a second frequency for transmitting the heartbeat protocol message between the network device and a further network device.

14. A network comprising: a cluster having a plurality of network devices; a plurality of client devices; and a plurality of access points controlled by the network devices and connecting the network devices to the client devices, wherein a first one of the plurality of network devices: transmits data to a second one of the plurality of network devices at a first time; receives an acknowledgment of receipt of the data from the second network device at a second time; determines a communication Round Trip Time (RTT) between the first network device and the second network device based
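Claims 5 and 6 describe the retry behaviour after a missed heartbeat response: the wait before the first retry is based on the RTT, and the gap before the second retry is shorter than the gap before the first. The following is one possible schedule that satisfies both conditions, offered only as a sketch. The geometric decay, the function name `retry_gaps`, and the parameters `first_factor` and `decay` are hypothetical; the claims do not state how the gaps are computed.

```python
from typing import List


def retry_gaps(rtt: float, retries: int = 3,
               first_factor: float = 4.0, decay: float = 0.5) -> List[float]:
    """Waiting times (seconds) before each successive retry message.

    Hypothetical policy: the first gap is `first_factor * rtt` (so it is
    based on the RTT, as in claim 5) and each later gap is `decay` times the
    previous one (so gaps strictly shrink, as in claim 6).
    """
    if rtt <= 0:
        raise ValueError("rtt must be positive")
    if not 0.0 < decay < 1.0:
        raise ValueError("decay must be between 0 and 1 so gaps shrink")
    gaps = []
    gap = first_factor * rtt
    for _ in range(retries):
        gaps.append(gap)
        gap *= decay  # next retry fires sooner than the previous one
    return gaps
```

Shrinking gaps mean an unresponsive peer is probed more aggressively with each miss, which shortens the time needed to declare a node failed. A real implementation would likely also bound the smallest gap, which this sketch does not do.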

Assignees

Inventors

Classifications

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title

  • by exceeding a time limit, i.e. time-out, e.g. watchdogs · CPC title

  • Round trip delays · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10795745B2 cover?
The present disclosure discloses a method and a network device for failure detection of nodes in a cluster. Specifically, a network device transmits data to another device at a first time. The network device then receives an acknowledgment of the data from the second device at a second time. Next, the network device determines a Round Trip Time (RTT) for the first device and the second device b…
Who is the assignee on this patent?
Hewlett Packard Enterprise Development LP
What technology area does this patent fall under?
Primary CPC classification G06F11/0757. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 6, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).