Dynamic adaptive approach for failure detection of node in a cluster

US9842013B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9842013-B2
Application number  US-201414524698-A
Country             US
Kind code           B2
Filing date         Oct 27, 2014
Priority date       Oct 27, 2014
Publication date    Dec 12, 2017
Grant date          Dec 12, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure discloses a method and a network device for failure detection of nodes in a cluster. Specifically, a network device transmits data to another device at a first time. The network device then receives an acknowledgment of the data from the second device at a second time. Next, the network device determines a Round Trip Time (RTT) for the first device and the second device based on the first time and the second time. Based on the RTT, the network device determines a first frequency for transmitting a heartbeat protocol message between the first device and the second device, and transmits a heartbeat protocol message between the first device and the second device at the first frequency.
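
The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the multiplier of 4 and the 50 ms floor are hypothetical, and the sketch reads the claimed "frequency" as a heartbeat period (a time between messages) so that it can be compared with the RTT.

```python
def round_trip_time(sent_at: float, acked_at: float) -> float:
    """RTT in seconds, from the send time and the acknowledgment time."""
    if acked_at < sent_at:
        raise ValueError("acknowledgment cannot precede transmission")
    return acked_at - sent_at


def heartbeat_interval(rtt: float, multiplier: float = 4.0,
                       min_interval: float = 0.05) -> float:
    """Heartbeat period derived from the RTT.

    multiplier and min_interval are hypothetical tuning values. Requiring
    multiplier > 1 keeps the period longer than the RTT, so a reply can
    arrive before the next heartbeat is due.
    """
    if rtt < 0:
        raise ValueError("rtt must be non-negative")
    if multiplier <= 1:
        raise ValueError("multiplier must exceed 1")
    return max(min_interval, multiplier * rtt)


class HeartbeatTimer:
    """Tracks the current heartbeat period for one peer (hypothetical helper)."""

    def __init__(self, rtt: float) -> None:
        self.interval = heartbeat_interval(rtt)

    def update(self, rtt: float) -> float:
        # Re-measured RTT: a shorter RTT gives more frequent heartbeats,
        # a longer RTT gives less frequent ones.
        self.interval = heartbeat_interval(rtt)
        return self.interval
```

In practice the two timestamps would come from a monotonic clock such as `time.monotonic()`, and one timer would be kept per peer, since each pair of devices can have a different RTT.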

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer readable medium comprising instructions which, when executed by one or more hardware processors, causes performance of operations comprising: transmitting, by a first device; data to a second device at a first time; receiving, by the first device, an acknowledgment of the data from the second device at a second time; determining a Round Trip Time (RTT) for the first device and the second device based on the first time and the second time; determining a first frequency for transmitting a heartbeat protocol message between the first device and the second device based on the RTT, wherein the first frequency is larger than the RTT; and transmitting a heartbeat protocol message between the first device and the second device at the first frequency.

2. The non-transitory computer readable medium of claim 1, wherein the operations further comprise: at least periodically performing: re-measuring the RTT for the first device and the second device based on current network conditions; dynamically updating the first frequency for transmitting the heartbeat protocol message between the first device and the second device.

3. The non-transitory computer readable medium of claim 2, wherein dynamically updating the first frequency of transmitting the heartbeat protocol message comprises increasing the first frequency of transmitting the heartbeat protocol message in response to determining a reduction in the RTT.

4. The non-transitory computer readable medium of claim 2, wherein dynamically updating the first frequency of transmitting the heartbeat protocol message comprises decreasing the first frequency of transmitting the heartbeat protocol message in response to determining an increase in the RTT.

5. The non-transitory computer readable medium of claim 1, wherein transmitting the heartbeat protocol message between the first device and the second device at the first frequency comprises the first device transmitting the heartbeat protocol message to the second device at the first frequency.

6. The non-transitory computer readable medium of claim 5, wherein the operations further comprise the second device transmits the heartbeat protocol message to the first device at a second frequency different than the first frequency.

7. The non-transitory computer readable medium of claim 1, wherein transmitting the heartbeat protocol message between the first device and the second device at the first frequency comprises the second device transmitting the heartbeat protocol message to the first device at the first frequency.

8. The non-transitory computer readable medium of claim 1, subsequent to transmitting the heartbeat protocol message between the first device and the second device at the first frequency: determining that no response is received in response to a particular heartbeat protocol message; retrying transmission of the particular heartbeat protocol message in a first retry message, wherein a time period between transmitting the particular heartbeat protocol message and the first retry message is based on the RTT.

9. The non-transitory computer readable medium of claim 1, subsequent to transmitting the heartbeat protocol message between the first device and the second device at the first frequency: determining that no response is received in response to a particular heartbeat protocol message; retrying transmission of the particular heartbeat protocol message via a first retry message; determining that no response is received in response to the first retry message; retrying transmission of the particular heartbeat protocol message via a second retry message, wherein (a) a first time difference between transmission of the particular heartbeat protocol message and the first retry message is greater than (b) a second time difference between transmission of the first retry message and the second retry message.

10. The non-transitory computer readable medium of claim 1, wherein (a) the first frequency for transmitting the heartbeat protocol message between the first device and the second device is different than (b) a second frequency for transmitting the heartbeat protocol message between the first device and a third device.

11. A system comprising: a first device and a second device each comprising: a processor; and a memory storing computer readably: instructions to cause the processor to: transmit by the first device, data to the second device at a first time; received by the first device, an acknowledgment of the data from the second device at a second time; determine a Round Trip Time (RTT) for the first device and the second device based on the first time and the second time; determine a first frequency for transmitting a heartbeat protocol message between the first device and the second device based on the RTT, wherein the first frequency is a multiple of the RTT; transmit a heartbeat protocol message between the first device and the second device at the first frequency.

12. The system of claim 11, comprising instructions to cause the processor to: re-measure the RTT for the first device and the second device based on current network conditions; and dynamically update the first frequency for transmitting the heartbeat protocol message between the first device and the second device.

13. The system of claim 12, wherein dynamically updating the first frequency of transmitting the heartbeat protocol message comprises increasing the first frequency of transmitting the heartbeat protocol message in response to determining a reduction in the RTT.

14. The system of claim 12, wherein dynamically updating the first frequency of transmitting the heartbeat protocol message comprises decreasing the first frequency of transmitting the heartbeat protocol message in response to determining an increase in the RTT.

15. The system of claim 11, wherein transmitting the heartbeat protocol message between the first device and the second device at the first frequency comprises the first device transmitting the heartbeat protocol message to the second device at the first frequency.

16. The system of claim 15, wherein the operations further comprise the second device transmits the heartbeat protocol message to the first device at a second frequency different than the first frequency.

17. The system of claim 11, wherein transmitting the heartbeat protocol message between the first device and the second device at the first frequency comprises the second device transmitting the heartbeat protocol message to the first device at the first frequency.

18. The system of claim 11, subsequent to transmitting the heartbeat protocol message between the first device and the second device at the first frequency: determining that no response is received in response to the heartbeat protocol message; retrying transmission of the heartbeat protocol message, wherein a time period between transmitting the heartbeat protocol message is based on the RTT.

19. The system of claim 11, subsequent to transmitting the heartbeat protocol message between the first device and the second device at the first frequency: determining that no response is received in response to the heartbeat protocol message; retrying transmission of the heartbeat protocol message via a first retry message; determining that no response is received in response to the first retry message; retrying transmission of the heartbeat protocol message via a second retry message, wherein (a) a first time difference between transmiss
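
The retry behavior in the claims (a retry gap based on the RTT, with the gap between the original message and the first retry longer than the gap between the first and second retries) might look like the following sketch. The names `retry_delays` and `probe_peer`, the starting factor of 2, the halving ratio and the retry count are all hypothetical choices for illustration; the claims do not specify them.

```python
import time
from typing import Callable, List


def retry_delays(rtt: float, retries: int = 3,
                 first_factor: float = 2.0, shrink: float = 0.5) -> List[float]:
    """Gaps between successive sends, each shorter than the one before.

    The first gap is first_factor * rtt; every later gap is the previous
    one multiplied by shrink (assumed to lie strictly between 0 and 1).
    """
    if not 0 < shrink < 1:
        raise ValueError("shrink must be between 0 and 1")
    delays = []
    gap = first_factor * rtt
    for _ in range(retries):
        delays.append(gap)
        gap *= shrink
    return delays


def probe_peer(send_heartbeat: Callable[[], bool], rtt: float,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True if the peer answers the heartbeat or any retry.

    send_heartbeat is a caller-supplied function that sends one message
    and reports whether a response arrived. A False result here means the
    peer would be treated as failed.
    """
    if send_heartbeat():
        return True
    for gap in retry_delays(rtt):
        sleep(gap)  # wait an RTT-based gap before the next retry
        if send_heartbeat():
            return True
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. Shrinking gaps mean a peer that misses one heartbeat is re-probed quickly, which shortens the time needed to declare a failure.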

Assignees

Aruba Networks Inc

Inventors

Classifications

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title

  • by exceeding a time limit, i.e. time-out, e.g. watchdogs · CPC title

  • Round trip delays · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9842013B2 cover?
The present disclosure discloses a method and a network device for failure detection of nodes in a cluster. Specifically, a network device transmits data to another device at a first time. The network device then receives an acknowledgment of the data from the second device at a second time. Next, the network device determines a Round Trip Time (RTT) for the first device and the second device b…
Who is the assignee on this patent?
Aruba Networks Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/0757. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 12, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).