Detecting high availability readiness of a distributed computing system

US10055268B2 · US · B2

Patent metadata
Publication number: US-10055268-B2
Application number: US-201615277959-A
Country: US
Kind code: B2
Filing date: Sep 27, 2016
Priority date: Oct 14, 2014
Publication date: Aug 21, 2018
Grant date: Aug 21, 2018

Abstract

Official abstract text for this publication.

Technology is disclosed for determining high availability readiness of a distributed computing system (“system”). A confidence measure (CM) can be computed for a particular controller in the system to determine whether a takeover by the particular controller from a first controller would be successful. The CM can be a percentage value. A CM of 0% indicates that a takeover would be a failure, which results in loss of access to data managed by the first controller. A CM of 100% indicates a successful takeover with no performance impact on the system. A CM between 0% and 100% indicates a successful takeover but with a performance impact. The CM can be computed based on events occurring in the system, e.g., veto and non-veto events. The CM is computed as a function of various weights and/or indices associated with the veto events and/or non-veto events.
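The three outcome bands described in the abstract can be illustrated with a short sketch. The names `TakeoverOutlook` and `interpret_cm` are hypothetical and do not appear in the patent; the code only maps a percentage value to the three outcomes the abstract lists and makes no claim about how the CM itself is computed.

```python
from enum import Enum


class TakeoverOutlook(Enum):
    """Hypothetical labels for the three CM bands named in the abstract."""
    FAILURE = "takeover would fail; access to the first controller's data is lost"
    DEGRADED = "takeover would succeed, with a performance impact"
    CLEAN = "takeover would succeed, with no performance impact"


def interpret_cm(cm_percent: float) -> TakeoverOutlook:
    """Map a confidence measure (0 to 100 percent) to its outcome band."""
    if not 0.0 <= cm_percent <= 100.0:
        raise ValueError("confidence measure must be between 0 and 100")
    if cm_percent == 0.0:
        return TakeoverOutlook.FAILURE
    if cm_percent == 100.0:
        return TakeoverOutlook.CLEAN
    # Any value strictly between 0 and 100 means success with some impact.
    return TakeoverOutlook.DEGRADED


if __name__ == "__main__":
    for value in (0.0, 67.5, 100.0):
        print(value, "->", interpret_cm(value).name)
```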

First claim

Opening claim text (preview).

We claim:

1. A method, comprising: receiving, by a computing device, a list of historical events related to a high availability pair comprising a first node and a second node of a distributed computing system; determining, by the computing device, a set of non-veto events and a set of veto events related to at least one of the nodes from the list of historical events; obtaining, by the computing device, a severity index and a compliance factor for each event of the set of non-veto events; and generating and outputting, by the computing device, a confidence measure for the at least one of the nodes based on the set of veto events, the severity index, and the compliance factor.

2. The method of claim 1, wherein the confidence measure indicates a magnitude of an impact on a performance of the distributed computing system if the at least one of the nodes takes over from another one of the nodes.

3. The method of claim 1, wherein the severity index of an event of the set of non-veto events indicates a magnitude of performance impact on the distributed computing system due to the occurrence of the event.

4. The method of claim 1, wherein the compliance factor an event of the set of non-veto events indicates a deviation of the event from an expected behavior of the event.

5. The method of claim 1, wherein the set of non-veto events comprises events that have an adverse impact on computing resources of the distributed computing system upon takeover by the at least one of the nodes of another one of the nodes and the set of non-veto events excludes events that cause the takeover to fail.

6. The method of claim 1 further comprising: determining, by the computing device, when the confidence measure exceeds a specified threshold; and generating and outputting, by the computing device, a notification indicating a value of the confidence measure, when the determining indicates that the confidence measure exceeds the specified threshold, wherein the notification includes a list of tasks to be performed to change the confidence measure with respect to the specified threshold.

7. A computing device, comprising: a memory containing machine readable medium comprising machine executable code having stored thereon instructions for detecting high availability readiness; and a processor coupled to the memory, the processor configured to execute the machine executable code to cause the processor to: receive a list of historical events related to a high availability pair comprising a first node and a second node of a distributed computing system; determine a set of non-veto events and a set of veto events related to at least one of the nodes from the list of historical events; obtain a severity index and a compliance factor for each event of the set of non-veto events; and generate and output a confidence measure for the at least one of the nodes based on the set of veto events, the severity index, and the compliance factor.

8. The computing device of claim 7, wherein the confidence measure indicates a magnitude of an impact on a performance of the distributed computing system if the at least one of the nodes takes over from another one of the nodes.

9. The computing device of claim 7, wherein the severity index of an event of the set of non-veto events indicates a magnitude of performance impact on the distributed computing system due to the occurrence of the event.

10. The computing device of claim 7, wherein the compliance factor an event of the set of non-veto events indicates a deviation of the event from an expected behavior of the event.

11. The computing device of claim 7, wherein the set of non-veto events comprises events that have an adverse impact on computing resources of the distributed computing system upon takeover by the at least one of the nodes of another one of the nodes and the set of non-veto events excludes events that cause the takeover to fail.

12. The computing device of claim 7 wherein the processor is further configured to execute the machine executable code to cause the processor to: determine when the confidence measure exceeds a specified threshold; and generate and output a notification indicating a value of the confidence measure, when the determining indicates that the confidence measure exceeds the specified threshold, wherein the notification includes a list of tasks to be performed to change the confidence measure with respect to the specified threshold.

13. A non-transitory machine readable medium having stored thereon instructions for detecting high availability readiness comprising executable code which when executed by at least one machine, causes the machine to: receive a list of historical events related to a high availability pair comprising a first node and a second node of a distributed computing system; determine a set of non-veto events and a set of veto events related to at least one of the nodes from the list of historical events; obtain a severity index and a compliance factor for each event of the set of non-veto events; and generate and output a confidence measure for the at least one of the nodes based on the set of veto events, the severity index, and the compliance factor.

14. The non-transitory machine readable medium of claim 13, wherein the confidence measure indicates a magnitude of an impact on a performance of the distributed computing system if the at least one of the nodes takes over from another one of the nodes.

15. The non-transitory machine readable medium of claim 13, wherein the severity index of an event of the set of non-veto events indicates a magnitude of performance impact on the distributed computing system due to the occurrence of the event.

16. The non-transitory machine readable medium of claim 13, wherein the compliance factor an event of the set of non-veto events indicates a deviation of the event from an expected behavior of the event.

17. The non-transitory machine readable medium of claim 13, wherein the set of non-veto events comprises events that have an adverse impact on computing resources of the distributed computing system upon takeover by the at least one of the nodes of another one of the nodes and the set of non-veto events excludes events that cause the takeover to fail.

18. The non-transitory machine readable medium of claim 13 wherein the executable code, when executed by the machine, further causes the machine to: determine when the confidence measure exceeds a specified threshold; and generate and output a notification indicating a value of the confidence measure, when the determining indicates that the confidence measure exceeds the specified threshold, wherein the notification includes a list of tasks to be performed to change the confidence measure with respect to the specified threshold.
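The steps recited in claims 1 and 6 can be sketched as a small pipeline. This is only an illustration under stated assumptions: the claims say the confidence measure is based on the veto events, the severity index, and the compliance factor, but they do not give a formula. The multiplicative penalty used here, the 0-to-1 range for both factors, the `Event` fields, and the event names in the usage example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One historical event; all field names are assumptions for illustration."""
    name: str
    node: str
    is_veto: bool
    severity_index: float = 0.0     # assumed range 0..1 (performance impact)
    compliance_factor: float = 0.0  # assumed range 0..1 (deviation from expected)


def split_events(events: Iterable[Event], node: str) -> Tuple[List[Event], List[Event]]:
    """Partition a node's historical events into veto and non-veto sets."""
    relevant = [e for e in events if e.node == node]
    veto = [e for e in relevant if e.is_veto]
    non_veto = [e for e in relevant if not e.is_veto]
    return veto, non_veto


def confidence_measure(events: Iterable[Event], node: str) -> float:
    """Return a percentage; the formula is an assumed stand-in, not the patent's."""
    veto, non_veto = split_events(events, node)
    if veto:
        return 0.0  # any veto event means the takeover is expected to fail
    cm = 100.0
    for e in non_veto:
        penalty = e.severity_index * e.compliance_factor
        cm *= 1.0 - min(max(penalty, 0.0), 1.0)  # clamp the penalty to [0, 1]
    return cm


def build_notification(cm: float, threshold: float, tasks: List[str]) -> Optional[str]:
    """Claim 6 wording: notify when the measure exceeds the specified threshold."""
    if cm > threshold:
        return (f"Confidence measure {cm:.1f}% exceeds threshold {threshold:.1f}%. "
                "Tasks: " + "; ".join(tasks))
    return None


if __name__ == "__main__":
    history = [
        Event("disk_shelf_degraded", "node-b", False, 0.5, 0.5),  # hypothetical
        Event("high_cpu_utilization", "node-b", False, 0.2, 0.5),  # hypothetical
    ]
    cm = confidence_measure(history, "node-b")
    print(build_notification(cm, 50.0, ["replace degraded shelf", "rebalance load"]))
```

The comparison direction in `build_notification` follows the claim text literally; a deployment that wants alerts for low readiness would presumably define its threshold accordingly.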

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10055268B2 cover?
Technology is disclosed for determining high availability readiness of a distributed computing system (“system”). A confidence measure (CM) can be computed for a particular controller in the system to determine whether a takeover by the particular controller from a first controller would be successful. The CM can be a percentage value. A CM of 0% indicates that a takeover would be a failure, wh…
Who is the assignee on this patent?
NetApp Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/008. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 21, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).