US2018270102A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2018270102-A1 |
| Application number | US-201715459879-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 15, 2017 |
| Priority date | Mar 15, 2017 |
| Publication date | Sep 20, 2018 |
| Grant date | — |
Official abstract text for this publication.
One or more processors of a device execute instructions to identify a set of servers that includes a first server and a second server in a plurality of data centers; send a first list of servers to the first server; send a second list of servers to the second server; receive a first set of response data from the first server, the first set of response data indicating responsiveness of the servers in the first list of servers; receive a second set of response data from the second server, the second set of response data indicating responsiveness of the servers in the second list of servers; analyze the first set of response data and the second set of response data; and based on the analysis, generate an alert that indicates a network error in a data center.
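The flow in the abstract (distribute server lists, collect responsiveness data, analyze, alert) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the publication: the `Server` record, the `build_probe_list` policy (every same-rack peer plus one representative from each other rack and each other data center), the `(sent, received)` response format, and the 50% drop-rate threshold.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    # Hypothetical record; the field names are illustrative only.
    name: str
    rack: str
    data_center: str


def build_probe_list(source, servers):
    """Assumed policy: all same-rack peers, plus one server from each
    other rack in the same data center and one from each other data center."""
    targets, seen_racks, seen_dcs = [], set(), set()
    for s in servers:
        if s == source:
            continue
        if s.data_center != source.data_center:
            if s.data_center not in seen_dcs:
                seen_dcs.add(s.data_center)
                targets.append(s)
        elif s.rack == source.rack:
            targets.append(s)
        elif s.rack not in seen_racks:
            seen_racks.add(s.rack)
            targets.append(s)
    return targets


def drop_rates(responses):
    """responses maps source name -> {target name: (sent, received)}.
    Returns the fraction of probes lost per target, across all sources."""
    sent, received = defaultdict(int), defaultdict(int)
    for per_target in responses.values():
        for target, (n_sent, n_received) in per_target.items():
            sent[target] += n_sent
            received[target] += n_received
    return {t: 1 - received[t] / sent[t] for t in sent if sent[t] > 0}


def generate_alerts(servers, responses, threshold=0.5):
    """Emit one alert per target whose drop rate meets the assumed threshold."""
    by_name = {s.name: s for s in servers}
    alerts = []
    for name, rate in sorted(drop_rates(responses).items()):
        if rate >= threshold:
            dc = by_name[name].data_center
            alerts.append(
                f"network error in data center {dc}: {name} drop rate {rate:.0%}"
            )
    return alerts
```

In this sketch a controller would call `build_probe_list` once per server, send each list out, and pass the collected responses to `generate_alerts`. Aggregating per target across all reporting sources is one simple choice; the claims leave the analysis method open.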
Opening claim text (preview).
What is claimed is:
1. A device comprising: a memory storage comprising instructions; a network interface connected to a network; and one or more processors in communication with the memory storage, wherein the one or more processors execute the instructions to perform: identifying a set of servers in a plurality of data centers, the set of servers including a first server and a second server; sending, via the network interface, a first list of servers in the set of servers to the first server; sending, via the network interface, a second list of servers in the set of servers to the second server; receiving, via the network interface, a first set of response data from the first server, the first set of response data indicating responsiveness of the servers in the first list of servers; receiving, via the network interface, a second set of response data from the second server, the second set of response data indicating responsiveness of the servers in the second list of servers; analyzing the first set of response data and the second set of response data; and based on the analysis, generating an alert that indicates a network error in a data center of the plurality of data centers.
2. The device of claim 1, wherein the analyzing of the first set of response data and the second set of response data comprises: determining a drop rate for a third server in the first list of servers.
3. The device of claim 1, wherein the analyzing of the first set of response data and the second set of response data comprises: determining a failure state for a third server in the first list of servers.
4. The device of claim 1, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers; and determining that all servers in the set of servers corresponding to sibling nodes of a node corresponding to a third server in the set of servers report dropped packets to the third server.
5. The device of claim 1, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers and each other node of the tree corresponds to a distinct subset of the set of servers; and determining that a node in the tree data structure and all children of the node are in a failure state.
6. The device of claim 1, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers and each other node of the tree corresponds to a distinct subset of the set of servers; and determining that a node in the tree is in a failure state and that no children of the node are in the failure state.
7. The device of claim 1, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers and each other node of the tree corresponds to a distinct subset of the set of servers; and determining that a node is not in a failure state and that at least one child of the node is in the failure state.
8. The device of claim 1, wherein the one or more processors further perform: creating the first list of servers by including each server in a same rack as the first server.
9. The device of claim 1, wherein the one or more processors further perform: creating the first list of servers by including a third server, based on the third server being in a different rack than the first server.
10. The device of claim 1, wherein the one or more processors further perform: creating the first list of servers by including a third server, based on the third server being in a different data center than the first server.
11. A computer-implemented method for automated fault detection in data center networks comprising: identifying, by one or more processors, a set of servers in a plurality of data centers, the set of servers including a first server and a second server; sending, via a network interface, a first list of servers in the set of servers to the first server; sending, via the network interface, a second list of servers in the set of servers to the second server; receiving, via the network interface, a first set of response data from the first server, the first set of response data indicating responsiveness of the servers in the first list of servers; receiving, via the network interface, a second set of response data from the second server, the second set of response data indicating responsiveness of the servers in the second list of servers; analyzing, by the one or more processors, the first set of response data and the second set of response data; and based on the analysis, generating an alert that indicates a network error in a data center of the plurality of data centers.
12. The computer-implemented method of claim 11, wherein the analyzing of the first set of response data and the second set of response data comprises: determining a drop rate for a third server in the first list of servers.
13. The computer-implemented method of claim 11, wherein the analyzing of the first set of response data and the second set of response data comprises: determining a failure state for a third server in the first list of servers.
14. The computer-implemented method of claim 11, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers; and determining that all servers in the set of servers corresponding to sibling nodes of a node corresponding to a third server in the set of servers report dropped packets to the third server.
15. The computer-implemented method of claim 11, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers and each other node of the tree corresponds to a distinct subset of the set of servers; and determining that a node in the tree data structure and all children of the node are in a failure state.
16. The computer-implemented method of claim 11, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers and each other node of the tree corresponds to a distinct subset of the set of servers; and determining that a node in the tree is in a failure state and that no children of the node are in the failure state.
17. The computer-implemented method of claim 11, wherein the analyzing of the first set of response data and the second set of response data comprises: using a tree data structure in which each leaf node of the tree corresponds to a server of the set of servers and each other node of the tree corresponds to a distinct subset of the set of servers; and determining that a node is not in a failure state and that at least one child of the node is in the failure state.
18. A non-transitory computer-readable medium storing computer instructions for a
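
The tree-based conditions in claims 4 to 7 (and the parallel claims 14 to 17) can be sketched as follows. This is an illustration under stated assumptions, not the publication's implementation: the `Node` class, the pattern labels, and the `drops` set of `(source, target)` pairs are hypothetical, and each node's `failed` flag is taken as a given input because the claims do not say how a failure state is computed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Leaves stand for servers; other nodes stand for subsets of servers
    # (for example a rack or a data center). Names are illustrative.
    name: str
    failed: bool = False
    children: List["Node"] = field(default_factory=list)


def classify(node):
    """Match a non-leaf node against the three patterns in claims 5 to 7."""
    failed_children = [c for c in node.children if c.failed]
    if node.failed and len(failed_children) == len(node.children):
        return "node and all children failed"        # claim 5 pattern
    if node.failed and not failed_children:
        return "node failed, no child failed"        # claim 6 pattern
    if not node.failed and failed_children:
        return "node healthy, at least one child failed"  # claim 7 pattern
    return "no pattern"


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def report(root):
    """Classify every non-leaf node in the tree."""
    return {n.name: classify(n) for n in walk(root) if n.children}


def siblings_all_report_drops(parent, target, drops):
    """Claim 4 pattern: every sibling of `target` under `parent` reports
    dropped packets to `target`. `drops` is a set of (source, target) pairs."""
    siblings = [c.name for c in parent.children if c.name != target]
    return bool(siblings) and all((s, target) in drops for s in siblings)
```

A caller might build one tree per deployment, set `failed` from whatever failure test it uses, and read `report(root)` to decide whether a fault looks like it sits at a single server, a rack, or a higher level.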
in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46) · CPC title
using logs of notifications; Post-processing of notifications · CPC title
by checking connectivity · CPC title
Errors, e.g. transmission errors · CPC title
Management of faults, events, alarms or notifications · CPC title