Systems and methods for monitoring application health in a distributed architecture

US2022012143A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2022012143-A1
Application number: US-202016925862-A
Country: US
Kind code: A1
Filing date: Jul 10, 2020
Priority date: Jul 10, 2020
Publication date: Jan 13, 2022
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing device configured for monitoring and analyzing health of a distributed computer system having a plurality of interconnected system components. The computing device tracks communication between the system components and monitors for an alert indicating an error in the communication in the distributed computer system. In response to the error, the computing device receives a health log from each of the system components defining an aggregate health log being in a standardized format indicating messages communicated between the system components. The computing device further receives network infrastructure information defining relationships between the system components and characterizing dependency information; and, automatically determines, based on the aggregate health log and the network infrastructure information, a particular component originating the error and associated dependent components from the system components affected.
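As an illustration only, the determination step described in the abstract can be sketched in a few lines of Python. The entry fields (`component`, `status`), the shape of the dependency map, and the heuristic itself (treat as the origin any failing component none of whose own dependencies is failing, then walk the dependency map backwards to collect affected components) are assumptions made for this sketch; the abstract does not specify the algorithm at this level of detail.

```python
from collections import deque


def find_origin_and_affected(aggregate_log, depends_on):
    """Hypothetical sketch: locate likely error origins and their dependents.

    aggregate_log: list of dicts with assumed keys "component" and "status".
    depends_on: dict mapping each component to the components it calls.
    """
    failing = {e["component"] for e in aggregate_log if e["status"] == "error"}

    # Assumed heuristic: an origin is a failing component whose own
    # dependencies are all healthy.
    origins = sorted(
        c for c in failing
        if not any(d in failing for d in depends_on.get(c, []))
    )

    # Invert the dependency map so we can walk from a component to its callers.
    dependents = {}
    for comp, deps in depends_on.items():
        for d in deps:
            dependents.setdefault(d, set()).add(comp)

    # Breadth-first walk over callers to collect every affected component.
    affected = set()
    queue = deque(origins)
    while queue:
        current = queue.popleft()
        for caller in dependents.get(current, ()):
            if caller not in affected and caller not in origins:
                affected.add(caller)
                queue.append(caller)
    return origins, sorted(affected)


# Hypothetical four-component system.
depends_on = {
    "gateway": ["orders"],
    "orders": ["payments", "inventory"],
    "payments": [],
    "inventory": [],
}
aggregate_log = [
    {"component": "gateway", "status": "error"},
    {"component": "orders", "status": "error"},
    {"component": "payments", "status": "error"},
    {"component": "inventory", "status": "ok"},
]
origins, affected = find_origin_and_affected(aggregate_log, depends_on)
print(origins, affected)
```

In this made-up example, `payments` is reported as the origin because it fails while having no failing dependency, and `orders` and `gateway` are reported as affected because they depend on it directly or transitively.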

First claim

Opening claim text (preview).

What is claimed is:

1. A computing device for monitoring and analyzing health of a distributed computer system having a plurality of interconnected system components, the computing device having a processor coupled to a memory, the memory storing instructions which when executed by the processor configure the computing device to: track communication between the system components and monitor for an alert indicating an error in the communication in the distributed computer system, upon detecting the error: receive a health log from each of the system components together defining an aggregate health log, each health log being in a standardized format indicating messages communicated between the system components; receive, from a data store, network infrastructure information defining one or more relationships for connectivity and communication flow between the system components, the relationships characterizing dependency information between the system components; and automatically determine, based on the aggregate health log and the network infrastructure information, a particular component of the system components originating the error and associated dependent components from the system components affected.

2. The device of claim 1, wherein each health log further comprises a common identifier for tracing a route of the messages communicated for a transaction having the error.

3. The device of claim 2, further configured to obtain health monitoring rules comprising data integrity information for pre-defined communications between the system components from the data store, the health monitoring rules for verifying whether each of the health logs complies with the data integrity information.

4. The device of claim 3, further comprising the health monitoring rules further defined based on historical error patterns for the distributed computer system associating a set of traffic flows for the messages between the system components and potentially occurring in each of the health logs to a corresponding error type.

5. The device of claim 2, further configured to: determine from the dependency information indicating which of the system components are dependent on one another for operations performed in the distributed computer system, an impact of the error originated by the particular component on the associated dependent components.

6. The device of claim 5, further comprising upon detecting the alert: displaying the alert on a user interface of a client application for the device, the alert based on the particular component originating the error determined from the aggregate health log.

7. The device of claim 6, further comprising: displaying on the user interface along with the alert, the associated dependent components to the particular component.

8. The device of claim 1, wherein the standardized format comprises a JSON format.

9. The device of claim 1, wherein the system components are APIs (application programming interfaces) on one or more connected computing devices and the health log is an API log for logging activity for the respective API in communication with other APIs and related to the error.

10. The device of claim 1, wherein the processor configuring the computing device to automatically determine origination of the error further comprises: comparing each of the health logs in the aggregate health log to the other health logs in response to the relationships in the network infrastructure information.

11. A method implemented by a computing device, the method for monitoring and analyzing health of a distributed computer system having a plurality of interconnected system components, the method comprising: tracking communication between the system components and monitor for an alert indicating an error in the communication in the distributed computer system, upon detecting the error: receiving a health log from each of the system components together defining an aggregate health log, each health log being in a standardized format indicating messages communicated between the system components; receiving, from a data store, network infrastructure information defining one or more relationships for connectivity and communication flow between the system components, the relationships characterizing dependency information between the system components; and automatically determining, based on the aggregate health log and the network infrastructure information, a particular component of the system components originating the error and associated dependent components from the system components affected.

12. The method of claim 11 wherein each health log further comprises a common identifier for tracing a route of the messages communicated for a transaction having the error.

13. The method of claim 12, further comprising obtaining health monitoring rules comprising data integrity information for pre-defined communications between the system components from the data store, the health monitoring rules for verifying whether each of the health logs complies with the data integrity information.

14. The method of claim 13, further comprising the health monitoring rules further defined based on historical error patterns for the distributed computer system associating a set of traffic flows for the messages between the system components and potentially occurring in each of the health logs to a corresponding error type.

15. The method of claim 12, further comprising: determining from the dependency information indicating which of the system components are dependent on one another for operations performed in the distributed computer system, an impact of the error originated by the particular component on the associated dependent components.

16. The method of claim 15, further comprising upon detecting the alert: displaying the alert on a user interface of a client application for the device, the alert based on the particular component originating the error determined from the aggregate health log.

17. The method of claim 16, further comprising: displaying on the user interface along with the alert, the associated dependent components to the particular component.

18. The method of claim 11, wherein the standardized format comprises a JSON format.

19. The method of claim 11, wherein the system components are APIs (application programming interfaces) on one or more connected computing devices and the health log is an API log for logging activity for the respective API in communication with other APIs and related to the error.

20. The method of claim 11, wherein automatically determining origination of the error further comprises: comparing each of the health logs in the aggregate health log to the other health logs in response to the relationships in the network infrastructure information.

21. A computer readable medium comprising a non-transitory device storing instructions and data, which when executed by a processor of a computing device, the processor coupled to a memory, configure the computing device to: track communication between system components of a distributed computer system having a plurality of interconnected system components and monitor for an alert indicating an error in the communication in the distributed computer system, upon detecting the error: receive a health log from each of the system components together defining an aggregate health log, each health log being in a standardized format indicating messages
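To make the dependent claims on JSON-formatted logs, the common identifier, and the health monitoring rules more concrete, here is a minimal Python sketch. Everything specific in it is hypothetical: the field names (`trace_id`, `source`, `target`, `timestamp`, `status`), the idea that a data integrity rule is a list of required fields, and the sample log contents are assumptions for illustration, not details taken from the claims.

```python
import json

# Hypothetical data integrity rule: every log entry must carry these fields.
REQUIRED_FIELDS = ("trace_id", "source", "target", "timestamp", "status")


def parse_health_log(raw):
    """Parse one component's health log, assumed to be a JSON array of entries."""
    return json.loads(raw)


def violations(entries, required=REQUIRED_FIELDS):
    """Return the indices of entries that are missing any required field."""
    return [i for i, e in enumerate(entries) if any(f not in e for f in required)]


def trace_route(aggregate, trace_id):
    """Return (source, target, status) hops for one transaction, in time order.

    Entries without a timestamp are skipped because they cannot be ordered.
    """
    hops = [
        e for e in aggregate
        if e.get("trace_id") == trace_id and "timestamp" in e
    ]
    hops.sort(key=lambda e: e["timestamp"])
    return [(e.get("source"), e.get("target"), e.get("status")) for e in hops]


# Made-up logs from two components; the last entry lacks a "status" field.
gateway_log = (
    '[{"trace_id": "t-1", "source": "gateway", "target": "orders",'
    ' "timestamp": 1, "status": "ok"}]'
)
orders_log = (
    '[{"trace_id": "t-1", "source": "orders", "target": "payments",'
    ' "timestamp": 2, "status": "error"},'
    ' {"trace_id": "t-2", "source": "orders", "target": "inventory",'
    ' "timestamp": 3}]'
)

aggregate = parse_health_log(gateway_log) + parse_health_log(orders_log)
bad_entries = violations(aggregate)
route = trace_route(aggregate, "t-1")
print(bad_entries)
print(route)
```

Grouping by a shared identifier is what lets a single transaction be followed across components; here the route for `t-1` ends at the hop from `orders` to `payments`, which is where the error status first appears.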

Assignees

Toronto Dominion Bank

Inventors

Classifications

  • Monitoring of systems including the internet · CPC title

  • Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title

  • G06F11/079 · Primary

    Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title

  • where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title

  • Means for error signaling, e.g. using interrupts, exception flags, dedicated error registers · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022012143A1 cover?
A computing device configured for monitoring and analyzing health of a distributed computer system having a plurality of interconnected system components. The computing device tracks communication between the system components and monitors for an alert indicating an error in the communication in the distributed computer system. In response to the error, the computing device receives a health lo…
Who is the assignee on this patent?
Toronto Dominion Bank
What technology area does this patent fall under?
Primary CPC classification G06F11/079. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 13, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).