Automated Model Based Root Cause Analysis
US-2018032941-A1 · Feb 1, 2018 · US
US12099438B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12099438-B2 |
| Application number | US-202318140712-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 28, 2023 |
| Priority date | Jun 27, 2019 |
| Publication date | Sep 24, 2024 |
| Grant date | Sep 24, 2024 |
Official abstract text for this publication.
Techniques for monitoring operating statuses of an application and its dependencies are provided. A monitoring application may collect and report the operating status of the monitored application and each dependency. Through use of existing monitoring interfaces, the monitoring application can collect operating status without requiring modification of the underlying monitored application or dependencies. The monitoring application may determine a problem service that is a root cause of an unhealthy state of the monitored application. Dependency analyzer and discovery crawler techniques may automatically configure and update the monitoring application. Machine learning techniques may be used to determine patterns of performance based on system state information associated with performance events and provide health reports relative to a baseline status of the monitored application. Also provided are techniques for testing a response of the monitored application through modifications to API calls. Such tests may be used to train the machine learning model.
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: detecting, by a testing agent and during a testing period associated with a first application, a first call in a computing system from the first application to a first Application Programming Interface (API); modifying, by the testing agent, a response to the first call, wherein the modified response simulates an artificial unhealthy operating status of the first API; returning the modified response to the first application in response to the first call; and determining an impact of the modified response to the first call on the operating status of the first application, wherein a second call to the first API is unaffected by modifying the response to the first call.
2. The method of claim 1, wherein the modified response simulates an artificial unhealthy operating status of the first API by simulating a result with an artificially high response latency.
3. The method of claim 1, wherein the modified response simulates an artificial unhealthy operating status of the first API by simulating a result with an artificially high error rate.
4. The method of claim 1, wherein the modified response simulates an artificial unhealthy operating status of the first API by simulating a result with an artificially high likelihood of non-response.
5. The method of claim 1, wherein determining the impact of the modified response on the operating status of the first application comprises: determining, by a monitoring application, the operating status of the first application using one or more monitoring interfaces; and determining that the first application has an unhealthy operating status based on at least one metric provided by a first monitoring interface associated with the first application satisfying at least one unhealthy operating status threshold.
6. The method of claim 1, wherein determining the impact of the modified response on the operating status of the first application comprises: determining that the first application was able to retrieve information associated with the first API from another source.
7. The method of claim 1, wherein determining the impact of the modified response on the operating status of the first application comprises: determining that the first application was able to partially complete processing despite not receiving the information requested from the first API.
8. The method of claim 1, further comprising: caching, by the testing agent, the unmodified response to the first call; determining, by a monitoring application, whether the first application was able to recover from the modified response to the first call; and based on determining that the first application was not able to recover, causing the computing system to return the unmodified response to the first call to the first application.
9. A non-transitory computer readable medium storing instructions that, when executed by one or more processors, cause a computing device to: detect, by a testing agent and during a testing period associated with a first application, a first call in a computing system from the first application to a first Application Programming Interface (API); modify, by the testing agent, a response to the first call, wherein the modified response simulates an artificial unhealthy operating status of the first API; return the modified response to the first application in response to the first call; and determine an impact of the modified response to the first call on the operating status of the first application, wherein a second call to the first API is unaffected by modifying the response to the first call.
10. The non-transitory computer readable medium of claim 9, wherein the modified response simulates an artificial unhealthy operating status of the first API by simulating a result with an artificially high response latency.
11. The non-transitory computer readable medium of claim 9, wherein the modified response simulates an artificial unhealthy operating status of the first API by simulating a result with an artificially high error rate.
12. The non-transitory computer readable medium of claim 9, wherein the modified response simulates an artificial unhealthy operating status of the first API by simulating a result with an artificially high likelihood of non-response.
13. The non-transitory computer readable medium of claim 9, wherein the instructions cause the computing device to determine the impact of the modified response on the operating status of the first application by causing the computing device to: determine, by a monitoring application, the operating status of the first application using one or more monitoring interfaces; and determine that the first application has an unhealthy operating status based on at least one metric provided by a first monitoring interface associated with the first application satisfying at least one unhealthy operating status threshold.
14. The non-transitory computer readable medium of claim 9, wherein the instructions cause the computing device to determine the impact of the modified response on the operating status of the first application by causing the computing device to: determine that the first application was able to retrieve information associated with the first API from another source.
15. The non-transitory computer readable medium of claim 9, wherein the instructions cause the computing device to determine the impact of the modified response on the operating status of the first application by causing the computing device to: determine that the first application was able to partially complete processing despite not receiving the information requested from the first API.
16. The non-transitory computer readable medium of claim 9, wherein the instructions further cause the computing device to: cache, by the testing agent, the unmodified response to the first call; determine, by a monitoring application, whether the first application was able to recover from the modified response to the first call; and based on determining that the first application was not able to recover, cause the computing device to return the unmodified response to the first call to the first application.
17. A monitoring system comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the monitoring system to: detect, by a testing agent and during a testing period associated with a first application, a first call in a computing system from the first application to a first Application Programming Interface (API); modify, by the testing agent, a response to the first call, wherein the modified response simulates an artificial unhealthy operating status of the first API; return the modified response to the first application in response to the first call; and determine an impact of the modified response to the first call on the operating status of the first application, wherein a second call to the first API is unaffected by modifying the response to the first call.
18. The monitoring system of claim 17, wherein the instructions further cause the monitoring system to: cache, by the testing agent, the unmodified response to the first call; determine, by a monitoring application, whether the first application was able to recover from the modified response to the first call; and based on determining that the first application was not able to recover, cause the monitoring system to return the unmodified response to the first call to the first app
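The flow recited in the independent claims and in the caching claims (8, 16, 18) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: `Response`, `TestingAgent`, `run_fault_test`, `backend`, and `fragile_app` are hypothetical names, the API is modeled as a plain callable, and only the "error" and "latency" fault styles are shown. Latency is recorded as a field rather than by actually sleeping.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    status: int
    body: str
    latency_s: float = 0.0


class TestingAgent:
    """Hypothetical agent that degrades exactly one API response per test."""

    def __init__(self, api: Callable[[str], Response], fault: str = "error"):
        self.api = api
        self.fault = fault          # "error" or "latency" (assumed labels)
        self.armed = False
        self.cached: Optional[Response] = None

    def arm(self) -> None:
        """Mark the next call as the 'first call' to be modified."""
        self.armed = True

    def call(self, request: str) -> Response:
        real = self.api(request)
        if not self.armed:
            return real             # later calls pass through unmodified
        self.armed = False
        self.cached = real          # keep the unmodified response for fallback
        if self.fault == "latency":
            return Response(real.status, real.body, real.latency_s + 5.0)
        return Response(503, "", real.latency_s)


def run_fault_test(agent: TestingAgent, app: Callable) -> tuple[bool, Optional[str]]:
    """Inject one fault, observe the app, and fall back if it cannot recover."""
    agent.arm()
    outcome = app(agent.call)
    recovered = outcome is not None
    if not recovered and agent.cached is not None:
        outcome = agent.cached.body  # return the cached unmodified response
    return recovered, outcome


def backend(request: str) -> Response:
    """Stand-in for the first API."""
    return Response(200, "data:" + request)


def fragile_app(call: Callable[[str], Response]) -> Optional[str]:
    """Stand-in for the first application; it has no fallback of its own."""
    resp = call("profile")
    return resp.body if resp.status == 200 else None


agent = TestingAgent(backend)
recovered, outcome = run_fault_test(agent, fragile_app)
second = agent.call("profile")
```

In this run the application fails to recover from the simulated error, so the test harness hands back the cached unmodified body, and the second call to the same API returns a normal response. An application with its own fallback source or partial-completion path would instead return a non-`None` outcome, and `recovered` would be `True`.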
for test execution, e.g. scheduling of test suites · CPC title
where the computing system component is a software system · CPC title
Message passing systems or structures, e.g. queues · CPC title
for systems · CPC title
for test design, e.g. generating new test cases · CPC title