Automating the production of runbook workflows
US-9891971-B1 · Feb 13, 2018 · US
US10275331B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10275331-B1 |
| Application number | US-201816201660-A |
| Country | US |
| Kind code | B1 |
| Filing date | Nov 27, 2018 |
| Priority date | Nov 27, 2018 |
| Publication date | Apr 30, 2019 |
| Grant date | Apr 30, 2019 |
Official abstract text for this publication.
Disclosed are hardware and techniques for testing computer processes in a network system by simulating computer process faults and identifying risk associated with correcting the simulated fault and identifying computer processes that may depend on the corrected computer process. The interdependent computer processes in a network may be determined by evaluating a risk matrix having a risk score and non-functional requirement scores. An analysis of the risk score and non-functional requirement score accounts for interdependencies between computer processes and identified corrective actions that may be used to determine an optimal network environment. The optimal network environment may be updated dynamically based on changing computer process interdependencies and the determined risk and robustness scores.
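As a rough illustration of the scoring idea in the abstract, the sketch below models a health snapshot holding a robustness score and a risk score per process, and checks whether every process clears both thresholds. The class names, the 0 to 1 score scale, the threshold values, and the process names are all assumptions made for illustration; the patent text shown here does not specify them.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


@dataclass(frozen=True)
class ProcessHealth:
    """Scores for one process (hypothetical 0..1 scale)."""
    robustness: float  # higher means healthier
    risk: float        # higher means riskier


@dataclass
class Snapshot:
    """Per-process scores captured at one point in time."""
    label: str
    scores: Dict[str, ProcessHealth]
    taken_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def is_optimal(snapshot: Snapshot, max_risk: float, min_robustness: float) -> bool:
    """True only if every process is below the risk threshold
    and above the robustness threshold."""
    return all(
        h.risk < max_risk and h.robustness > min_robustness
        for h in snapshot.scores.values()
    )


# Hypothetical example data: two processes before and after a simulated break.
pre = Snapshot("pre-breakage", {
    "auth": ProcessHealth(robustness=0.9, risk=0.1),
    "billing": ProcessHealth(robustness=0.8, risk=0.2),
})
broken = Snapshot("simulation result", {
    "auth": ProcessHealth(robustness=0.4, risk=0.7),
    "billing": ProcessHealth(robustness=0.8, risk=0.2),
})

print(is_optimal(pre, max_risk=0.5, min_robustness=0.6))
print(is_optimal(broken, max_risk=0.5, min_robustness=0.6))
```

Requiring every process to pass both checks mirrors the "for each of the plurality of computer-implemented processes" wording; a weighted or aggregate score would be a different design.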
Opening claim text (preview).
What is claimed is:

1. A method, comprising:
monitoring, by a monitoring component, a simulation instance of a plurality of computer-implemented processes operating in a network environment of an enterprise;
generating a pre-breakage snapshot of a process health of each of the plurality of computer-implemented processes, wherein the pre-breakage snapshot, for each computer-implemented process of the plurality of computer-implemented processes, includes: a robustness score indicating a process health of each computer-implemented process of the plurality of computer-implemented processes, and a risk score indicating a threshold between automated correction and manual correction of a degrading system;
generating, by a simulation processing component executing the simulation instance, a simulated break event flag indicating a process volatility in a test computer-implemented process of a plurality of computer-implemented processes;
generating, by the monitoring component in response to the simulated break event flag generated by the simulation processing component, a simulation result snapshot of process health of each of the plurality of computer-implemented processes, wherein the simulation result snapshot includes an updated robustness score and an updated risk score for each computer-implemented process of the plurality of computer-implemented processes;
accessing, by a rules engine processing component, a library of runbooks, wherein: each runbook in the library of runbooks addresses a respective computer-implemented process of the plurality of computer-implemented processes operating in the network, and each respective runbook includes a plurality of response strategies, wherein each final response strategy of the plurality of response strategies is implementable to cure specific process volatilities of the respective computer-implemented process addressed by the respective runbook;
identifying, based on the simulated break event flag, a specific runbook in the library of runbooks that addresses process volatilities of the test computer-implemented process;
locating a final response strategy in the specific runbook that cures the indicated process volatility of the test computer-implemented process;
selecting the located final response to cure the indicated process volatility of the test computer-implemented process to be implemented in the simulation instance of the network environment;
simulating, by the simulation processing component, implementation of the located final response strategy in the network environment to cure the indicated process volatility of the test computer-implemented process;
generating, by the monitoring component in response to the simulated implementation of the final response strategy, a cure result snapshot of process health of each of the plurality of computer-implemented processes;
evaluating the pre-breakage snapshot, the simulation result snapshot, and the cure result snapshot with reference to one another; and
based on results of the evaluation, identifying a network environment architecture as an optimal network architecture that cures the process volatility of the test computer-implemented process, wherein the optimal network architecture has a below-threshold risk score for each of the plurality of computer-implemented processes operated by the enterprise and an above-threshold robustness score for each of the plurality of computer-implemented processes.

2. The method of claim 1, wherein generating the pre-breakage snapshot comprises:
receiving, from the monitoring component coupled to each of the computer-implemented processes in the plurality of computer-implemented processes, a list of break event flags for each computer-implemented process of the plurality of computer-implemented processes;
identifying respective break event symptoms for each of the break event flags in the list of break event flags, generating for each identified respective break event symptom a computing environment indicator identifying the respective break event symptom, a code environment indicator identifying the respective break event symptom, and a response strategy corresponding to the respective break event symptom;
generating, by the rules engine, a robustness score for each respective computer-implemented process of the plurality of computer-implemented processes, wherein the robustness score for each respective computer-implemented process is based on the identified break event symptom, the computing environment indicator, the code environment indicator, the respective break event symptom corresponding to the respective computer-implemented process and the response strategy corresponding to the respective break event symptom of the respective computer-implemented process;
generating, by the rules engine, a risk score for each computer-implemented process of the plurality of computer-implemented processes based on the identified break event symptom, and the response strategy corresponding to the respective break event symptom of the respective computer-implemented process; and
storing the generated robustness and risk scores of each computer-implemented process with a timestamp of when the pre-breakage snapshot was taken in a data structure.

3. The method of claim 1, wherein generating the simulation result snapshot after application of located final response comprises:
in response to application of the located final response to the simulation instance, generating based on inputs received from the monitoring component a list of break event flags for each computer-implemented process of the plurality of computer-implemented processes, wherein the monitoring component monitors the simulation instance via a coupling to each of the computer-implemented processes in the plurality of computer-implemented processes;
identifying respective break event symptoms for all the break event flags in the list of break event flags, determining, for each identified respective break event symptom, a computing environment indicator corresponding to the respective break event symptom, a code environment indicator corresponding to the respective break event symptom, and a final response strategy corresponding to the respective break event symptom;
storing the break event symptom, the determined computing environment indicator, the determined code environment indicator and determined fix event into a data structure;
generating, by the rules engine processing circuit, a simulation robustness score for each respective computer-implemented process of the plurality of computer-implemented processes based on the identified break event symptom, the determined computing environment indicator, the determined code environment indicator, the break event symptom corresponding to the respective computer-implemented process and the final response strategy corresponding to the break event symptom of the respective computer-implemented process;
generating, by the rules engine processing circuit, a risk score for each computer-implemented process of the plurality of computer-implemented processes based on the identified break event symptom, and the final response strategy corresponding to the break event symptom of the respective computer-implemented process; and
storing the generated robustness and risk scores of each computer-implemented process with a timestamp indicating when the simulation result snapshot was taken in the data structure.

4. The method of claim 1, further comprising:
in response to applying the located final response to the simulation instance of the network environment, generating a modified robustness score of the updated robustness score and a modified risk score of the updated risk score for each computer-implemented process of the plurality of computer-implemented processes; and
storing each of th
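
The runbook lookup recited in claim 1 (identify the runbook for the test process from the break event flag, then locate the final response strategy for the indicated volatility) might be sketched as follows. This is a minimal illustration, not the patented implementation: `BreakEventFlag`, `Runbook`, `select_final_response`, the volatility labels, and the choice to treat the last entry of an ordered strategy list as the "final" response strategy are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BreakEventFlag:
    """Simulated break event (field names are hypothetical)."""
    process: str     # the test process in which the break was simulated
    volatility: str  # label for the indicated process volatility


@dataclass
class Runbook:
    """Response strategies for one process, keyed by volatility label."""
    process: str
    strategies: Dict[str, List[str]]

    def final_response(self, volatility: str) -> Optional[str]:
        # Assumption: strategies are ordered and the last one is "final".
        steps = self.strategies.get(volatility)
        return steps[-1] if steps else None


def select_final_response(
    library: Dict[str, Runbook], flag: BreakEventFlag
) -> Optional[str]:
    """Find the runbook for the flagged process, then its final
    response strategy for the flagged volatility (None if absent)."""
    runbook = library.get(flag.process)
    if runbook is None:
        return None
    return runbook.final_response(flag.volatility)


# Hypothetical library with a single runbook.
library = {
    "auth": Runbook(
        process="auth",
        strategies={"memory_leak": ["restart worker", "roll back release"]},
    )
}
flag = BreakEventFlag(process="auth", volatility="memory_leak")
print(select_final_response(library, flag))
```

Returning `None` when no runbook or strategy matches keeps the lookup separate from whatever escalation path (for example, manual correction) a real system would take.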
Performance evaluation by simulation · CPC title
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40) · CPC title
in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title
by simulating additional hardware, e.g. fault simulation · CPC title