Prioritized fault remediation

US12306707B2 · US · B2

Patent metadata
Publication number: US-12306707-B2
Application number: US-202318159129-A
Country: US
Kind code: B2
Filing date: Jan 25, 2023
Priority date: Jan 25, 2023
Publication date: May 20, 2025
Grant date: May 20, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, computer program products, and systems are presented. The method computer program products, and systems can include, for instance: iteratively examining logging data; detecting multiple faults in a computer environment in dependence on the examining of the logging data; generating for respective ones of the detected multiple faults one or more candidate remediation to provide a set of candidate remediations for the computer environment; prioritizing remediations defining the set of candidate remediations from the generating and ordering the remediations in a remediation queue according to an order of the prioritizing; and deploying remediations according to the ordering of remediations in the remediation queue.
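
The ordering step in the abstract can be pictured with a small priority queue. The following is a minimal sketch, not the patented implementation: the names `QueuedRemediation`, `build_queue`, and `deploy_in_order`, the numeric priority scores, and the sample faults are all hypothetical, and "deploying" is reduced to returning the actions in queue order.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedRemediation:
    # Only sort_key takes part in comparisons; it stores the negated
    # priority so that heapq (a min-heap) pops the highest priority first.
    sort_key: float
    detected_at: int = field(compare=False)  # hypothetical detection tick
    fault_id: str = field(compare=False)
    action: str = field(compare=False)


def build_queue(candidates):
    """Build a remediation queue from (priority, detected_at, fault_id, action) tuples."""
    heap = []
    for priority, detected_at, fault_id, action in candidates:
        heapq.heappush(heap, QueuedRemediation(-priority, detected_at, fault_id, action))
    return heap


def deploy_in_order(heap):
    """Pop remediations in priority order; here 'deploying' just records the action."""
    order = []
    while heap:
        order.append(heapq.heappop(heap).action)
    return order


# Assumed sample data: priorities would come from the prioritizing step.
candidates = [
    (0.4, 1, "fault-A", "restart service"),
    (0.9, 2, "fault-B", "migrate VM"),
    (0.6, 3, "fault-C", "increase memory"),
]
deployed = deploy_in_order(build_queue(candidates))
print(deployed)
```

In this sketch the remediation for fault-B is deployed first even though fault-A was detected earlier, which is the situation claim 2 describes: queue position follows priority, not detection time.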

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method comprising: iteratively examining logging data; detecting multiple faults in a computer environment in dependence on the examining of the logging data; generating for respective ones of the detected multiple faults one or more candidate remediation to provide a set of candidate remediations for the computer environment; prioritizing remediations defining the set of candidate remediations from the generating and ordering the remediations defining the set of candidate remediations in a remediation queue according to an order of the prioritizing; and deploying fault remediations according to the ordering of the remediations defining the set of candidate remediations in the remediation queue, wherein the prioritizing the candidate remediations is performed in dependence on a predicted impact of a certain fault of the detected multiple faults, wherein predicting an impact of the certain fault is in dependence on historical data that comprises key performance indicator (KPI) dataset that specifies KPI changes of a certain microservice through a time window of a certain historical fault.

2. The computer implemented method of claim 1, wherein the remediation queue includes a first remediation having a higher prioritization order than a second remediation and wherein the first remediation is associated to a fault identified later in time than a fault associated to the second remediation.

3. The computer implemented method of claim 1, wherein the computer environment comprises a plurality of microservices, wherein the deploying remediations according to the ordering of remediations in the remediation queue includes migrating a first virtual machine of a first microservice of the computer environment to a new physical computing node, and reprovisioning a second virtual machine of a second microservice of the computer environment to increase a memory and CPU allocation to the second virtual machine.

4. The computer implemented method of claim 1, wherein predicting an impact of the certain fault includes querying a predictive model trained to predict impact of the certain fault on microservices external to a microservice of the certain fault, wherein the predictive model has been trained with iterations of training data which iterations of training data comprise (a) a classifier for a certain microservice of a certain historical fault having a classification in common with the certain fault, (b) a key performance indicator (KPI) dataset that specifies KPI changes of the certain microservice through a time window of the certain historical fault, and (c) a KPI dataset that specifies KPI changes of external microservices external to the certain microservice through the time window of the certain historical fault.

5. The computer implemented method of claim 1, wherein the prioritizing the candidate remediations is performed in dependence on a predicted cost associated to a certain remediation of the candidate remediations, the certain remediation associated to the certain fault, wherein predicting the impact of the certain fault includes evaluating a plurality of factors including (a) a number of physical computing nodes of a microservice associated to the certain fault, (b) a severity of the certain fault as determined using a decision data structure wherein severity of fault values are assigned based on fault classification, and wherein fault classification is performed based on impact to customers, (c) a key performance indicator (KPI) degradation factor, wherein KPIs associated to a microservice of the certain fault are evaluated, and (d) dependent resources factor, wherein impact of the certain fault on neighboring resources neighboring a resource of the certain fault is predicted using a trained predictive model trained by machine learning, wherein predicting cost associated to a certain remediation of the candidate remediations includes evaluating one or more of the following selected from the group consisting of (i) a classification of the certain remediation which specifies the certain remediation as one of a software based remediation or a hardware based remediation, (ii) a complexity factor, wherein length of time for performance of the certain remediation is ascertained by examining data of a Git data repository, and (iii) an effect factor, wherein an effect of implementing the certain remediation on customers is ascertained.

6. The computer implemented method of claim 1, wherein the generating to provide the set of candidate remediations includes, for the certain fault of the detected multiple faults using clustering analysis to identify historical faults having a threshold satisfying level of similarity with the certain fault, wherein the performing the clustering analysis includes applying multiple dimensions describing the certain fault on a clustering map with historical faults obtained from a data repository, identifying a set of historical faults having a threshold level of similarity with the certain fault based on Euclidian distance, and discovering applied remediations applied with respect to the identified set of historical faults.

7. The computer implemented method of claim 1, wherein the generating to provide the set of candidate remediations includes, for the certain fault of the detected multiple faults using clustering analysis to identify historical faults having a threshold satisfying level of similarity with the certain fault, wherein the performing the clustering analysis includes applying first, second and third dimensions describing the certain fault on a clustering map with historical faults obtained from a data repository, identifying a set of historical faults having a threshold level of similarity with the certain fault based on Euclidian distance, and discovering applied remediations applied with respect to the identified set of historical faults, wherein the first dimension describing the certain fault is a first key performance indicator (KPI) parameter value dimension, wherein the second dimension describing the certain fault is a second KPI parameter value dimension, and wherein the third dimension describing the certain fault is a microservices attributes dimension describing a provisioning attribute of a microservice of the certain fault.

8. The computer implemented method of claim 1, wherein the generating to provide the set of candidate remediations includes, for the certain fault of the detected multiple faults using clustering analysis to identify historical faults having a threshold satisfying level of similarity with the certain fault, wherein the prioritizing the candidate remediations is performed in dependence on a predicted cost associated to a certain remediation of the candidate remediations.

9. The computer implemented method of claim 1, wherein the detected multiple faults include a first fault detected within a first microservice of the computer environment, and wherein a second fault of the detected multiple faults was detected within a second microservice of the computer environment, wherein the remediation queue includes a first remediation having a higher prioritization order than a second remediation and wherein the first remediation is associated to a fault identified later in time than a fault associated to the second remediation, wherein the generating to provide the set of candidate remediations includes, for the certain fault of the detected multiple faults using clustering analysis to identify historical faults having a threshold satisfying level of similarity with the certain fault, wherein the prioritizing the candidate remediations is performed in dependence on a predicted cost associated to a certain remediation of the candidate remediations.
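
Claims 6 and 7 describe finding historical faults that lie within a threshold Euclidean distance of a new fault and reusing the remediations applied to them. The sketch below illustrates only that distance-and-threshold lookup, not a full clustering analysis and not the patented method. The `HISTORY` records, the three dimension values (two KPI values and one provisioning attribute), the `max_distance` threshold, and the function name `candidate_remediations` are assumptions made for illustration; a real system would likely normalize the dimensions before comparing them.

```python
import math

# Hypothetical historical faults:
# (first KPI value, second KPI value, provisioning attribute) -> applied remediation.
HISTORY = [
    ((0.90, 0.80, 4.0), "migrate VM to new node"),
    ((0.85, 0.75, 4.0), "increase memory and CPU allocation"),
    ((0.10, 0.20, 16.0), "restart container"),
]


def candidate_remediations(fault, history, max_distance):
    """Return remediations of historical faults within max_distance of fault, nearest first."""
    scored = []
    for point, remediation in history:
        distance = math.dist(fault, point)  # Euclidean distance (Python 3.8+)
        if distance <= max_distance:
            scored.append((distance, remediation))
    scored.sort()
    return [remediation for _, remediation in scored]


# Assumed description of a newly detected fault in the same three dimensions.
new_fault = (0.88, 0.79, 4.0)
candidates = candidate_remediations(new_fault, HISTORY, max_distance=0.5)
print(candidates)
```

With these sample values the two nearby historical faults contribute their remediations as candidates, while the distant third fault is excluded by the threshold.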

Assignees

IBM

Inventors

Classifications

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title

  • G06N5/022 · Primary

    Knowledge engineering; Knowledge acquisition · CPC title

  • Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40) · CPC title

  • Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title

  • in a virtual computing platform, e.g. logically partitioned systems · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12306707B2 cover?
Methods, computer program products, and systems are presented. The method computer program products, and systems can include, for instance: iteratively examining logging data; detecting multiple faults in a computer environment in dependence on the examining of the logging data; generating for respective ones of the detected multiple faults one or more candidate remediation to provide a set of …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N5/022. Mapped technology areas include Physics.
When was this patent published?
Publication date May 20, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).