What technology area does this patent fall under?

Primary CPC classification G06F11/0712. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 7, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Integrated statistical log data mining for mean time auto-resolution

US10528407B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10528407-B2
Application number	US-201715701468-A
Country	US
Kind code	B2
Filing date	Sep 12, 2017
Priority date	Jul 20, 2017
Publication date	Jan 7, 2020
Grant date	Jan 7, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method may include generating, by a diagnosis manager, a plurality of pre-processed files based on a plurality of log files containing operational information related to one or more of the plurality of modules operating in the cloud environment. The method may include generating a set of weightage matrices based on a plurality of tokens extracted from the plurality of pre-processed files, and identifying a plurality of clusters based on the set of weightage matrices. The method may further include determining, by a resolution manager coupled with the diagnosis manager, an operational issue for a specific module selected from the plurality of modules and associated with a specific cluster selected from the plurality of clusters, based on the subset of tokens associated with the specific cluster; and performing a predefined action on the specific module based on the operational issue.
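The pre-processing and weightage steps summarized in the abstract can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the patented implementation: the stop-word list, the crude `stem` helper, the sample log lines, and the specific TF-IDF formula (term frequency times `log(N / document frequency)`) are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the patent does not enumerate one.
STOP_WORDS = {"the", "a", "an", "to", "of", "on", "in", "is", "for", "at", "and"}


def stem(word):
    """Crude suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(log_text):
    """Lowercase, keep alphabetic words, drop stop-words, then stem."""
    words = re.findall(r"[a-z]+", log_text.lower())
    return [stem(w) for w in words if w not in STOP_WORDS]


def weightage_matrix(docs):
    """Return (vocab, rows); rows[i][j] is the weight of vocab[j] in docs[i]."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of pre-processed files containing each token.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    rows = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty files
        rows.append([
            (counts[tok] / total) * math.log(n_docs / doc_freq[tok])
            for tok in vocab
        ])
    return vocab, rows


# Hypothetical log lines, one per log file.
logs = [
    "Connection to database timed out",
    "Database connection timed out on retry",
    "Disk usage is at 95 percent",
]
docs = [preprocess(text) for text in logs]
vocab, matrix = weightage_matrix(docs)
```

Tokens that appear in only one file (such as `disk` here) receive a higher weight than tokens shared across several files, which is what lets later clustering separate files by their distinctive vocabulary.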

First claim

Opening claim text (preview).

What is claimed is:

1. A method for automatically diagnosing and resolving operational issues in a cloud environment, the method comprising: collecting, by a diagnosis manager, a plurality of log files generated by a plurality of modules operating in the cloud environment, wherein each of the plurality of log files contains operational information related to one or more of the plurality of modules; generating, by the diagnosis manager, a set of weightage matrices based on a plurality of tokens extracted from the plurality of log files; generating, by the diagnosis manager, a plurality of nodes corresponding to the plurality of modules, wherein each of the plurality of nodes is associated with one or more tokens selected from the plurality of tokens; identifying, by the diagnosis manager, a plurality of clusters from the plurality of nodes based on the set of weightage matrices, wherein each of the plurality of clusters includes a subset of nodes selected from the plurality of nodes and is associated with a representative keyword including one or more tokens that represent contents of the subset of nodes; and determining, by a resolution manager coupled with the diagnosis manager, an operational issue for a specific module selected from the plurality of modules and associated with a specific cluster selected from the plurality of clusters, based on the corresponding representative keyword associated with the specific cluster.

2. The method as recited in the claim 1, wherein the method further comprises: performing, by the resolution manager, a predefined action on the specific module based on the operational issue.

3. The method as recited in the claim 1, wherein the generating of the set of weightage matrices comprises: for a log file selected from the plurality of log files, identifying a plurality of words in the log file; extracting one or more tokens from the plurality of words after removing stop-words from and performing stemming on the plurality of words; and including the one or more tokens in the plurality of tokens.

4. The method as recited in the claim 1, wherein the generating of the set of weightage matrices comprises: generating a corresponding token-frequency for each of the plurality of tokens; generating a corresponding inverse-document-frequency for each of the plurality of unique tokens; and generating a corresponding token-weightage for each of the plurality of tokens based on the corresponding token-frequency and the corresponding inverse-document-frequency.

5. The method as recited in the claim 4, wherein the generating of the set of weightage matrices further comprises: selecting a subset of tokens from the plurality of tokens based on their corresponding token-weightages; constructing the set of weightage matrices based on the subset of tokens, the corresponding frequency scores associated with the subset of tokens, and the plurality of log files that contain the subset of tokens.

6. The method as recited in the claim 1, wherein the generating of the plurality of nodes comprises: generating a specific node for the plurality of nodes based on the one or more tokens selected from the plurality of tokens and corresponding to one of the plurality of modules.

7. The method as recited in the claim 1, wherein the generating of the plurality of nodes comprises: when a first token associated with a first node selected from the plurality of nodes has a similarity-distance that is closer to a second node selected from the plurality of nodes, associating the first token from the first node to the second node.

8. The method as recited in the claim 1, wherein the identifying of the plurality of clusters from the plurality of nodes comprises: selecting an initial number of nodes from the plurality of nodes as a first set of cluster centroids associated with the plurality of clusters; for a first node selected from the plurality of nodes that are not in the first set of cluster centroids, categorizing the first node into one of the plurality of clusters by evaluating corresponding similarity-distances between the first node and the first set of cluster centroids.

9. The method as recited in the claim 8, further comprising: after the categorizing of the first node into one of the plurality of clusters, calculating a second set of cluster centroids associated with the plurality of clusters; and for a second node selected from the plurality of nodes that are not in the second set of cluster centroids, categorizing the second node into one of the plurality of clusters by evaluating corresponding similarity-distances between the second node and the second set of cluster centroids.

10. A non-transitory computer-readable storage medium, containing a set of instructions which, when executed by a processor, cause the processor to perform a method for automatically diagnosing and resolving operational issues in a cloud environment, the method comprising: generating, by a diagnosis manager, a plurality of pre-processed files based on a plurality of log files, wherein each of the plurality of log files contains operational information related to one or more of the plurality of modules operating in the cloud environment; generating, by the diagnosis manager, a set of weightage matrices based on a plurality of tokens extracted from the plurality of pre-processed files; identifying, by the diagnosis manager, a plurality of clusters by generating a plurality of nodes corresponding to the plurality of modules based on the set of weightage matrices, and identifying the plurality of clusters from the plurality of nodes based on the set of weightage matrices, wherein each of the plurality of clusters includes a subset of tokens selected from the plurality of tokens; determining, by a resolution manager coupled with the diagnosis manager, an operational issue for a specific module selected from the plurality of modules and associated with a specific cluster selected from the plurality of clusters, based on the subset of tokens associated with the specific cluster; and performing, by the resolution manager, a predefined action on the specific module based on the operational issue.

11. The non-transitory computer-readable storage medium of the claim 10, wherein the generating of the plurality of pre-processed files based on a plurality of log files comprises: identifying a plurality of words from a log file selected from the plurality of log files; extracting one or more tokens from the plurality of words after removing stop-words from and performing stemming on the plurality of words; and storing the one or more tokens in one of the plurality of pre-processed files associated with the log file.

12. The non-transitory computer-readable storage medium of the claim 10, wherein the generating of the set of weightage matrices based on a plurality of tokens comprises: generating a corresponding token-frequency for each of the plurality of tokens; generating a corresponding inverse-document-frequency for each of the plurality of unique tokens; and generating a corresponding token-weightage for each of the plurality of tokens based on the corresponding token-frequency and the corresponding inverse-document-frequency.

13. The non-transitory computer-readable storage medium of the claim 12, wherein the generating of the set of weightage matrices based on a plurality of tokens further comprises: constructing the set of weightage matrices based on the plurality of tokens, the corresponding frequency scores associated with the plurality of tokens, and the plurality of log files that contain the plurality of tokens.
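The centroid-based clustering recited in claims 8 and 9, followed by the keyword-to-action step of claims 1 and 2, might look something like the sketch below. The claims do not name a clustering algorithm or a distance measure, so the k-means-style loop and the cosine distance are assumptions; the vocabulary, node vectors, and the `PREDEFINED_ACTIONS` table are hypothetical values made up for illustration. For simplicity the seed nodes are re-assigned along with every other node, which trivially places each seed in its own cluster on the first pass.

```python
import math


def cosine_distance(a, b):
    """Similarity-distance between two weight vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def cluster_nodes(nodes, k, rounds=10):
    """Assign each node vector to one of k clusters; returns (labels, centroids)."""
    # First set of centroids: the first k nodes (assumed selection rule).
    centroids = [list(v) for v in nodes[:k]]
    labels = None
    for _ in range(rounds):
        # Categorize every node by its nearest current centroid.
        new_labels = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in nodes
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Next set of centroids: the mean vector of each cluster's members.
        for c in range(k):
            members = [v for v, lab in zip(nodes, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def representative_keyword(centroid, vocab):
    """Pick the token with the largest weight in the centroid."""
    return vocab[max(range(len(vocab)), key=lambda j: centroid[j])]


# Hypothetical vocabulary and per-module node vectors (token weightages).
vocab = ["disk", "full", "timeout", "retry"]
nodes = [
    [0.9, 0.5, 0.0, 0.0],  # module "storage-a"
    [0.0, 0.0, 0.8, 0.4],  # module "api-a"
    [0.8, 0.6, 0.0, 0.1],  # module "storage-b"
    [0.1, 0.0, 0.9, 0.3],  # module "api-b"
]
# Hypothetical lookup from representative keyword to predefined action.
PREDEFINED_ACTIONS = {"disk": "purge_old_logs", "timeout": "restart_service"}

labels, centroids = cluster_nodes(nodes, k=2)
keywords = [representative_keyword(c, vocab) for c in centroids]
actions = [PREDEFINED_ACTIONS.get(keywords[lab], "escalate") for lab in labels]
```

A lookup table is only one way to tie a representative keyword to an action; the claims leave open how the resolution manager maps an operational issue to the predefined action.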

Assignees

VMware Inc

Inventors

Classifications

G06F11/0793
Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40) · CPC title
G06F11/079
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
H04L41/069
using logs of notifications; Post-processing of notifications · CPC title
G06F11/0778
Dumping, i.e. gathering error/state information after a fault for later diagnosis · CPC title
G06F11/0712 (Primary)
in a virtual computing platform, e.g. logically partitioned systems · CPC title

Patent family

Related publications grouped by family.

View patent family 65018683

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10528407B2 cover?: A method may include generating, by a diagnosis manager, a plurality of pre-processed files based on a plurality of log files containing operational information related to one or more of the plurality of modules operating in the cloud environment. The method may include generating a set of weightage matrices based on a plurality of tokens extracted from the plurality of pre-processed files, and…
Who is the assignee on this patent?: VMware Inc

Related patents

Fault management service in a cloud

Clustering events based on extraction rules

Log Mining with Big Data

Neural network based cluster visualization

Compound splitting

System and method for efficiently determining k in data clustering
