Probability-distribution-based log-file analysis
US-11048608-B2 · Jun 29, 2021 · US
US11226858B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11226858-B1 |
| Application number | US-202017134060-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 24, 2020 |
| Priority date | Dec 24, 2020 |
| Publication date | Jan 18, 2022 |
| Grant date | Jan 18, 2022 |
Official abstract text for this publication.
A system stores logs representing events that occur in the system based on executable instructions executed by the system, for example, by processes executing within the system or by applications. The system analyzes the logs to determine the root cause of the error or event that resulted in generation of the log. The system clusters logs to determine clusters of logs. The system analyzes logs of each cluster to determine a root cause of errors resulting in logs belonging to the cluster. For any new error log that is received, the system determines the cluster to which the error log belongs and takes action based on the root cause associated with the cluster, for example, sending an alert message or performing automatic remediation.
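The last step of the abstract (assign a new error log to a cluster, then act on that cluster's root cause) can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes each log has already been turned into a numeric feature vector, that each cluster is represented by a single centroid, and that distance is Euclidean. The names `nearest_cluster`, `handle_new_log`, `centroids`, `root_causes`, and `actions` are hypothetical, as is the sample data.

```python
import math


def nearest_cluster(vector, centroids):
    """Return the id of the centroid closest to `vector` (Euclidean distance)."""
    return min(centroids, key=lambda cid: math.dist(vector, centroids[cid]))


def handle_new_log(vector, centroids, root_causes, actions):
    """Assign a new log to a cluster and look up the action for its root cause.

    Falls back to a plain alert when no remediation is registered
    (an assumption made for this sketch).
    """
    cid = nearest_cluster(vector, centroids)
    cause = root_causes[cid]
    action = actions.get(cause, "alert")
    return cid, cause, action


# Hypothetical data: two clusters with 2-dimensional centroids.
centroids = {"db": [0.9, 0.1], "net": [0.1, 0.8]}
root_causes = {"db": "connection pool exhausted", "net": "dns lookup timeout"}
actions = {"connection pool exhausted": "restart_pool"}

print(handle_new_log([0.8, 0.2], centroids, root_causes, actions))
```

A centroid comparison is only one way to measure closeness; comparing the new log against the feature vectors of individual logs in each cluster would fit the same description.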
Opening claim text (preview).
We claim:
1. A computer implemented method for analyzing error logs generated by a system, the method comprising: clustering a set of error logs generated by the system to generate a plurality of clusters of error logs; selecting a cluster of error logs from the plurality of clusters of error logs; for each error log in the selected cluster of error logs and for each term in the error log, determining a cluster characterization score representing a likelihood of the term occurring in the selected cluster of error logs but not in the remaining clusters of error logs; selecting one or more error logs from the cluster of error logs; for each selected error log: determining one or more windows of consecutive terms of the error log; and for each of the one or more windows, determining an aggregate cluster characterization score for terms within the window; selecting a window that maximizes the aggregate cluster characterization score; generating a summary for the cluster of error logs based on the terms of the selected window; and storing the generated summary in association with metadata describing the cluster of logs.
2. The computer implemented method of claim 1, further comprising: filtering error logs of the cluster of error logs by excluding words having high likelihood of occurrence in the error log but low likelihood of occurrence in the cluster.
3. The computer implemented method of claim 2, wherein filtering the error logs of the cluster of error logs comprises performing principal component analysis of the error logs to identify terms that occur in the error log but have low likelihood of occurrence in the cluster.
4. The computer implemented method of claim 1, wherein determining one or more windows of consecutive words of the error log comprises: sliding a window across the error log, wherein sliding the window from a first window of consecutive words of the error log results in a second window of consecutive words of the error log that are overlapping the first window but exclude one or more words of the first window and include one or more words from outside the first window.
5. The computer implemented method of claim 1, wherein clustering the set of error logs comprises: for each error log, generating a feature vector representing features of the error log; clustering the feature vectors to determine the plurality of clusters.
6. The computer implemented method of claim 5, wherein the feature vector representing features of the error log is extracted from a hidden layer of a neural network, wherein the neural network receives a representation of the error log as input.
7. The computer implemented method of claim 1, further comprising: receiving a new error log; determining a cluster of error logs to which the new error log belongs; and performing an action based on the generated summary for the determined cluster of error logs.
8. The computer implemented method of claim 7, wherein determining a cluster of error logs to which the new error log belongs comprises identifying the cluster of error log that is closest to the new error log based on a distance between a feature vector representation of the new error logs and feature vector representations of error logs of clusters of error logs from the plurality of clusters of error logs.
9. The computer implemented method of claim 7, further comprising: determining a class of errors based on the generated summary of the cluster of error logs to which the new error log belongs, wherein each class of errors is associated with a set of users; identifying a user associated with the class of errors; and sending an alert to the user, the alert comprising the generated summary of the cluster of error logs to which the new error log belongs.
10. The computer implemented method of claim 7, further comprising: determining a class of errors based on the generated summary of the cluster of error logs to which the new error log belongs; identifying an automatic remediation action associated with the class of errors; and sending instructions to perform the automatic remediation action.
11. The computer implemented method of claim 7, further comprising: determining a class of errors based on the generated summary of the cluster of error logs to which the new error log belongs; identifying a user associated with the class of errors; and sending an alert message to the identified user.
12. A non-transitory computer readable storage medium for storing instructions that when executed by a computer processor cause the computer processor to perform steps for performing predictions, the steps comprising: clustering a set of error logs generated by a system to generate a plurality of clusters of error logs; selecting a cluster of error logs from the plurality of clusters of error logs; for each error log in the selected cluster of error logs and for each term in the error log, determining a cluster characterization score representing a likelihood of the term occurring in the selected cluster of error logs but not in the remaining clusters of error logs; selecting one or more error logs from the cluster of error logs; for each selected error log: determining one or more windows of consecutive terms of the error log; and for each of the one or more windows, determining an aggregate cluster characterization score for terms within the window; selecting a window that maximizes the aggregate cluster characterization score; generating a summary for the cluster of error logs based on the terms of the selected window; and storing the generated summary in association with metadata describing the cluster of logs.
13. The non-transitory computer readable storage medium of claim 12, wherein the instructions further cause the computer processor to perform steps comprising: filtering error logs of the cluster of error logs by excluding words having high likelihood of occurrence in the error log but low likelihood of occurrence in the cluster.
14. The non-transitory computer readable storage medium of claim 12, wherein the instructions for determining one or more windows of consecutive words of the error log further cause the computer processor to perform steps comprising: sliding a window across the error log, wherein sliding the window from a first window of consecutive words of the error log results in a second window of consecutive words of the error log that are overlapping the first window but exclude one or more words of the first window and include one or more words from outside the first window.
15. The non-transitory computer readable storage medium of claim 12, wherein the instructions for clustering the set of error logs further cause the computer processor to perform steps comprising: for each error log, generating a feature vector representing features of the error log, wherein the feature vector representing features of the error log is extracted from a hidden layer of a neural network, wherein the neural network receives a representation of the error log as input; clustering the feature vectors to determine the plurality of clusters.
16. The non-transitory computer readable storage medium of claim 12, wherein the instructions further cause the computer processor to perform steps comprising: receiving a new error log; determining a cluster of error logs to which the new error log belongs; and performing an action based on the generated summary for the determined cluster of error logs.
17. The non-transitory computer readable storage medium of claim 16 , wherei
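The scoring and window-selection steps of claim 1 can be illustrated with a small sketch. The claim text shown here does not give a formula for the cluster characterization score, so this example assumes one: the fraction of logs in the selected cluster that contain the term, multiplied by one minus the fraction of logs in the remaining clusters that contain it. The fixed window size, the function names (`doc_freq`, `characterization_scores`, `summarize`), and the sample logs are likewise hypothetical.

```python
def doc_freq(term, logs):
    """Fraction of logs (each a list of tokens) that contain `term`."""
    if not logs:
        return 0.0
    return sum(term in log for log in logs) / len(logs)


def characterization_scores(cluster, others):
    """Assumed score: high when a term is common in `cluster` and rare in `others`."""
    vocab = {term for log in cluster for term in log}
    return {t: doc_freq(t, cluster) * (1.0 - doc_freq(t, others)) for t in vocab}


def summarize(cluster, others, size):
    """Slide a window of `size` consecutive terms over every log in the cluster
    and return the window with the highest summed score, joined as text."""
    scores = characterization_scores(cluster, others)

    def window_score(window):
        return sum(scores.get(t, 0.0) for t in window)

    candidates = [
        log[i:i + size]
        for log in cluster
        # Logs shorter than `size` contribute one window: the whole log.
        for i in range(max(1, len(log) - size + 1))
    ]
    return " ".join(max(candidates, key=window_score))


# Hypothetical tokenized logs: one selected cluster and the remaining logs.
cluster = [
    "ERROR db connection pool exhausted retry".split(),
    "ERROR request failed db connection pool exhausted".split(),
]
others = [
    "ERROR dns lookup timeout retry".split(),
    "ERROR request failed dns lookup timeout".split(),
]

print(summarize(cluster, others, size=4))
```

In this sample, `ERROR` appears in every log of every cluster and therefore scores zero, so the chosen window is pulled toward the terms that distinguish the cluster.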
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
Dumping, i.e. gathering error/state information after a fault for later diagnosis · CPC title
Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title