Systems and methods for troubleshooting errors within computing tasks using models of log files

US9552249B1 · US · B1

Patent metadata
Publication number: US-9552249-B1
Application number: US-201414519107-A
Country: US
Kind code: B1
Filing date: Oct 20, 2014
Priority date: Oct 20, 2014
Publication date: Jan 24, 2017
Grant date: Jan 24, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed computer-implemented method for troubleshooting computing tasks using log files may include (1) identifying multiple log files generated during successful executions of a computing task, (2) identifying an anomalous log file generated during an anomalous execution of the computing task, (3) creating a model of a successful log file for the computing task by (a) identifying invariants that represent matching sequences found in the same location within at least two successful log files and (b) storing each invariant in a node within the model, and (4) traversing, sequentially through the anomalous log file, matching sequences within the anomalous log file with nodes within the model until identifying at least one discrepancy between the anomalous log file and the model. Various other methods, systems, and computer-readable media are also disclosed.
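The steps in the abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the patented implementation: each log is taken to be a list of lines, "same location" is taken to mean the same line index, and an invariant is any line that appears at that index in at least two successful logs. The names `build_model` and `first_discrepancy` and the discrepancy labels are hypothetical.

```python
from collections import Counter
from typing import List, Optional, Set, Tuple


def build_model(successful_logs: List[List[str]]) -> List[Set[str]]:
    """Build one node per line position.

    Each node stores the invariants for that position: every line that
    occurs there in at least two successful logs. An empty node means no
    invariant was found (the position holds variable content).
    """
    length = max(len(log) for log in successful_logs)
    model: List[Set[str]] = []
    for pos in range(length):
        counts = Counter(log[pos] for log in successful_logs if pos < len(log))
        model.append({line for line, n in counts.items() if n >= 2})
    return model


def first_discrepancy(
    model: List[Set[str]], anomalous: List[str]
) -> Optional[Tuple[int, str]]:
    """Walk the anomalous log in order and stop at the first discrepancy."""
    for pos, line in enumerate(anomalous):
        if pos >= len(model):
            return (pos, "extra")  # log continues past the end of the model
        node = model[pos]
        if node and line not in node:  # empty node accepts any line
            return (pos, "mismatch")
    if len(anomalous) < len(model):
        return (len(anomalous), "truncated")  # log ends before the model does
    return None


good_logs = [
    ["start", "load cfg", "pid 11", "done"],
    ["start", "load cfg", "pid 42", "done"],
    ["start", "load cfg", "pid 7", "done"],
]
model = build_model(good_logs)
print(first_discrepancy(model, ["start", "load cfg", "pid 99", "error"]))
```

In this sketch the third position has no invariant because the process ID differs in every successful log, so the traversal accepts any line there and reports the first real divergence at the final line.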

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for troubleshooting errors within computing tasks using models of log files, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising: identifying a plurality of log files generated during successful executions of a computing task; identifying an anomalous log file generated during an anomalous execution of the computing task; creating a model of a successful log file for the computing task by: identifying invariants that represent matching sequences found in a same location within at least two successful log files; storing each invariant in a node within the model; traversing, sequentially through the anomalous log file, matching sequences within the anomalous log file with nodes within the model until identifying at least one discrepancy between the anomalous log file and the model.

2. The method of claim 1, wherein identifying the anomalous log file comprises receiving the anomalous log file from a user that is troubleshooting the computing task.

3. The method of claim 1, wherein creating the model comprises removing information irrelevant to the performance of the computing task from the successful log files before identifying the invariants within the successful log files.

4. The method of claim 1, wherein identifying the invariants comprises: identifying a plurality of successful log files that each comprise a plurality of functions, each function comprising at least one character; aligning each function within the plurality of successful log files; traversing over each character within the aligned functions to identify matching sequences of characters.

5. The method of claim 4, wherein aligning each function within the plurality of successful log files comprises: representing each function within the plurality of successful log files with a separate symbol such that each successful log file is represented by a string of symbols; aligning the strings of symbols using a sequence alignment algorithm; aligning each function within the plurality of successful log files based on the aligned strings.

6. The method of claim 4, wherein creating the model of the successful log file comprises: identifying, from within the plurality of successful log files, two successful log files that are most similar; creating an initial model based on the two most similar successful log files; refining the initial model based on successful log files other than the two most similar successful log files.

7. The method of claim 1, wherein: the model of the successful log file comprises a trie with multiple branches; matching sequences within the anomalous log file with nodes within the model comprises identifying branches within the trie that correspond to sequences within the anomalous log file.

8. The method of claim 1, wherein identifying the discrepancy between the anomalous log file and the model comprises at least one of: determining that a sequence in a particular location within the anomalous log file differs from a sequence in the same location within the model; determining that the anomalous log file contains an additional sequence not included in the model; determining that the anomalous log file terminates before the model terminates.

9. The method of claim 1, further comprising replacing known variants within the successful log files with predetermined strings prior to identifying the invariants.

10. The method of claim 1, further comprising troubleshooting the anomalous log file based on at least one of: error messages within the anomalous log file; an expected execution time of the computing task; edit-distances between sequences within the model and sequences within the anomalous log file.

11. A system for troubleshooting errors within computing tasks using models of log files, the system comprising: an identification module, stored in memory, that identifies: a plurality of log files generated during successful executions of a computing task; an anomalous log file generated during an anomalous execution of the computing task; a creation module, stored in memory, that creates a model of a successful log file for the computing task by: identifying invariants that represent matching sequences found in a same location within at least two successful log files; storing each invariant in a node within the model; a traversing module, stored in memory, that traverses, sequentially through the anomalous log file, matching sequences within the anomalous log file with nodes within the model until identifying at least one discrepancy between the anomalous log file and the model; at least one processor that executes the identification module, the creation module, and the traversing module.

12. The system of claim 11, wherein the creation module creates the model by removing information irrelevant to the performance of the computing task from the successful log files before identifying the invariants within the successful log files.

13. The system of claim 11, wherein the identification module identifies the invariants by: identifying a plurality of successful log files that each comprise a plurality of functions, each function comprising at least one character; aligning each function within the plurality of successful log files; traversing over each character within the aligned functions to identify matching sequences of characters.

14. The system of claim 13, wherein the identification module aligns each function within the plurality of successful log files by: representing each function within the plurality of successful log files with a separate symbol such that each successful log file is represented by a string of symbols; aligning the strings of symbols using a sequence alignment algorithm; aligning each function within the plurality of successful log files based on the aligned strings.

15. The system of claim 13, wherein the creation module creates the model of the successful log file by: identifying, from within the plurality of successful log files, two successful log files that are most similar; creating an initial model based on the two most similar successful log files; refining the initial model based on successful log files other than the two most similar successful log files.

16. The system of claim 11, wherein: the model of the successful log file comprises a trie with multiple branches; the traversing module matches sequences within the anomalous log file with nodes within the model by identifying branches within the trie that correspond to sequences within the anomalous log file.

17. The system of claim 11, wherein the identification module identifies the discrepancy between the anomalous log file and the model by at least one of: determining that a sequence in a particular location within the anomalous log file differs from a sequence in the same location within the model; determining that the anomalous log file contains an additional sequence not included in the model; determining that the anomalous log file terminates before the model terminates.

18. The system of claim 11, further comprising a replacement module that replaces known variants within the successful log files with predetermined strings prior to identifying the invariants.

19. The system of claim 11, further comprising a troubleshooting module that troubleshoots the anomalous log file based on at least one of: error messages within the anomalous log file;
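Claims 4 and 5 (mirrored in claims 13 and 14) describe aligning functions by first reducing each log to a string of symbols, aligning those strings, and then scanning the aligned functions character by character. The sketch below illustrates that idea under several assumptions: each log line is assumed to look like `function_name: details`, and Python's `difflib.SequenceMatcher` stands in for the "sequence alignment algorithm", which the claims do not name. The helper names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def function_name(line: str) -> str:
    # Assumption: a log line has the form "function_name: details".
    return line.split(":", 1)[0]


def align_functions(log_a: List[str], log_b: List[str]) -> List[Tuple[str, str]]:
    """Pair up lines of two logs whose function symbols align."""
    table: Dict[str, int] = {}
    # Represent each function with its own symbol (here, a small integer),
    # so that each log becomes a sequence of symbols.
    sym_a = [table.setdefault(function_name(l), len(table)) for l in log_a]
    sym_b = [table.setdefault(function_name(l), len(table)) for l in log_b]
    # Stand-in alignment step: matching blocks of the two symbol sequences.
    matcher = SequenceMatcher(a=sym_a, b=sym_b, autojunk=False)
    pairs: List[Tuple[str, str]] = []
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            pairs.append((log_a[block.a + k], log_b[block.b + k]))
    return pairs


def char_invariants(line_a: str, line_b: str) -> List[str]:
    """Return the character runs shared by two aligned lines."""
    matcher = SequenceMatcher(a=line_a, b=line_b, autojunk=False)
    return [
        line_a[b.a : b.a + b.size]
        for b in matcher.get_matching_blocks()
        if b.size  # skip the zero-length sentinel block
    ]


log_a = ["open: file=/tmp/a1", "read: bytes=100", "close: ok"]
log_b = ["open: file=/tmp/b2", "retry: attempt=1", "read: bytes=250", "close: ok"]

pairs = align_functions(log_a, log_b)
for left, right in pairs:
    print(function_name(left), char_invariants(left, right))
```

Aligning on symbols first lets the extra `retry` entry in the second log be skipped rather than shifting every later line out of position, which is the problem a purely index-based comparison would have.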

Assignees

Symantec Corp, Veritas Tech

Inventors

Classifications

  • Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title

  • Error or fault reporting or storing · CPC title

  • Readable error formats, e.g. cross-platform generic formats, human understandable formats · CPC title

  • Performance evaluation by modeling · CPC title

  • G06F11/079 · Primary

    Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9552249B1 cover?
The disclosed computer-implemented method for troubleshooting computing tasks using log files may include (1) identifying multiple log files generated during successful executions of a computing task, (2) identifying an anomalous log file generated during an anomalous execution of the computing task, (3) creating a model of a successful log file for the computing task by (a) identifying invaria…
Who is the assignee on this patent?
Symantec Corp, Veritas Tech
What technology area does this patent fall under?
Primary CPC classification G06F11/3476. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 24, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).