Systems and methods for diagnosing problems from error logs using natural language processing
US-10776577-B2 · Sep 15, 2020 · US
| Field | Value |
|---|---|
| Publication number | US-11568134-B2 |
| Application number | US-202016989736-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 10, 2020 |
| Priority date | Jun 30, 2017 |
| Publication date | Jan 31, 2023 |
| Grant date | Jan 31, 2023 |
Official abstract text for this publication.
Disclosed is a solution for diagnosing problems from logs used in an application development environment. A random sample of log statements is collected. The log statements can be completely unstructured and/or do not conform to any natural language. The log statements are tagged with predefined classifications. A natural language processing (NLP) classifier model is trained utilizing the log statements tagged with the predefined classification. New log statements can be classified into the plurality of predefined classifications utilizing the trained NLP classifier model. From the log statements thus classified, statements having a problem classification can be identified and presented through a dashboard running in a browser. Outputs from the trained NLP classifier model can be provided as input to another trained model for automatically and quickly identifying a type of problem associated with the statements, eliminating a need to manually sift through tens or hundreds of thousands of lines of logs.
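The workflow in the abstract (sample, tag, train, classify, surface the problem statements) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name a classifier algorithm, so a small multinomial naive Bayes model stands in for the "NLP classifier model", and the log lines, the `HUMAN_TAGS` mapping, and all function and class names are hypothetical.

```python
import math
import random
import re
from collections import Counter, defaultdict

def tokenize(statement):
    # Word-like tokens only; log statements need not follow any natural language.
    return re.findall(r"[a-z_]+", statement.lower())

def sample_statements(statements, k, seed=0):
    # Random sample of log statements to be tagged (seeded for repeatability).
    return random.Random(seed).sample(statements, k)

class NaiveBayesLogClassifier:
    """Assumed stand-in for the trained NLP classifier model."""

    def fit(self, tagged):
        self.label_counts = Counter(label for _, label in tagged)
        self.token_counts = defaultdict(Counter)
        for statement, label in tagged:
            self.token_counts[label].update(tokenize(statement))
        self.vocab = {t for c in self.token_counts.values() for t in c}
        return self

    def predict(self, statement):
        total = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n in self.label_counts.items():
            counts = self.token_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for tok in tokenize(statement):
                # Add-one smoothing so unseen tokens do not zero out a class.
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical stand-in for manual tagging with predefined classifications.
HUMAN_TAGS = {
    "Connection refused to database host": "problem",
    "NullPointerException at line 42": "problem",
    "Timeout while waiting for response": "problem",
    "Failed to load configuration file": "problem",
    "Server started on port 8080": "info",
    "User login successful": "info",
    "Scheduled job completed": "info",
    "Cache warmed up successfully": "info",
}
RAW_LOGS = list(HUMAN_TAGS)

# The pool is tiny, so the sample keeps every statement; a real pool would be
# far larger than the sample drawn from it.
sample = sample_statements(RAW_LOGS, k=len(RAW_LOGS))
tagged = [(s, HUMAN_TAGS[s]) for s in sample]
model = NaiveBayesLogClassifier().fit(tagged)

new_logs = ["Connection timeout to database", "Job completed successfully"]
problems = [s for s in new_logs if model.predict(s) == "problem"]
print(problems)
```

In this sketch only the statements in `problems` would be passed on to a dashboard; the presentation layer itself is out of scope here.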
Opening claim text (preview).
What is claimed is:

1. A system for diagnosing problems from logs, the system comprising: a processor; a non-transitory computer-readable medium comprising instructions for: receiving new log statements; classifying the new log statements into at least one of a plurality of classifications based on an NLP feature associated with the log statements utilizing a NLP classifier model, wherein the plurality of predefined classifications include a problem classification and wherein the NLP classifier model was trained by: utilizing log statements tagged with the at least one of the plurality of classifications, wherein the log statements comprise training data created from logs used in an enterprise computing environment including statements from a plurality of sources in the enterprise computing environment, and the training data was created by sampling the statements in the logs and tagging the log statements in the sample with the at least one of the plurality of classifications; identifying classified new log statements as log statements having the problem classification; and presenting the identified classified new log statements having the problem classification through a user interface.

2. The system of claim 1, wherein the statements contained in the logs from the plurality of sources comprise unstructured data.

3. The system of claim 1, wherein the sampling of statements is random.

4. The system of claim 1, wherein the instructions are further for performing the tagging of the log statements of the sample.

5. The system of claim 1, wherein the plurality of classifications are domain specific.

6. The system of claim 1, wherein the NLP classifier model is one of a plurality of trained models.

7. The system of claim 6, wherein outputs from the trained NLP classifier model are provided as input to another trained model of the plurality of trained models for identifying a type of problem associated with the statements having the problem classification.

8. A method for diagnosing problems from logs, the method comprising: receiving new log statements; classifying the new log statements into at least one of a plurality of classifications based on an NLP feature associated with the log statements utilizing a NLP classifier model, wherein the plurality of predefined classifications include a problem classification and wherein the NLP classifier model was trained by: utilizing log statements tagged with the at least one of the plurality of classifications, wherein the log statements comprise training data created from logs used in an enterprise computing environment including statements from a plurality of sources in the enterprise computing environment, and the training data was created by sampling the statements in the logs and tagging the log statements in the sample with the at least one of the plurality of classifications; identifying classified new log statements as log statements having the problem classification; and presenting the identified classified new log statements having the problem classification through a user interface.

9. The method of claim 8, wherein the statements contained in the logs from the plurality of sources comprise unstructured data.

10. The method of claim 8, wherein the sampling of statements is random.

11. The method of claim 8, further comprising performing the tagging of the log statements of the sample.

12. The method of claim 8, wherein the plurality of classifications are domain specific.

13. The method of claim 8, wherein the NLP classifier model is one of a plurality of trained models.

14. The method of claim 13, wherein outputs from the trained NLP classifier model are provided as input to another trained model of the plurality of trained models for identifying a type of problem associated with the statements having the problem classification.

15. A non-transitory computer readable medium, comprising instructions for: receiving new log statements; classifying the new log statements into at least one of a plurality of classifications based on an NLP feature associated with the log statements utilizing a NLP classifier model, wherein the plurality of predefined classifications include a problem classification and wherein the NLP classifier model was trained by: utilizing log statements tagged with the at least one of the plurality of classifications, wherein the log statements comprise training data created from logs used in an enterprise computing environment including statements from a plurality of sources in the enterprise computing environment, and the training data was created by sampling the statements in the logs and tagging the log statements in the sample with the at least one of the plurality of classifications; identifying classified new log statements as log statements having the problem classification; and presenting the identified classified new log statements having the problem classification through a user interface.

16. The non-transitory computer readable medium of claim 15, wherein the statements contained in the logs from the plurality of sources comprise unstructured data.

17. The non-transitory computer readable medium of claim 15, wherein the sampling of statements is random.

18. The non-transitory computer readable medium of claim 15, wherein the instructions are further for performing the tagging of the log statements of the sample.

19. The non-transitory computer readable medium of claim 15, wherein the plurality of classifications are domain specific.

20. The non-transitory computer readable medium of claim 15, wherein the NLP classifier model is one of a plurality of trained models.

21. The non-transitory computer readable medium of claim 20, wherein outputs from the trained NLP classifier model are provided as input to another trained model of the plurality of trained models for identifying a type of problem associated with the statements having the problem classification.
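The chaining recited in claims 7, 14, and 21, where the first model's output feeds a second trained model that identifies the type of problem, could look roughly like the sketch below. This is one possible reading and not the patented implementation: the token-voting classifier, the problem types ("connectivity", "configuration"), the training statements, and every name in the code are hypothetical.

```python
import re
from collections import Counter, defaultdict

def tokenize(statement):
    return re.findall(r"[a-z_]+", statement.lower())

class TokenVoteClassifier:
    """Hypothetical stand-in model: each token votes for the labels it was
    seen with during training."""

    def fit(self, tagged):
        self.votes = defaultdict(Counter)
        for statement, label in tagged:
            for tok in set(tokenize(statement)):
                self.votes[tok][label] += 1
        # Fallback label when no token of a statement was seen in training.
        self.default = Counter(l for _, l in tagged).most_common(1)[0][0]
        return self

    def predict(self, statement):
        tally = Counter()
        for tok in tokenize(statement):
            tally.update(self.votes.get(tok, {}))
        return tally.most_common(1)[0][0] if tally else self.default

def diagnose(statements, problem_model, type_model):
    # Stage 1: keep only statements given the problem classification.
    flagged = [s for s in statements if problem_model.predict(s) == "problem"]
    # Stage 2: the flagged statements become the second model's input.
    return [(s, type_model.predict(s)) for s in flagged]

# Assumed training data for the first model (problem vs. info).
problem_model = TokenVoteClassifier().fit([
    ("Connection refused to database host", "problem"),
    ("Timeout while waiting for response", "problem"),
    ("Failed to load configuration file", "problem"),
    ("Server started on port 8080", "info"),
    ("User login successful", "info"),
    ("Scheduled job completed", "info"),
])

# Assumed training data for the second model (type of problem).
type_model = TokenVoteClassifier().fit([
    ("Connection refused to database host", "connectivity"),
    ("Timeout while waiting for response", "connectivity"),
    ("Failed to load configuration file", "configuration"),
    ("Missing required configuration property", "configuration"),
])

NEW_LOGS = [
    "Database connection timeout",
    "Configuration file missing",
    "User login successful",
]
results = diagnose(NEW_LOGS, problem_model, type_model)
for statement, problem_type in results:
    print(f"{problem_type}: {statement}")
```

Keeping the two models separate means the second one only ever sees statements already flagged as problems, so its label set can be specific to a domain without affecting the first stage.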
Semantic analysis · CPC title
Software maintenance or management · CPC title
Parsing · CPC title
Natural language query formulation · CPC title
Machine learning · CPC title