Systems and methods for diagnosing problems from error logs using natural language processing

US11568134B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11568134-B2
Application number  US-202016989736-A
Country             US
Kind code           B2
Filing date         Aug 10, 2020
Priority date       Jun 30, 2017
Publication date    Jan 31, 2023
Grant date          Jan 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is a solution for diagnosing problems from logs used in an application development environment. A random sample of log statements is collected. The log statements can be completely unstructured and/or do not conform to any natural language. The log statements are tagged with predefined classifications. A natural language processing (NLP) classifier model is trained utilizing the log statements tagged with the predefined classification. New log statements can be classified into the plurality of predefined classifications utilizing the trained NLP classifier model. From the log statements thus classified, statements having a problem classification can be identified and presented through a dashboard running in a browser. Outputs from the trained NLP classifier model can be provided as input to another trained model for automatically and quickly identifying a type of problem associated with the statements, eliminating a need to manually sift through tens or hundreds of thousands of lines of logs.
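The workflow in the abstract (sample, tag, train, classify, surface the problem statements) can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the abstract does not name a classifier algorithm, so a small bag-of-words multinomial Naive Bayes is swapped in as a stand-in, and the log statements, the tags "problem" and "info", and every function and class name below are hypothetical.

```python
import math
import random
import re
from collections import Counter, defaultdict

# Hypothetical log statements. The mapped tag simulates the manual tagging step.
HUMAN_TAGS = {
    "ERROR NullPointerException at com.example.Build.run": "problem",
    "FATAL build failed: cannot resolve dependency": "problem",
    "ERROR connection refused to database host": "problem",
    "Exception in thread main: timeout waiting for lock": "problem",
    "INFO build started for module core": "info",
    "INFO compiled 120 source files successfully": "info",
    "DEBUG cache hit for artifact core": "info",
    "INFO deployment finished successfully": "info",
}


def tokenize(statement):
    # Keep lowercase word-like tokens; digits are dropped so IDs do not dominate.
    return re.findall(r"[a-z_]+", statement.lower())


def build_training_set(raw_statements, tag_lookup, sample_size, seed=0):
    # Draw a random sample of statements, then attach a predefined tag to each.
    rng = random.Random(seed)
    sample = rng.sample(raw_statements, sample_size)
    return [(statement, tag_lookup[statement]) for statement in sample]


class NaiveBayesLogClassifier:
    """Multinomial Naive Bayes over bag-of-words features (assumed stand-in)."""

    def fit(self, tagged):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(label for _, label in tagged)
        for statement, label in tagged:
            self.word_counts[label].update(tokenize(statement))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, statement):
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # class prior
            for tok in tokenize(statement):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[tok] + 1) / denom)  # add-one smoothing
            scores[label] = score
        return max(scores, key=scores.get)


def find_problems(model, statements, problem_label="problem"):
    # Keep only the new statements that the model places in the problem class.
    return [s for s in statements if model.predict(s) == problem_label]


training = build_training_set(list(HUMAN_TAGS), HUMAN_TAGS, sample_size=6)
model = NaiveBayesLogClassifier().fit(training)
new_statements = [
    "ERROR connection refused while contacting host",
    "INFO deployment finished successfully",
]
for statement in find_problems(model, new_statements):
    print(statement)
```

In a real deployment the sample would be a small fraction of a much larger log corpus, and the filtered statements would feed the dashboard rather than standard output.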

First claim

Opening claim text (preview).

What is claimed is:

1. A system for diagnosing problems from logs, the system comprising: a processor; a non-transitory computer-readable medium comprising instructions for: receiving new log statements; classifying the new log statements into at least one of a plurality of classifications based on an NLP feature associated with the log statements utilizing a NLP classifier model, wherein the plurality of predefined classifications include a problem classification and wherein the NLP classifier model was trained by: utilizing log statements tagged with the at least one of the plurality of classifications, wherein the log statements comprise training data created from logs used in an enterprise computing environment including statements from a plurality of sources in the enterprise computing environment, and the training data was created by sampling the statements in the logs and tagging the log statements in the sample with the at least one of the plurality of classifications; identifying classified new log statements as log statements having the problem classification; and presenting the identified classified new log statements having the problem classification through a user interface.
2. The system of claim 1, wherein the statements contained in the logs from the plurality of sources comprise unstructured data.
3. The system of claim 1, wherein the sampling of statements is random.
4. The system of claim 1, wherein the instructions are further for performing the tagging of the log statements of the sample.
5. The system of claim 1, wherein the plurality of classifications are domain specific.
6. The system of claim 1, wherein the NLP classifier model is one of a plurality of trained models.
7. The system of claim 6, wherein outputs from the trained NLP classifier model are provided as input to another trained model of the plurality of trained models for identifying a type of problem associated with the statements having the problem classification.
8. A method for diagnosing problems from logs, the method comprising: receiving new log statements; classifying the new log statements into at least one of a plurality of classifications based on an NLP feature associated with the log statements utilizing a NLP classifier model, wherein the plurality of predefined classifications include a problem classification and wherein the NLP classifier model was trained by: utilizing log statements tagged with the at least one of the plurality of classifications, wherein the log statements comprise training data created from logs used in an enterprise computing environment including statements from a plurality of sources in the enterprise computing environment, and the training data was created by sampling the statements in the logs and tagging the log statements in the sample with the at least one of the plurality of classifications; identifying classified new log statements as log statements having the problem classification; and presenting the identified classified new log statements having the problem classification through a user interface.
9. The method of claim 8, wherein the statements contained in the logs from the plurality of sources comprise unstructured data.
10. The method of claim 8, wherein the sampling of statements is random.
11. The method of claim 8, further comprising performing the tagging of the log statements of the sample.
12. The method of claim 8, wherein the plurality of classifications are domain specific.
13. The method of claim 8, wherein the NLP classifier model is one of a plurality of trained models.
14. The method of claim 13, wherein outputs from the trained NLP classifier model are provided as input to another trained model of the plurality of trained models for identifying a type of problem associated with the statements having the problem classification.
15. A non-transitory computer readable medium, comprising instructions for: receiving new log statements; classifying the new log statements into at least one of a plurality of classifications based on an NLP feature associated with the log statements utilizing a NLP classifier model, wherein the plurality of predefined classifications include a problem classification and wherein the NLP classifier model was trained by: utilizing log statements tagged with the at least one of the plurality of classifications, wherein the log statements comprise training data created from logs used in an enterprise computing environment including statements from a plurality of sources in the enterprise computing environment, and the training data was created by sampling the statements in the logs and tagging the log statements in the sample with the at least one of the plurality of classifications; identifying classified new log statements as log statements having the problem classification; and presenting the identified classified new log statements having the problem classification through a user interface.
16. The non-transitory computer readable medium of claim 15, wherein the statements contained in the logs from the plurality of sources comprise unstructured data.
17. The non-transitory computer readable medium of claim 15, wherein the sampling of statements is random.
18. The non-transitory computer readable medium of claim 15, wherein the instructions are further for performing the tagging of the log statements of the sample.
19. The non-transitory computer readable medium of claim 15, wherein the plurality of classifications are domain specific.
20. The non-transitory computer readable medium of claim 15, wherein the NLP classifier model is one of a plurality of trained models.
21. The non-transitory computer readable medium of claim 20, wherein outputs from the trained NLP classifier model are provided as input to another trained model of the plurality of trained models for identifying a type of problem associated with the statements having the problem classification.
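Claims 7, 14, and 21 describe feeding the output of the first classifier into a second trained model that identifies the type of problem. One possible reading is a two-stage cascade, sketched below under stated assumptions: both "models" are simple token-overlap scorers built from tagged examples rather than real trained NLP models, and the problem types "connectivity" and "build", the example statements, and all names are hypothetical.

```python
import re
from collections import Counter


def tokens(statement):
    # Lowercase word-like tokens only.
    return re.findall(r"[a-z_]+", statement.lower())


class KeywordModel:
    """Stand-in for a trained model: scores each label by token overlap
    with the tagged examples it was built from (an assumption, not the
    patent's method)."""

    def __init__(self, tagged):
        self.profiles = {}
        for text, label in tagged:
            self.profiles.setdefault(label, Counter()).update(tokens(text))

    def predict(self, statement):
        toks = tokens(statement)
        return max(
            self.profiles,
            key=lambda label: sum(self.profiles[label][t] for t in toks),
        )


def diagnose(statements, stage_one, stage_two, problem_label="problem"):
    # Stage one decides whether a statement is a problem; only those
    # statements are passed on to stage two, which names the problem type.
    results = []
    for statement in statements:
        if stage_one.predict(statement) == problem_label:
            results.append((statement, stage_two.predict(statement)))
    return results


# Hypothetical tagged examples for each stage.
stage_one = KeywordModel([
    ("ERROR connection refused to database host", "problem"),
    ("FATAL build failed: cannot resolve dependency", "problem"),
    ("INFO build started for module core", "info"),
    ("INFO deployment finished successfully", "info"),
])
stage_two = KeywordModel([
    ("ERROR connection refused to database host", "connectivity"),
    ("ERROR timeout contacting remote host", "connectivity"),
    ("FATAL build failed: cannot resolve dependency", "build"),
    ("ERROR compile failed in module core", "build"),
])

new_statements = [
    "ERROR connection refused by host",
    "INFO deployment finished successfully",
    "FATAL build failed: missing dependency",
]
for statement, problem_type in diagnose(new_statements, stage_one, stage_two):
    print(problem_type, "<-", statement)
```

Keeping the stages separate means the type model only ever sees statements already flagged as problems, so its tag set can stay small and domain specific.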

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11568134B2 cover?
Disclosed is a solution for diagnosing problems from logs used in an application development environment. A random sample of log statements is collected. The log statements can be completely unstructured and/or do not conform to any natural language. The log statements are tagged with predefined classifications. A natural language processing (NLP) classifier model is trained utilizing the log s…
Who is the assignee on this patent?
Open Text Corp
What technology area does this patent fall under?
Primary CPC classification G06F40/205. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 31, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).