Method, electronic device, and computer program product for analyzing log file

US11301355B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11301355-B2
Application number: US-202017002333-A
Country: US
Kind code: B2
Filing date: Aug 25, 2020
Priority date: Jul 27, 2020
Publication date: Apr 12, 2022
Grant date: Apr 12, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for analyzing a log file. The method may include: determining, based on a plurality of reference patterns, corresponding patterns for a plurality of log records in the log file. The method may further include: respectively determining the plurality of log records as a plurality of log identifiers corresponding to the corresponding patterns. The method further includes: determining, from the plurality of log identifiers, a log identifier to be analyzed corresponding to a predetermined event. In addition, the method may further include: selecting a target reference log identifier from a plurality of reference log identifiers corresponding to the plurality of reference patterns, wherein a first similarity between the target reference log identifier and the log identifier to be analyzed is higher than a first threshold similarity.
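
The first steps of the abstract (matching log records against reference patterns, replacing each record with a log identifier, and picking the identifier tied to an event) can be illustrated with a minimal sketch. The regular expressions, the identifiers such as L001, the sample records, and the function names below are hypothetical; the patent text shown here does not specify how patterns are represented or matched.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical reference pattern database: regex -> reference log identifier.
# The patterns and identifiers are illustrative assumptions.
REFERENCE_PATTERNS = {
    r"connection to \S+ timed out": "L001",
    r"thread \d+ crashed": "L002",
    r"disk usage at \d+%": "L003",
}


def to_log_identifier(record: str) -> Optional[str]:
    """Return the identifier of the first matching reference pattern, else None."""
    for pattern, log_id in REFERENCE_PATTERNS.items():
        if re.search(pattern, record):
            return log_id
    return None


def identifier_at_event(records: List[Tuple[float, str]],
                        event_time: float) -> Optional[str]:
    """Pick the identifier of the latest matched record at or before event_time.

    `records` holds (timestamp, text) tuples sorted by timestamp.
    """
    chosen = None
    for timestamp, text in records:
        if timestamp > event_time:
            break
        log_id = to_log_identifier(text)
        if log_id is not None:
            chosen = log_id
    return chosen


# Made-up sample log, used only to exercise the functions.
records = [
    (10.0, "disk usage at 91%"),
    (12.5, "connection to db01 timed out"),
    (13.0, "thread 7 crashed"),
    (20.0, "disk usage at 95%"),
]
log_ids = [to_log_identifier(text) for _, text in records]
event_id = identifier_at_event(records, event_time=13.2)
```

Replacing free-text records with short identifiers keeps the later similarity comparisons independent of variable fields such as host names or thread numbers.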

First claim

Opening claim text (preview).

What is claimed is:

1. A method for analyzing a log file, comprising: determining, based at least in part on a plurality of reference patterns, corresponding patterns for a plurality of log records in the log file; respectively determining the plurality of log records as a plurality of log identifiers associated with the corresponding patterns; determining, from the plurality of log identifiers, a first log identifier associated with a first one of the plurality of log records to be analyzed corresponding to a predetermined event; selecting a target reference log identifier from a plurality of reference log identifiers corresponding to the plurality of reference patterns, wherein a first similarity between the target reference log identifier and the first log identifier to be analyzed is higher than a first threshold similarity; acquiring a context associated with the first log identifier, the context comprising one or more additional log identifiers associated with one or more additional ones of the plurality of log records collected at least one of before and after the first log record; performing a diagnosis of one or more system issues associated with the first log record based at least in part on (i) analyzing a first log identifier sequence comprising the first log identifier and the one or more additional log identifiers and (ii) analyzing a predetermined diagnosis strategy of the target reference log identifier; wherein analyzing the first log identifier sequence comprises: generating the first log identifier sequence comprising the first log identifier and the one or more additional log identifiers; generating a second log identifier sequence comprising the target reference log identifier and one or more additional reference context log identifiers associated with the target reference log identifier; and utilizing one or more machine learning models to determine a second similarity between the first log identifier sequence and the second log identifier sequence; and automatically remediating the one or more system issues associated with the first log record utilizing the predetermined diagnosis strategy responsive to determining that the second similarity between the first log identifier sequence and the second log identifier sequence is higher than a second threshold similarity.

2. The method according to claim 1, further comprising: processing, based at least in part on the predetermined diagnosis strategy of the target reference log identifier for the predetermined event, the first log record corresponding to the first log identifier to be analyzed.

3. The method according to claim 1, wherein selecting the target reference log identifier comprises: acquiring the one or more additional reference context log identifiers associated with the target reference log identifier; determining a comprehensive similarity based at least in part on the first similarity and the second similarity; and according to a determination that the comprehensive similarity is higher than a third threshold similarity, processing, based at least in part on the predetermined diagnosis strategy of the target reference log identifier for the predetermined event, the first log record corresponding to the first log identifier to be analyzed.

4. The method according to claim 3, wherein a time interval between the one or more additional log identifiers and the first log identifier to be analyzed is less than a threshold time interval, and a time interval between the one or more additional reference context log identifiers and the target reference log identifier is less than the threshold time interval.

5. The method according to claim 1, wherein determining the corresponding patterns for the plurality of log records comprises: acquiring the plurality of reference patterns from a reference pattern database; and in response to matching of the first log record in the plurality of log records with a first reference pattern in the plurality of reference patterns, determining the first reference pattern as a pattern of the first log record.

6. The method according to claim 5, wherein respectively determining the plurality of log records as the plurality of log identifiers comprises: acquiring, from the reference pattern database, a mapping relationship between the plurality of reference patterns and the plurality of reference log identifiers; acquiring, based at least in part on the mapping relationship, a reference log identifier corresponding to the first reference pattern; and determining the reference log identifier corresponding to the first reference pattern as the first log identifier of the first log record.

7. The method according to claim 1, wherein determining the first log identifier to be analyzed corresponding to the predetermined event comprises: determining, based at least in part on time stamp information of the log file, the first log identifier to be analyzed when the predetermined event occurs.

8. The method according to claim 1, wherein the predetermined event comprises at least one of the following: a thread crash event; a user report event; and a system indicator abnormity event.

9. An electronic device, comprising: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing machine-executable instructions, wherein the instructions, when executed by the at least one processing unit, cause the device to perform actions comprising: determining, based at least in part on a plurality of reference patterns, corresponding patterns for a plurality of log records in the log file; respectively determining the plurality of log records as a plurality of log identifiers associated with the corresponding patterns; determining, from the plurality of log identifiers, a first log identifier associated with a first one of the plurality of log records to be analyzed corresponding to a predetermined event; selecting a target reference log identifier from a plurality of reference log identifiers corresponding to the plurality of reference patterns, wherein a first similarity between the target reference log identifier and the first log identifier to be analyzed is higher than a first threshold similarity; acquiring a context associated with the first log identifier, the context comprising one or more additional log identifiers associated with one or more additional ones of the plurality of log records collected at least one of before and after the first log record; performing a diagnosis of one or more system issues associated with the first log record based at least in part on (i) analyzing a first log identifier sequence comprising the first log identifier and the one or more additional log identifiers and (ii) analyzing a predetermined diagnosis strategy of the target reference log identifier; wherein analyzing the first log identifier sequence comprises: generating the first log identifier sequence comprising the first log identifier and the one or more additional log identifiers; generating a second log identifier sequence comprising the target reference log identifier and one or more additional reference context log identifiers associated with the target reference log identifier; and utilizing one or more machine learning models to determine a second similarity between the first log identifier sequence and the second log identifier sequence; and automatically remediating the one or more system issues associated with the first log record utilizing the predetermined diagnosis strategy responsive to determining that the second similarity between the first log identifier sequence and the second log identifier sequence is higher than a second threshold similarit…
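
The context and sequence-comparison steps in claims 1, 3, and 4 can be sketched as follows. This is a rough illustration under stated assumptions, not the patented implementation: the claims call for one or more machine learning models to score the two sequences, whereas this sketch substitutes Python's `difflib.SequenceMatcher` as a simple non-ML stand-in. The weighted average used for the comprehensive similarity, the threshold values, and all function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def context_ids(entries: List[Tuple[float, str]], index: int,
                max_gap: float) -> List[str]:
    """Identifiers within max_gap seconds of entries[index], including itself.

    `entries` holds (timestamp, log_id) tuples sorted by timestamp.
    """
    center_time = entries[index][0]
    return [log_id for t, log_id in entries if abs(t - center_time) < max_gap]


def sequence_similarity(first: List[str], second: List[str]) -> float:
    """Non-ML stand-in for the second similarity: a ratio in [0, 1]."""
    return SequenceMatcher(None, first, second).ratio()


def should_apply_strategy(first_sim: float, second_sim: float,
                          weight: float = 0.5, threshold: float = 0.7) -> bool:
    """Hypothetical comprehensive similarity: weighted average of both scores."""
    comprehensive = weight * first_sim + (1 - weight) * second_sim
    return comprehensive > threshold


# Made-up observed log identifiers with timestamps in seconds.
observed = [(10.0, "L003"), (12.5, "L001"), (13.0, "L002"), (20.0, "L003")]
first_sequence = context_ids(observed, index=2, max_gap=5.0)

# Made-up reference sequence stored alongside the target reference identifier.
reference_sequence = ["L003", "L001", "L002"]
second_sim = sequence_similarity(first_sequence, reference_sequence)

# 0.9 is an assumed first similarity between the two individual identifiers.
apply_strategy = should_apply_strategy(first_sim=0.9, second_sim=second_sim)
```

Comparing whole sequences rather than single identifiers lets the surrounding records distinguish two incidents that happen to share the same triggering log line.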

Assignees

Inventors

Classifications

  • Matching criteria, e.g. proximity measures · CPC title

  • where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting · CPC title

  • Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available (error or fault processing without redundancy G06F11/0703; error detection or correction by redundancy in data representation G06F11/08; error detection or correction of the data by redundancy in operations G06F11/14; error detection or correction by redundancy in hardware G06F11/16) · CPC title

  • where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title

  • Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11301355B2 cover?
Embodiments of the present disclosure relate to a method, an electronic device, and a computer program product for analyzing a log file. The method may include: determining, based on a plurality of reference patterns, corresponding patterns for a plurality of log records in the log file. The method may further include: respectively determining the plurality of log records as a plurality of log …
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/3072. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 12, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).