What technology area does this patent fall under?

Primary CPC classification G06F16/285. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 16, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Cluster-based processing of unstructured log messages

US10353756B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10353756-B2
Application number	US-201715416571-A
Country	US
Kind code	B2
Filing date	Jan 26, 2017
Priority date	Oct 11, 2016
Publication date	Jul 16, 2019
Grant date	Jul 16, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Some embodiments relate to assigning individual log messages to clusters. An initial cluster assignment may be performed by applying a hash function to one or more non-variable components of the message to generate an initial cluster identifier. Subsequently, clustering may be further refined (e.g., by determining whether to merge clusters based on similarity values). An interface can present a representative message of each cluster and indicate which portions of the message correspond to a variable component. Particular inputs detected at the input corresponding to one of these components can cause other values for the component to be presented. For a given cluster, timestamps of assigned messages can be used to generate a time series, which can facilitate grouping of clusters (with similar or complementary shapes) and/or triggering alerts (with a condition corresponding to a temporal aspect).
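The initial assignment step in the abstract (hashing the non-variable components of a message to get a cluster identifier) can be illustrated with a minimal Python sketch. It is not the patented implementation. The whitespace tokenizer, the "any token containing a digit is variable" heuristic, the choice of SHA-1, and all function names are assumptions made only for illustration.

```python
import hashlib
import re

# Assumption for illustration: a component containing a digit is "variable".
# The patent leaves the actual classification rules to the embodiment.
_VARIABLE = re.compile(r"\d")


def split_components(message):
    """Parse a message into whitespace-delimited components."""
    return message.split()


def is_variable(component):
    """Return True when the component is treated as variable."""
    return bool(_VARIABLE.search(component))


def initial_cluster_id(message):
    """Hash the non-variable components (with their positions) into an identifier."""
    parts = split_components(message)
    fixed = [f"{i}:{p}" for i, p in enumerate(parts) if not is_variable(p)]
    digest = hashlib.sha1("\x00".join(fixed).encode("utf-8")).hexdigest()
    return digest[:12]


def assign(messages):
    """Map each message identifier to its initial cluster identifier."""
    return {msg_id: initial_cluster_id(text) for msg_id, text in messages.items()}


logs = {
    1: "Connection from 10.0.0.5 closed after 120 ms",
    2: "Connection from 10.0.0.9 closed after 87 ms",
    3: "Disk /dev/sda1 usage at 91 percent",
}
assignments = assign(logs)
```

Messages 1 and 2 differ only in components the heuristic marks as variable, so they receive the same identifier, while message 3 lands in a different cluster. Because the function is deterministic, the assignment can be computed once at ingest time and stored alongside the message identifier.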

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to cause one or more data processors to perform actions including: receiving a plurality of log messages; for each log message of the plurality of log messages: parsing the log message into a plurality of components, each component of the plurality of components corresponding to a part of the log message; determining, for each component of the plurality of components, whether the component is a variable component or a non-variable component; wherein, when the component is identified as a variable component, a cluster that identifies any messages matching the component is defined such that a value for the component is allowed to differ across log messages in the cluster while sharing a same cluster identity; or wherein, when the component is identified as a non-variable component, a cluster that identifies any messages matching the component is defined such that a value for the component must be the same across log messages in the cluster to share the same cluster identity; determining, for each of one or more non-variable components of the plurality of components determined to be a non-variable component, a value for the non-variable component from the log message; and assigning the log message to a cluster of a set of clusters based at least in part on: one or more values of the one or more non-variable components; and one or more rules; and storing a message identifier of the log message in association with a cluster identifier corresponding to the cluster.

2. The computer-program product as recited in claim 1, wherein assigning the log message to the cluster includes: defining a skeleton of the log message based on values for the one or more non-variable components, wherein a value for each of the one or more non-variable components is not included in the skeleton; and using a deterministic function to transform the skeleton of the log message into the cluster identifier, the one or more rules including the deterministic function.

3. The computer-program product as recited in claim 1, wherein parsing the log message into a plurality of components includes applying one or more grammar rules.

4. The computer-program product as recited in claim 1, wherein the actions further include: receiving a query for log data; identifying a set of message identifiers that correspond to the query; identifying a subset of the set of clusters based on the cluster identifiers stored in association with the message identifiers, wherein, for each cluster in the subset, at least some messages of the set of message identifiers is associated with a cluster identifier corresponding to the cluster; and generating a response to the query, the response including a representation of each cluster in the subset.

5. The computer-program product as recited in claim 4, wherein the message identifiers are stored in association with the cluster identifiers prior to receiving the query.

6. The computer-program product as recited in claim 4, wherein, for each log message of the plurality of log messages, the log message is assigned to the cluster at an ingest time in response to receiving the log message from a source, and wherein the ingest time is prior to receiving the query.

7. The computer-program product as recited in claim 4, wherein the actions further include, for each cluster in the subset of the set of clusters: identifying, from amongst the at least some messages associated with the cluster identifier corresponding to the cluster, one or more representative log messages of the cluster, the one or more representative log messages being an incomplete subset of the at least some messages associated with the cluster identifier, wherein the representation of the cluster includes the one or more representative log messages.

8. The computer-program product as recited in claim 4, wherein the actions further include, for each cluster in the set of clusters: identifying, from amongst the at least some messages associated with the cluster identifier corresponding to the cluster, one or more representative log messages of the cluster, the one or more representative log messages being an incomplete subset of the at least some messages associated with the cluster identifier; and performing a comparison processing to determine a similarity value representing a similarity between one or more representative log messages of a first cluster of the subset and one or more representative log messages of a second cluster of the subset; and determining, based on the comparison processing, whether to merge the first cluster with the second cluster in the subset.

9. The computer-program product as recited in claim 4, wherein, for each of at least some of the plurality of log messages, assigning the log message to the cluster includes: using a deterministic function to transform the one or more values of the one or more non-variable components into a preliminary cluster identifier at an ingest time in response to receiving the log message from a source, the one or more rules including the deterministic function; storing, prior to receiving the query, the message identifier of the log message in association with the preliminary cluster identifier, the preliminary cluster identifier; and subsequent to receiving the query, using a merging rule that merges multiple clusters together to assign the log message to the cluster, the one or more rules including the deterministic function.

10. A computer-implemented method comprising: receiving a plurality of log messages; for each log message of the plurality of log messages: parsing the log message into a plurality of components, each component of the plurality of components corresponding to a part of the log message; determining, for each component of the plurality of components, whether the component is a variable component or a non-variable component; wherein, when the component is identified as a variable component, a cluster that identifies any messages matching the component is defined such that a value for the component is allowed to differ across log messages in the cluster while sharing a same cluster identity; or wherein, when the component is identified as a non-variable component, a cluster that identifies any messages matching the component is defined such that a value for the component must be the same across log messages in the cluster to share the same cluster identity; determining, for each of one or more non-variable components of the plurality of components determined to be a non-variable component, a value for the non-variable component from the log message; and assigning the log message to a cluster of a set of clusters based at least in part on: one or more values of the one or more non-variable components; and one or more rules; and storing a message identifier of the log message in association with a cluster identifier corresponding to the cluster.

11. The computer-implemented method as recited in claim 10, wherein assigning the log message to the cluster includes: defining a skeleton of the log message based on values for the one or more non-variable components, wherein a value for each of the one or more non-variable components is not included in the skeleton; and using a deterministic function to transform the skeleton of the log message into the cluster identifier, the one or more rules including the deterministic function.

12. The computer-implemented method as recited in claim 10, wherein parsing the log message into a plurality of compone…
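Claims 4 and 8 describe two later steps: deciding whether to merge clusters based on a similarity value between representative messages, and resolving a query's matching message identifiers to a subset of clusters. The Python sketch below shows one way these could fit together. The token-level `difflib.SequenceMatcher` ratio, the 0.7 threshold, the union-find bookkeeping, and every name in the code are assumptions for illustration; the claims do not specify a similarity measure or threshold.

```python
from difflib import SequenceMatcher


def similarity(rep_a, rep_b):
    """Token-level similarity in [0, 1] between two representative messages."""
    return SequenceMatcher(None, rep_a.split(), rep_b.split()).ratio()


def merge_clusters(representatives, threshold=0.7):
    """Map each preliminary cluster id to the id of its merged cluster."""
    parent = {cid: cid for cid in representatives}

    def find(cid):
        # Follow parent links to the root, shortening the path on the way.
        while parent[cid] != cid:
            parent[cid] = parent[parent[cid]]
            cid = parent[cid]
        return cid

    ids = sorted(representatives)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if similarity(representatives[a], representatives[b]) >= threshold:
                root_a, root_b = find(a), find(b)
                if root_a != root_b:
                    # Keep the lexicographically smaller id as the merged id.
                    parent[max(root_a, root_b)] = min(root_a, root_b)
    return {cid: find(cid) for cid in ids}


def clusters_for_query(matching_ids, message_to_cluster, merged):
    """Resolve message ids matched by a query to their merged cluster ids."""
    return {merged[message_to_cluster[m]] for m in matching_ids}


# Hypothetical representatives, one per preliminary cluster.
representatives = {
    "c1": "User alice logged in from host web01",
    "c2": "User bob logged in from host web01",
    "c3": "Disk usage at 91 percent",
}
merged = merge_clusters(representatives)

# Hypothetical stored associations: message identifier -> preliminary cluster.
message_to_cluster = {101: "c1", 102: "c2", 103: "c3"}
result = clusters_for_query([101, 102], message_to_cluster, merged)
```

Here `c1` and `c2` share six of seven tokens, so they merge into one cluster, and a query matching messages 101 and 102 resolves to that single cluster. Comparing only representatives, rather than every message, keeps the pairwise comparison small even when clusters hold many messages.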

Assignees

Oracle Int Corp

Classifications

G06F11/0778
Dumping, i.e. gathering error/state information after a fault for later diagnosis
G06F2201/81
Threshold
G06F17/40
Data acquisition and logging (for input to computer G06F3/00)
G06F11/3072
where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
G06F2201/86
Event-based monitoring

Patent family

Related publications grouped by family.

View patent family 61828945

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10353756B2 cover?: Some embodiments relate to assigning individual log messages to clusters. An initial cluster assignment may be performed by applying a hash function to one or more non-variable components of the message to generate an initial cluster identifier. Subsequently, clustering may be further refined (e.g., by determining whether to merge clusters based on similarity values). An interface can present a…
Who is the assignee on this patent?: Oracle Int Corp