Cluster-based processing of unstructured log messages
US-2018101423-A1 · Apr 12, 2018 · US
US10338977B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10338977-B2 |
| Application number | US-201715416887-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 26, 2017 |
| Priority date | Oct 11, 2016 |
| Publication date | Jul 2, 2019 |
| Grant date | Jul 2, 2019 |
Official abstract text for this publication.
Some embodiments relate to assigning individual log messages to clusters. An initial cluster assignment may be performed by applying a hash function to one or more non-variable components of the message to generate an initial cluster identifier. Subsequently, clustering may be further refined (e.g., by determining whether to merge clusters based on similarity values). An interface can present a representative message of each cluster and indicate which portions of the message correspond to a variable component. Particular inputs detected at the input corresponding to one of these components can cause other values for the component to be presented. For a given cluster, timestamps of assigned messages can be used to generate a time series, which can facilitate grouping of clusters (with similar or complementary shapes) and/or triggering alerts (with a condition corresponding to a temporal aspect).
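The initial cluster assignment described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the whitespace tokenization, the rule that any token containing a digit is a variable component, the `<*>` placeholder, and the function names `split_components`, `initial_cluster_id`, and `assign_clusters` are all assumptions made for illustration. The later refinement step (merging clusters based on similarity values) is not shown.

```python
import hashlib
import re
from collections import defaultdict

# Assumption: a token that contains a digit is treated as a variable component.
_VARIABLE = re.compile(r"\d")


def split_components(message):
    """Return (token, is_variable) pairs for a whitespace-tokenized message."""
    return [(tok, bool(_VARIABLE.search(tok))) for tok in message.split()]


def initial_cluster_id(message):
    """Hash the non-variable tokens, with a placeholder at each variable position."""
    skeleton = " ".join(
        "<*>" if is_var else tok for tok, is_var in split_components(message)
    )
    # Any stable hash works here; SHA-1 truncated to 12 hex chars is an arbitrary choice.
    return hashlib.sha1(skeleton.encode("utf-8")).hexdigest()[:12]


def assign_clusters(messages):
    """Group messages by their initial cluster identifier."""
    clusters = defaultdict(list)
    for msg in messages:
        clusters[initial_cluster_id(msg)].append(msg)
    return dict(clusters)
```

Under these assumptions, `Connection from 10.0.0.1 closed` and `Connection from 10.0.0.2 closed` hash to the same identifier because only the IP address differs, while `Disk usage at 91%` lands in a separate cluster.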
Opening claim text (preview).
What is claimed is:

1. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to cause one or more data processors to perform actions including: accessing a data store that associates, for each machine-generated data record of a set of machine-generated data records, an identifier of the machine-generated data record with one or more value identifiers, each value identifier of the one or more value identifiers representing one or more values included within the machine-generated data record; selecting a representative machine-generated data record from amongst the set of machine-generated data records; identifying, for each component of a plurality of components of the representative machine-generated data record, a value for the component that is included in a part of the representative machine-generated data record that corresponds to the component; and determining, for each component of a plurality of components, that the component corresponds to a variable component, thereby indicating that the set of machine-generated data records includes one or more other values for the component; facilitating a presentation that includes, for each component of the plurality of components: the value for the component; and one or more interactive options configured to, upon detecting input of a defined type corresponding to the value, identify at least one of the one or more other values for the component, wherein each of the at least one of the one or more other values is included in a part of another machine-generated data record in the set of machine-generated data records.

2. The computer-program product as recited in claim 1, wherein: each of the one or more interactive options is presented so as to be visually associated with a particular variable component of the plurality of components; and the presentation is configured such that detection of an input, via an interactive option of the one or more interactive options, of the defined type corresponding to a value of the at least some of the values triggers the presentation to update to include the at least one of the one or more other values for the component.

3. The computer-program product as recited in claim 1, wherein the one or more interactive options are further configured to, upon detecting the input of the defined type corresponding to the value, identify, for each other value of the one or more other values, an indication of a quantity of machine-generated data records of the set of machine-generated data records that include the other value.

4. The computer-program product as recited in claim 1, wherein the presentation includes the representative machine-generated data record, wherein the presentation differentially identifies the values for the plurality of components included in the representative machine-generated data record.

5. The computer-program product as recited in claim 1, wherein the actions further include: detecting, via an interactive option of the one or more interactive options, an input of the defined type that corresponds to an identification of a particular value of the one or more other values for a particular variable component; identifying a subset of the set of machine-generated data records that include the particular value for the particular variable component; selecting a new representative machine-generated data record from amongst the set of machine-generated data records, the new representative machine-generated data record including the particular value; identifying, for each other component of the plurality of components, a new-record value for the other component that is included in a part of the new representative machine-generated data record; determining, for each other component of the plurality of components, whether the other component corresponds to a variable component, thereby indicating that the subset of the set of machine-generated data records includes one or more other values different than the new-record value for the other component; and updating the presentation to include, for each other component of the plurality of components: the new-record value; and when the other component has been determined to be a variable component, the one or more interactive options configured to, upon detecting input of the defined type corresponding to the new-record value, identify at least one of one or more alternative values included in a part of another machine-generated data record in the subset of the set of machine-generated data records.

6. The computer-program product as recited in claim 1, wherein the actions further include: identifying, for each component of one or more other components of the representative machine-generated data record, a value for the component that is included in a part of the representative machine-generated data records that corresponds to the component; and determining, for each component of the one or more other components, that the component corresponds to a non-variable component, thereby indicating that the value for the component is the same across machine-generated data records in the set of machine-generated data records; wherein the presentation further includes, for each component of the one or more other components, the value for the component, wherein the presentation differentially represents each value for the plurality of components determined to correspond to a variable component with respect to each value for the one or more other components determined to correspond to a non-variable component.

7. The computer-program product as recited in claim 1, wherein the actions further include: assessing the set of machine-generated data records to detect whether at least one machine-generated data record of the set of machine-generated data records includes a value for a component of the plurality of components that matches a value on a prioritized predefined list; wherein the representative machine-generated data record is selected from amongst the at least one machine-generated data record when it is detected that at least one machine-generated data record of the set of machine-generated data records includes a value for a component of the plurality of components that matches a value on the prioritized predefined list.

8. The computer-program product as recited in claim 1, wherein the actions further include: assessing the set of machine-generated data records to detect, for a component of the plurality of components and for each value of multiple values, a quantity of machine-generated data records that include the value for the component, wherein the representative machine-generated data record is selected based on the quantities.

9. The computer-program product as recited in claim 1, wherein: each machine-generated data record of the set of machine-generated data records includes to a log message assigned to a particular cluster, wherein the actions further include assigning each log message included in the set of machine-generated data records to the particular cluster by: parsing the log message into a set of values, each value corresponding to a component; determining, for each value of the set of values, whether the value corresponds to: a variable component indicating that the set of machine-generated data records includes one or more values for the component; or a non-variable component indicating that the value for the component is the same across machine-generated data records in the set of machine-generated data records; and identifying the particular cluster based on a subset of the set of values, each value in the subset determined to correspond to a non-variable component.
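The record-level steps recited in claims 1, 3, 6, and 8 (choosing a representative record, marking which components are variable, and listing the other values with their record counts) can be sketched roughly as follows. This is an illustrative reading, not the claimed implementation: it assumes every record in the cluster splits on whitespace into the same number of components, it scores a record by summing how often each of its values occurs, and the names `component_counts`, `pick_representative`, and `describe` are hypothetical.

```python
from collections import Counter


def component_counts(records):
    """Per-position Counter of values (assumes equal token counts per record)."""
    tokenized = [r.split() for r in records]
    return [Counter(column) for column in zip(*tokenized)]


def pick_representative(records):
    """Pick the record whose component values are, in total, the most frequent."""
    counts = component_counts(records)

    def score(record):
        return sum(counts[i][tok] for i, tok in enumerate(record.split()))

    return max(records, key=score)


def describe(records):
    """Build the data a presentation layer would need for each component."""
    counts = component_counts(records)
    representative = pick_representative(records)
    view = []
    for i, tok in enumerate(representative.split()):
        # Other values seen at this position, with the number of records holding each.
        others = {value: n for value, n in counts[i].items() if value != tok}
        view.append({"value": tok, "variable": bool(others), "other_values": others})
    return view
```

For the records `user alice login ok`, `user bob login ok`, and `user alice login failed`, this sketch selects the first record as representative, flags the second and fourth components as variable, and reports `bob` and `failed` as alternative values seen in one record each. An interface could use the `other_values` mapping to populate the interactive options the claims describe.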
Dumping, i.e. gathering error/state information after a fault for later diagnosis · CPC title
Event-based monitoring · CPC title
Data acquisition and logging (for input to computer G06F3/00) · CPC title
Visualisation of programs or trace data · CPC title
where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting · CPC title