Systems and methods for identifying process flows from log files and visualizing the flow
US-2018114126-A1 · Apr 26, 2018 · US
US10474513B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10474513-B2 |
| Application number | US-201715416868-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 26, 2017 |
| Priority date | Oct 11, 2016 |
| Publication date | Nov 12, 2019 |
| Grant date | Nov 12, 2019 |
Official abstract text for this publication.
Some embodiments relate to assigning individual log messages to clusters. An initial cluster assignment may be performed by applying a hash function to one or more non-variable components of the message to generate an initial cluster identifier. Subsequently, clustering may be further refined (e.g., by determining whether to merge clusters based on similarity values). An interface can present a representative message of each cluster and indicate which portions of the message correspond to a variable component. Particular inputs detected at the input corresponding to one of these components can cause other values for the component to be presented. For a given cluster, timestamps of assigned messages can be used to generate a time series, which can facilitate grouping of clusters (with similar or complementary shapes) and/or triggering alerts (with a condition corresponding to a temporal aspect).
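As a rough illustration of the initial assignment step described in the abstract, the sketch below hashes the non-variable parts of a message to obtain a cluster identifier. The whitespace tokenization, the "contains a digit" rule for spotting variable components, the `<*>` placeholder, and the names `template` and `initial_cluster_id` are all assumptions made for this example; the publication does not prescribe them.

```python
import hashlib
import re

# Hypothetical rule: any token containing a digit is treated as a variable component.
_HAS_DIGIT = re.compile(r"\d")
_PLACEHOLDER = "<*>"  # assumed marker that keeps the position of a variable component


def template(message: str) -> str:
    """Return the message with each variable component replaced by a placeholder."""
    tokens = message.split()
    return " ".join(_PLACEHOLDER if _HAS_DIGIT.search(t) else t for t in tokens)


def initial_cluster_id(message: str) -> str:
    """Hash the non-variable content of the message into a short cluster identifier."""
    digest = hashlib.sha1(template(message).encode("utf-8")).hexdigest()
    return digest[:12]  # truncated only for readability in this sketch


if __name__ == "__main__":
    for line in (
        "Connection from 10.0.0.1 closed",
        "Connection from 10.0.0.2 closed",
        "Disk usage at 91 percent",
    ):
        print(initial_cluster_id(line), template(line))
```

Messages that differ only in their variable components receive the same identifier, so they land in the same initial cluster. The later refinement the abstract mentions (merging clusters by similarity) is not shown here.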
Opening claim text (preview).
What is claimed is:
1. A computer-program product tangibly embodied in a non-transitory machine-readable storage medium, including instructions configured to cause one or more data processors to perform actions including: receiving a log message; parsing the log message into a plurality of components, each component of the plurality of components corresponding to a part of the log message; determining, for each of one or more components of the plurality of components, a value for the component from the log message; and determining a cluster identifier based at least in part on: one or more values of the one or more components; and one or more rules; and accessing a data store that associates, for each log message of a plurality of previously processed log messages, an identifier of the log message with an identifier of a corresponding cluster, the association with the corresponding cluster indicating that the log message has one or more content-based characteristics indicative of the corresponding cluster and is assigned to the corresponding cluster, and the corresponding cluster being of a plurality of clusters; querying the data store with the cluster identifier; determining, based on a response to the query, that the cluster identifier corresponds to a new cluster; in response to determining that the cluster identifier corresponds to a new cluster, generating an alert communication that includes information that identifies the cluster.
2. The computer-program product as recited in claim 1, wherein: the data store further associates, for each log message of the plurality of previously processed log messages, the identifier of the log message with a timestamp of the log message; and determining that the cluster identifier corresponds to a new cluster includes: determining a quantity of log messages, the quantity of log messages representing the received log message and any log messages identified in the data store that are associated with a time stamp corresponding to a recent time period and with the cluster identifier; determining that the quantity of log messages has: increased by at least a pre-defined change threshold; or exceeded an upper pre-defined cluster-assignment threshold.
3. The computer-program product as recited in claim 1, wherein: the data store further associates, for each log message of the plurality of previously processed log messages, the identifier of the log message with a timestamp of the log message; and the actions further include: determining that another cluster is subsiding, the determining that a quantity of log messages identified in the data store that are associated with a time stamp corresponding to a recent time period and with another cluster identifier has: decreased by at least a pre-defined change threshold; or exceeded a lower pre-defined cluster-assignment threshold; and in response to determining that the other cluster is subsiding, including information that identifies the other cluster in the alert communication or in another alert communication.
4. The computer-program product as recited in claim 3, wherein the actions further include: identifying a new-cluster temporal characteristic corresponding to one or more time stamps associated with one or more log messages of the new cluster; identifying a subsiding-cluster temporal characteristic corresponding to one or more time stamps associated with one or more log messages of the subsiding cluster; and detecting that the new-cluster temporal characteristic is complementary to the subsiding-cluster temporal characteristic; wherein, as a result of the detection, the alert communication includes data to enable input to be provided that corresponds to a request to group the new cluster with the subsiding cluster, and wherein provision of the input triggers statistical data to be generated that corresponds to a combination of a set of log messages assigned to any of the new cluster and the subsiding cluster.
5. The computer-program product as recited in claim 1, wherein the actions further include: determining, for each component of the plurality of components, whether the component is a variable component or a non-variable component; wherein, when the component is identified as a variable component, a cluster that identifies any messages matching the component is defined such that a value for the component is allowed to differ across log messages in the cluster while sharing a same cluster identity; or wherein, when the component is identified as a non-variable component, a cluster that identifies any messages matching the component is defined such that a value for the component must be the same across log messages in the cluster to share the same cluster identity; wherein each of the one or more components includes a component determined to be a non-variable component.
6. The computer-program product as recited in claim 1, wherein the actions further include: generating a new-cluster time series for the cluster that is indicative of, for each time bin of a plurality of time bins, a quantity of log messages corresponding to the cluster identifier and having a timestamp within the time bin; generating a different-cluster time series for a different cluster that is indicative of, for each time bin of the plurality of time bins, a quantity of log messages corresponding to a different cluster identifier and having a timestamp within the time bin; presenting an interface that concurrently displays the new-cluster time series and the different-cluster time series.
7. The computer-program product as recited in claim 6, wherein the new-cluster time series and the different-cluster time series are displayed in a single graph.
8. The computer-program product as recited in claim 1, wherein the information includes the log message.
9. The computer-program product as recited in claim 1, wherein the actions further include, for each other cluster of one or more of the plurality of clusters: identifying a temporal characteristic of the other cluster corresponding to one or more time stamps associated with one or more log messages of the other cluster; and retrieving, from the data store, a representative log message for the other cluster, the representative log message being selected from amongst a plurality of log messages associated with an identifier of the other cluster in the data store; wherein the alert communication further includes the temporal characteristic of the other cluster and the representative log message for the other cluster.
10. A computer-implemented method comprising: receiving a log message; parsing the log message into a plurality of components, each component of the plurality of components corresponding to a part of the log message; determining, for each of one or more components of the plurality of components, a value for the component from the log message; and determining a cluster identifier based at least in part on: one or more values of the one or more components; and one or more rules; and accessing a data store that associates, for each log message of a plurality of previously processed log messages, an identifier of the log message with an identifier of a corresponding cluster, the association with the corresponding cluster indicating that the log message has one or more content-based characteristics indicative of the corresponding cluster and is assigned to the corresponding cluster, and the corresponding cluster being of a plurality of clusters; querying the data store with the cluster identifier; determining, based on a response to the query, that the cluster identifier corresponds to a new cluster; in response to deter
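The data-store lookup and new-cluster alert recited in claim 1, together with the per-bin message counts of claim 6, might be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the `ClusterStore` class, the `process` function, the shape of the alert dictionary, and the fixed-width time bins are hypothetical choices made for the example.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ClusterStore:
    """Hypothetical stand-in for the data store: message id -> (cluster id, timestamp)."""

    assignments: dict[str, tuple[str, float]] = field(default_factory=dict)

    def query(self, cluster_id: str) -> list[str]:
        """Return the ids of previously processed messages assigned to the cluster."""
        return [m for m, (c, _) in self.assignments.items() if c == cluster_id]

    def add(self, message_id: str, cluster_id: str, timestamp: float) -> None:
        self.assignments[message_id] = (cluster_id, timestamp)

    def time_series(self, cluster_id: str, bin_seconds: float) -> dict[int, int]:
        """Count the cluster's messages per time bin (bin index = timestamp // width)."""
        counts: Counter[int] = Counter()
        for c, ts in self.assignments.values():
            if c == cluster_id:
                counts[int(ts // bin_seconds)] += 1
        return dict(sorted(counts.items()))


def process(store: ClusterStore, message_id: str, cluster_id: str,
            timestamp: float) -> Optional[dict]:
    """Record the message and return an alert if its cluster was not seen before."""
    is_new = not store.query(cluster_id)  # empty response: no prior messages in cluster
    store.add(message_id, cluster_id, timestamp)
    if is_new:
        return {"alert": "new_cluster", "cluster_id": cluster_id,
                "message_id": message_id}
    return None


if __name__ == "__main__":
    store = ClusterStore()
    print(process(store, "m1", "c1", 5.0))   # first message of cluster c1
    print(process(store, "m2", "c1", 65.0))  # cluster already known
    print(store.time_series("c1", bin_seconds=60))
```

Here a cluster counts as new only when the store holds no earlier message for its identifier. The threshold-based variants in claims 2 and 3 would instead compare recent per-cluster counts against pre-defined limits, which this sketch does not attempt.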
Dumping, i.e. gathering error/state information after a fault for later diagnosis · CPC title
Event-based monitoring · CPC title
Threshold · CPC title
Visualisation of programs or trace data · CPC title
Data acquisition and logging (for input to computer G06F3/00) · CPC title