What technology area does this patent fall under?

The primary CPC classification is G06F16/35 (Clustering; Classification). Mapped technology areas include Physics.

When was this patent published?

Publication date: Sep 23, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for generating a system log parser

US12423170B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12423170-B2
Application number	US-202217578692-A
Country	US
Kind code	B2
Filing date	Jan 19, 2022
Priority date	Jan 19, 2022
Publication date	Sep 23, 2025
Grant date	Sep 23, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure provides systems and methods for generation of parsing scripts or rules for unstructured or semi-structured system log messages, including systems and methods for identifying and clustering of same or substantially similar system log messages using machine learning. Patterns indicative of the same or substantially similar types system log messages can be generated based on the clustering of the system log messages and calculated similarities of attributes or distances between common features/fields of the system log messages, with the results of the clustering presented for analysis and development or adjustment of parsing scripts.
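As a rough illustration of the clustering idea in the abstract, the following Python sketch masks variable attributes and then groups messages whose masked text is close under a character-level distance. Everything named here is an assumption made for illustration: the `VARIABLE_PATTERNS` masks, the helper names, the 0.3 threshold, the use of `difflib.SequenceMatcher` as the distance, and the simple greedy grouping, which stands in for the machine learning clustering model the patent refers to. It is not the patented method.

```python
import re
from difflib import SequenceMatcher

# Hypothetical masks for variable attributes (assumed, not from the patent).
VARIABLE_PATTERNS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<IP>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<HEX>"),
    (re.compile(r"\b\d+\b"), "<NUM>"),
]


def mask_variables(message: str) -> str:
    """Replace variable attributes with placeholders before comparing messages."""
    for pattern, placeholder in VARIABLE_PATTERNS:
        message = pattern.sub(placeholder, message)
    return message


def distance(a: str, b: str) -> float:
    """Character-based distance in [0, 1]; 0 means the strings are identical."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def cluster_messages(messages, max_distance=0.3):
    """Greedy grouping: join the first cluster whose representative is close enough."""
    clusters = []
    for raw in messages:
        masked = mask_variables(raw)
        for cluster in clusters:
            if distance(masked, cluster["representative"]) <= max_distance:
                cluster["members"].append(raw)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append({"representative": masked, "members": [raw]})
    return clusters


logs = [
    "Accepted password for alice from 10.0.0.5 port 50022",
    "Accepted password for bob from 192.168.1.9 port 41876",
    "Disk /dev/sda1 usage at 91 percent",
]
clusters = cluster_messages(logs)
for cluster in clusters:
    print(cluster["representative"], len(cluster["members"]))
```

With these sample messages, the two login lines fall into one cluster and the disk-usage line forms its own. A greedy pass is order-dependent; a real clustering model would likely behave differently on large, noisy inputs.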

First claim

Opening claim text (preview).

What is claimed is:

1. A system for generation of parsing scripts or rules for system logs, comprising: an event management center including at least one processor and memory configured to: receive a plurality of system log messages in real-time from a plurality of monitored devices, the plurality of system log messages including a plurality of different types of unstructured or semi-structured system log messages; determine whether one or more parsing scripts or rules are available to parse or normalize at least some of the plurality of system log messages; and if one or more parsing scripts or rules are available to parse or normalize at least some of the plurality of system log messages, apply the one or more parsing scripts or rules thereto; and if one or more of the plurality of system log messages are in an unrecognized format or a parsing script or rule is not available to parse or normalize the system log messages: submit the one or more of the plurality of system log messages to at least one clustering model stored in a memory of or accessible by the at least one processor to form clusters of system log messages of the one or more of the plurality system log messages that are of substantially a same type, wherein the system log messages of a cluster are separated by a distance based on differences between characters in each of the system log messages of the cluster, search for patterns within the system log messages of the cluster, remove one or more variable attributes from each of the plurality of system log messages, determine, via the at least one clustering model, a probability that the patterns between each of the system log messages of the cluster indicates a high confidence or low confidence of relationship between the system log messages, remove one or more of the plurality of the system log messages in each of the clusters based on an indication of low confidence, generate a pattern template for each of the plurality of system log messages within each of the clusters, and generate a new parsing script based on the pattern template.

2. The system of claim 1, wherein the event management center comprises a data center of a managed security service provider.

3. The system of claim 1, wherein the event management center comprises a network server.

4. The system of claim 1, wherein the at least one clustering model is further configured to identify patterns within at least two of the plurality of system log messages of one of the clusters and develop a vocabulary of most commonly used attributes thereof.

5. The system of claim 4, wherein the at least one clustering model is further configured to determine a distance between each of the plurality of system log messages within each cluster based upon a number of non-varying attributes present in each of the plurality of system log messages and clustering the each of the plurality of system log messages based upon a selected distance.

6. The system of claim 1, wherein the event management center is further configured to apply one or more training data sets to the at least one clustering model to form the clusters, the one or more training data sets including historically identified features or attributes indicative of identifiable ones of the plurality of system log messages received by the event management center.

7. The system of claim 1, wherein the at least one clustering model is further configured to group the system log messages into the clusters based upon two or more selected parameters including a selected number of messages, a size of a vocabulary of commonly used attributes, a selected attribute length, a maximum distance between system log messages, and a minimum number of system log messages per cluster.

8. A method of generating parsing scripts or rules for security log data, comprising: receiving security log data in real-time comprising a plurality of different types of unstructured or semi-structured system log messages from a plurality of monitored devices; applying a probabilistic model to identify system log messages having a series of common attributes indicating the system log messages are of a same or substantially same type; clustering system log messages of the same or substantially same type to form clusters; determining a confidence level of matching for each of the system log messages of each of the clusters with other system log messages in a corresponding cluster; removing system log messages with a level of confidence below a selected threshold from the corresponding cluster; and generating one or more regex pattern scripts configured to match an identified type of system log messages based on the clustered system log messages and corresponding confidence levels of each of the clustered system log messages.

9. The method of claim 8, further comprising generating training data sets for training the probabilistic model.

10. The method of claim 9, further comprising updating the training data sets with security log data processed by the probabilistic model.

11. The method of claim 8, further comprising determining whether one or more parsing scripts or rules are available for parsing and/or normalization of the system log messages, and if one or more parsing scripts or rules are available to parse or normalize unstructured data in one or more of the system log messages, applying at least one selected parsing script or rule to the unstructured data for parsing or normalization of the unstructured data into a normalized log.

12. The method of claim 8, further comprising applying historical patterns to the system log messages.
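The last steps recited in claims 1 and 8 (scoring confidence, dropping low-confidence messages, and producing a regex from what remains) can be pictured with a small Python sketch. This is only one possible reading under stated assumptions: whitespace tokenization, a position-wise "consensus" token as the basis for confidence, the 0.5 threshold, and the names `confidence`, `build_regex`, and `fieldN` are all hypothetical and do not come from the patent.

```python
import re
from collections import Counter


def confidence(tokens, consensus):
    """Fraction of positions where a message agrees with the cluster consensus."""
    if len(tokens) != len(consensus):
        return 0.0
    agree = sum(1 for tok, ref in zip(tokens, consensus) if tok == ref)
    return agree / len(consensus)


def build_regex(cluster, min_confidence=0.5):
    """Return (compiled regex, number of messages kept) for one cluster."""
    tokenized = [message.split() for message in cluster]
    # Assume the most common token count is the expected message shape.
    length = Counter(len(t) for t in tokenized).most_common(1)[0][0]
    same_shape = [t for t in tokenized if len(t) == length]
    # Consensus token per position: the most frequent token in that column.
    consensus = [Counter(col).most_common(1)[0][0] for col in zip(*same_shape)]
    # Remove messages whose confidence falls below the selected threshold.
    kept = [t for t in tokenized if confidence(t, consensus) >= min_confidence]
    if not kept:
        raise ValueError("no message met the confidence threshold")
    parts, field = [], 0
    for column in zip(*kept):
        if len(set(column)) == 1:
            # Constant token: keep it as a literal in the template.
            parts.append(re.escape(column[0]))
        else:
            # Varying token: capture it as a named field.
            field += 1
            parts.append(rf"(?P<field{field}>\S+)")
    return re.compile(r"^" + r"\s+".join(parts) + r"$"), len(kept)


cluster = [
    "Accepted password for alice from 10.0.0.5 port 50022",
    "Accepted password for bob from 192.168.1.9 port 41876",
    "Accepted password for carol from 10.0.0.7 port 50022",
    "Connection closed by peer",  # outlier with a different shape
]
pattern, kept_count = build_regex(cluster)
match = pattern.match("Accepted password for dave from 172.16.0.2 port 2222")
print(pattern.pattern)
print(kept_count, match.groupdict())
# groupdict: {'field1': 'dave', 'field2': '172.16.0.2', 'field3': '2222'}
```

Here the outlier scores 0.0 and is dropped, leaving three messages from which the template is built. Positional comparison is a deliberately simple choice; it breaks when fields contain spaces, which is one reason a production system would need something more robust.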

Assignees

Secureworks Corp

Inventors

Classifications

G06N7/01
Probabilistic graphical models, e.g. probabilistic networks · CPC title
G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06F16/35 (Primary)
Clustering; Classification · CPC title
G06F16/3346
using probabilistic model · CPC title
G06F11/0787
Storage of error reports, e.g. persistent data storage, storage using memory protection · CPC title

Patent family

Related publications grouped by family.

Patent family 87161924

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12423170B2 cover?: The present disclosure provides systems and methods for generation of parsing scripts or rules for unstructured or semi-structured system log messages, including systems and methods for identifying and clustering of same or substantially similar system log messages using machine learning. Patterns indicative of the same or substantially similar types system log messages can be generated based o…
Who is the assignee on this patent?: Secureworks Corp