What technology area does this patent fall under?

Primary CPC classification G06F16/90344. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue May 09 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Classification by optimizing regular expression rules

US11645337B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11645337-B2
Application number	US-202016879813-A
Country	US
Kind code	B2
Filing date	May 21, 2020
Priority date	May 21, 2020
Publication date	May 9, 2023
Grant date	May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-based system and method for optimizing execution of regular expression rules, each including one or more sub-rules, may include: testing, by a processor, the sub-rules against a data sample; measuring, by a processor and based on the testing, the probability for every sub-rule that it appears in the data sample, and the processing time of each sub-rule; and finding, by a processor, an order of execution of at least a subset of the sub-rules to shorten the total execution time of validating the regular expression rules, based to the probability and the execution time of each of the sub-rules.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer implemented method for optimizing execution of regular expression rules, each comprising a plurality of sub-rules, the method comprising: testing, by a processor, the sub-rules in a data sample that includes a plurality of regular expressions; measuring, by a processor and based on the testing, a probability for every sub-rule that it appears in the data sample, and the execution time of each sub-rule; and finding, by a processor, an order of execution of at least a subset of the sub-rules to shorten the total execution time of validating the regular expression rules, based on a probability of co-occurrence of at least one sub-rule among different regular expressions and the execution time of the sub-rules, wherein a sub-rule that requires short processing time or lower processing power is prioritized over a sub-rule that requires longer processing time or higher processing power, a sub-rule that rarely appears or matches in the data sample is prioritized over a sub-rule that appears or matches in the data sample more frequently, a sub-rule that appears in a regular expression rule that requires longer processing time or higher processing power is prioritized over a sub-rule that appears in a regular expression rule that requires shorter processing time or lower processing power, and a sub—rule that repeats in many regular expression rules is prioritized over a sub-rule that repeats in fewer regular expression rules. 2. The method of claim 1 , wherein finding the order of execution of the subset of sub-rules is performed according to frequency of the subset of sub-rules, wherein the frequency of a sub-rule equals the number of regular expression rules that include the sub-rule. 3. The method of claim 2 , comprising: measuring, by a processor and based on the testing, execution time of the regular expression rules, wherein finding the order of execution of the subset of sub-rules is performed according to the execution time of the regular expression rules. 4. The method of claim 1 , wherein finding the order of executing the subset of sub-rules is performed by solving a multidimensional dynamic programming problem. 5. The method of claim 1 , comprising: executing the sub-rules to classify examined data, wherein the subset of sub-rules are executed according to the order of execution. 6. The method of claim 5 , wherein the sub-rules are executed using nondeterministic finite automaton. 7. The method of claim 5 , wherein the examined data pertains to a computer database and the data sample comprises a portion of the database. 8. The method of claim 1 , wherein testing the sub-rules in the data sample comprises executing the sub-rules against the data sample. 9. The method of claim 1 , wherein the regular expression rules comprise sequence of characters that define a search pattern for finding a pattern of data. 10. The method of claim 1 , comprising, finding a new order of execution if the probability of the subset of sub-rules that they appear in the examined data is significantly different from the probability of the subset of sub-rules that they appear in the sample data. 11. A computer implemented method for optimizing execution of regular expressions, each comprising a plurality of sections, the method comprising: executing, by a processor, the sections in a data sample that includes the regular expressions; measuring, by a processor, a probability of matches for each section of a portion of the sections in a data sample, processing time of every section of the portion of the sections, and processing time of every regular expression; and solving, by a processor, an optimization problem to find an order of executing the sections of the portion of the sections to minimize the total run time, according to a probability of co-occurrence of at least one section among different regular expressions, the processing time of each of the sections of the portion of sections, the processing time of the regular expressions, and the number of regular expression rules that include each of the sections of the portion of sections, wherein a section that requires short processing time or lower processing power is prioritized over a section that requires longer processing time or higher processing power, a section that rarely appears or matches in the data sample is prioritized over a section that appears or matches in the data sample more frequently, a section that appears in a regular expression rule that requires longer processing time or higher processing power is prioritized over a section that appears in a regular expression rule that requires shorter processing time or lower processing power, and a section that repeats in many regular expression rules is prioritized over a section that repeats in fewer regular expression rules. 12. A system for optimizing execution of regular expression rules, each comprising a plurality of sub-rules, the system comprising: a memory; and a processor configured to: test the sub-rules in a data sample that includes regular expressions; measure, based on the testing, the probability for every sub-rule that it appears in the data sample, and the execution time of each sub-rule; and find an order of execution of at least a subset of the sub-rules to shorten the total execution time of validating the regular expression rules based on a probability of co-occurrence of at least one sub-rule among different regular expressions and the execution time of each of the sub-rules, wherein a subset of the sub-rules that requires short processing time or lower processing power is prioritized over a subset of the sub-rules that requires longer processing time or higher processing power, a subset of the sub-rules that rarely appears or matches in the data sample is prioritized over a subset of the sub-rules that appears or matches in the data sample more frequently, a subset of the sub-males that appears in a regular expression rule that requires longer processing time or higher processing power is prioritized over a subset of the sub-rules that appears in a regular expression rule that requires shorter processing time or lower processing power, and a subset of the sub-rules that repeats in many regular expression rules is prioritized over a subset of the sub-rules that repeats in fewer regular expression rules. 13. The system of claim 12 , wherein the processor is configured to find the order of execution of the sub-rules according to frequency of the sub-rules, wherein the frequency of a sub-rule equals the number of regular expression rules that include the sub-rule. 14. The system of claim 12 , wherein the processor is configured to: measure execution time of the regular expression rules; and find the order of execution of the subset of sub-rules according to the execution time of the regular expression rules. 15. The system of claim 12 , wherein the processor is configured to find the order of executing the subset of sub-rules by solving a multidimensional dynamic programming problem. 16. The system of claim 12 , wherein the processor is configured to: execute the sub-rules to classify examined data, wherein the subset of sub-rules are executed according to the order of execution. 17. The system of claim 15 , wherein the processor is configured to execute the sub-rules by using nondeterministic finite automaton. 18. The system of claim 15 , wherein the examined data pertains to a computer database and the data sample comprises a portion of the database. 19. The system of claim 12

Assignees

Inventors

Classifications

G06N20/00
Machine learning · CPC title
G06N5/01
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
G06F16/90344Primary
by using string matching techniques · CPC title
G06N3/006
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
G06N5/046
Forward inferencing; Production systems · CPC title

Patent family

Related publications grouped by family.

View patent family 78608152

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11645337B2 cover?: A computer-based system and method for optimizing execution of regular expression rules, each including one or more sub-rules, may include: testing, by a processor, the sub-rules against a data sample; measuring, by a processor and based on the testing, the probability for every sub-rule that it appears in the data sample, and the processing time of each sub-rule; and finding, by a processor, a…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06F16/90344. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue May 09 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

How to read this patent

Abstract

First claim

Assignees

Inventors

Classifications

Patent family

External sources

Related patents

Detecting at least one predetermined pattern in stream of symbols

Regular expression matching

Ordering regular expressions

Detection and prevention of sensitive information leaks

Frequently asked questions