Improved pattern matching machine with mapping table
US-2017032054-A1 · Feb 2, 2017 · US
US10423667B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10423667-B2 |
| Application number | US-201415107237-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 27, 2014 |
| Priority date | Dec 23, 2013 |
| Publication date | Sep 24, 2019 |
| Grant date | Sep 24, 2019 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method for generating a pattern matching machine for identifying matches of a plurality of symbol patterns in a sequence of input symbols, the method comprising: providing a state machine of states and directed transitions between states corresponding to the plurality of patterns; applying an Aho-Corasick approach to identify mappings between states in the event of a failure, of the state machine in a state and for an input symbol, to transition to a subsequent state based on the directed transitions of the state machine, characterized in that one of the symbol patterns includes a wildcard symbol, and mappings for one or more states representing pattern symbols including the wildcard symbol are based on an input symbol to be received, by the pattern matching machine in use, to constitute the wildcard.
Opening claim text (preview).
The invention claimed is: 1. A computer implemented method for generating a pattern matching machine for identifying matches of a plurality of symbol patterns in a sequence of input symbols, the method comprising: providing a state machine of states and directed transitions between states corresponding to the plurality of patterns; and applying an Aho-Corasick approach to identify one or more mappings between states in the event of a failure, of the state machine in a state and for an input symbol, to transition to a subsequent state based on the directed transitions of the state machine, wherein one of the symbol patterns includes a wildcard symbol, and wherein at least one of the one or more mappings between states in the event of a failure identifies as its respective subsequent state a state representing pattern symbols including the wildcard symbol based on an input symbol to be received, by the pattern matching machine in use, to constitute the wildcard symbol, the value of the wildcard symbol being not determinate when the one or more mappings is/are generated, the one or more mappings being identified so as to not require literal identity between all symbols in the sequence of input symbols and symbols corresponding to transitions between states leading to a current state. 2. The method of claim 1 , wherein: each mapping is a failure state mapping; states from which transitions corresponding to wildcard symbols originate are ineligible for the identification of one or more mappings in the event of a failure; each said failure state mapping includes a condition based on symbols in a received input symbol sequence, wherein the condition being satisfied identifies the failure state mapping appropriate for that received input symbol sequence provided that that failure state mapping exists; and each state has output that is based on the one or more failure state mappings and that where appropriate identifies one or more failure states based on symbols in the input symbol sequence constituting wildcard symbols in the symbol pattern. 3. The method of claim 1 , wherein each mapping is a failure state mapping; and further comprising omitting failure state mappings for states determined to be exempt from failure state mapping. 4. The method of claim 1 , wherein the one or more mappings are supplemented with logic to account for potentially variable values in the input symbols to be received by the pattern matching machine in use. 5. A computer implemented method for generating a pattern matching component for identifying matches of a plurality of symbol patterns in a sequence of input symbols, the method comprising: providing a state machine of states and directed transitions between states corresponding to the plurality of patterns such that the state machine includes at least a starting state and additional states, each additional state representing a sequence of pattern symbols for one or more patterns; and providing, for each state, a failure function for mapping each of one or more states, as a mapped state, to a new state in the state machine, the new state being identified as a state representing a pattern symbol sequence corresponding to a longest proper suffix of the pattern symbol sequence of the mapped state, and the failure function being operable in response to a determination that all directed transitions from the mapped state are not occasioned by an input character, wherein the new state is identified as the starting state in the absence of a state representing a pattern symbol sequence corresponding to a proper suffix of the pattern symbol sequence of the mapped state; and wherein at least one pattern includes at least one wildcard symbol, the value of the at least one wildcard symbol being not determinate when the one or more mappings is/are generated, the failure function for each state representing a pattern symbol sequence including one or more wildcard symbols maps to one or more new states based on one or more input symbols to be received, by the pattern matching machine in use, to constitute each of the one or more wildcard symbols; and wherein the failure function for each state is provided without requiring literal identity between all symbols in the sequence of input symbols and symbols corresponding to transitions between states leading to a current state. 6. The method of claim 5 further comprising providing an output function for selected states in the state machine, the selected states being states representing a pattern symbol sequence corresponding to one or more matched patterns. 7. The method of claim 5 wherein the pattern matching machine is generated as a software component for execution in a computer system. 8. The method of claim 5 , wherein: states from which transitions corresponding to wildcard symbols originate are ineligible for having failure functions provided therefor; each said failure functions includes a condition based on symbols in a received input symbol sequence, wherein the condition being satisfied identifies the failure function appropriate for that received input symbol sequence provided that that failure function exists; and each state has output that is based on the failure functions and that where appropriate identifies one or more failure functions based on symbols in the input symbol sequence constituting wildcard symbols in the symbol pattern. 9. The method of claim 5 , further comprising: determining whether states are exempt from failure state mapping; and omitting the provision of failure functions for those states determined to be exempt from failure state mapping. 10. A computer implemented method for pattern matching a plurality of symbol patterns in a sequence of input symbols, where at least one symbol pattern includes a wildcard symbol, the method comprising: receiving a pattern matching machine for the plurality of patterns, the pattern matching machine being generated by: (a) providing a state machine of states and directed transitions between states corresponding to the plurality of patterns; and (b) applying an Aho-Corasick approach to identify one or more mappings between states in the event of a failure, of the state machine in a state and for an input symbol, to transition to a subsequent state based on the directed transitions of the state machine, wherein at least one of the one or more mappings between states in the event of a failure identifies as its respective subsequent state a state representing pattern symbols including the wildcard symbol based on an input symbol to be received, by the pattern matching machine in use, to constitute the wildcard symbol, the value of the wildcard symbol being not determinate when the one or more mappings is/are generated, the one or more mappings being identified so as to not require literal identity between all symbols in the sequence of input symbols and symbols corresponding to transitions between states leading to a current state; and executing the pattern matching machine for symbols in the sequence of input symbols so as to trigger an output function for sub-sequences of symbols in the sequence of input symbols matching one or more symbol patterns. 11. A pattern matching machine generator for generating a pattern matching machine for identifying matches of a plurality of symbol patterns in a sequence of input symbols, the pattern matching machine generator comprising: a goto function generator component configured to provide a state machine of states and directed transitions between states corresponding to the plurality of patterns; and a failure function generator component configured to apply an Aho-Corasick approach to identify one or more mappings betwee
Selecting, i.e. obtaining data of one kind from those record carriers which are identifiable by data of a second kind from a mass of ordered or randomly- distributed record carriers · CPC title
by using string matching techniques · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.