Deep packet inspection (DPI) of network packets for keywords of a vocabulary

US9680797B2 · US · B2

Patent metadata
Field | Value
Publication number | US-9680797-B2
Application number | US-201414288414-A
Country | US
Kind code | B2
Filing date | May 28, 2014
Priority date | May 28, 2014
Publication date | Jun 13, 2017
Grant date | Jun 13, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An aspect of the present disclosure provides deep packet inspection (DPI) of network packets for keywords of a vocabulary. In one embodiment, a mapping specifying association of respective keywords to corresponding unique pattern codes is maintained, with each pattern code being shorter in length compared to the corresponding keyword and being computed based on a formula. Upon receiving a network packet, a token (containing a sequence of characters) present in the network packet is first identified and the formula then applied to the identified token to generate a token code. The token is determined to match a specific keyword when the token code equals the pattern code corresponding to the specific keyword in the mapping.
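
The lookup described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the abstract does not name the formula, so the sketch assumes the positional base-128 formula recited in claim 7, and the names `positional_code`, `build_mapping`, and `match_token`, as well as the sample keywords, are hypothetical. The mapping is stored inverted (pattern code to keyword) so that a token code can be looked up directly.

```python
def positional_code(text: str) -> int:
    """Weighted sum of ASCII values: the character at position x
    contributes ascii(char) * 128**x (assumed formula, per claim 7)."""
    # encode("ascii") raises on non-ASCII input, which keeps every
    # character value below 128 so that codes stay unambiguous.
    return sum(byte * 128 ** x for x, byte in enumerate(text.encode("ascii")))


def build_mapping(keywords):
    """Return {pattern_code: keyword}, refusing duplicate codes."""
    mapping = {}
    for keyword in keywords:
        code = positional_code(keyword)
        if code in mapping:
            raise ValueError(f"pattern code collision for {keyword!r}")
        mapping[code] = keyword
    return mapping


def match_token(token: str, mapping):
    """Return the matching keyword, or None if the token code is unknown."""
    return mapping.get(positional_code(token))


# Illustrative vocabulary only; the patent page does not list keywords.
mapping = build_mapping(["Content-Type", "boundary"])
print(match_token("boundary", mapping))   # boundary
print(match_token("boundarx", mapping))   # None
```

Under this assumption each character contributes 7 bits to the code instead of the 8 bits it occupies as a stored character, which is one way to read the statement that a pattern code is shorter than its keyword.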

First claim

Opening claim text (preview).

What is claimed is:

1. A method of processing network packets, said method comprising: maintaining a mapping specifying association of each of a plurality of keywords with a corresponding pattern code of a plurality of unique pattern codes, each pattern code being shorter in length compared to the corresponding keyword and being computed based on a formula; receiving a network packet; identifying a first token present in said network packet, said first token containing a first sequence of characters; applying said formula to said first token to generate a first token code; and determining that said first token matches a first keyword when said first token code equals a first pattern code corresponding to said first keyword in said mapping, wherein each pattern code, including said first pattern code, and said first token code are computed as respective numerical values based on said formula, wherein a match is determined based on the numerical values of the first token code and the first pattern code being equal.

2. The method of claim 1, wherein said formula is a positional formula which computes a numerical value based on a value of each character in a sequence of characters and a corresponding weight assigned to the position of the character in said sequence of characters, wherein each keyword and each token is in the form of a corresponding sequence of characters, wherein the pattern code of each keyword is computed as said numerical value by applying said positional formula to the sequence of characters constituting the keyword and the token code of each token is computed as said numerical value by applying said positional formula to the sequence of characters constituting the token.

3. The method of claim 2, wherein each character is represented by a first number of bits, wherein a first keyword of said plurality of keywords is represented by a corresponding number of bits equal to the product of the first number of bits and the number of characters in the corresponding sequence of characters constituting said first keyword, wherein each of said plurality of pattern codes is represented by a second number of bits, wherein said second number of bits is less than the corresponding number of bits.

4. The method of claim 2, wherein said identifying, said applying and said determining together comprises: forming successive tokens, with each successive token containing an additional character compared to a prior token; and for each of said successive tokens, checking whether or not each successive token equals one of said plurality of keywords.

5. The method of claim 4, wherein said maintaining also maintains a lowest value and a highest value among said plurality of unique pattern codes, wherein said checking for each successive token comprises: calculating a corresponding token code for the successive token; and if the corresponding token code is between said lowest value and said highest value, comparing the corresponding token code as said first token code with at least some of said plurality of unique pattern codes of said mapping, and otherwise excluding the corresponding token code from said comparing.

6. The method of claim 5, wherein said maintaining also maintains a list of lengths of said plurality of keywords, wherein said checking for each successive token further comprises: identifying whether the length of the successive token is contained in said list of lengths, wherein said comparing is performed with the corresponding token code of the successive token as said first token code, in response to said length of the successive token being identified as being contained in said list of lengths.

7. The method of claim 2, wherein said positional formula is:

$$C = \sum_{x=0}^{n} \mathrm{ascii}(C_x) \times 128^{x}$$

wherein C_x represents the character at position x in a first sequence of characters, n is equal to the number of characters in said first sequence of characters, ascii(k) is a function that returns the ASCII value of the character k, C is the corresponding pattern code when said first sequence of characters represents a keyword, and is the corresponding token code when said first sequence of characters represents a token.

8. The method of claim 7, wherein said plurality of keywords are specified according to a MIME (Multipurpose Internet Mail Extensions) based protocol.

9. A non-transitory machine readable medium storing one or more sequences of instructions for enabling a system to process network packets, wherein execution of said one or more instructions by one or more processors contained in said system causes said system to perform the actions of: maintaining a mapping specifying association of each of a plurality of keywords with a corresponding pattern code of a plurality of unique pattern codes, each pattern code being shorter in length compared to the corresponding keyword and being computed based on a formula; receiving a network packet; identifying a first token present in said network packet, said first token containing a first sequence of characters; applying said formula to said first token to generate a first token code; and determining that said first token matches a first keyword when said first token code equals a first pattern code corresponding to said first keyword in said mapping, wherein each pattern code, including said first pattern code, and said first token code are computed as respective numerical values based on said formula, wherein a match is determined based on the numerical values of the first token code and the first pattern code being equal.

10. The machine readable medium of claim 9, wherein said formula is a positional formula which computes a numerical value based on a value of each character in a sequence of characters and a corresponding weight assigned to the position of the character in said sequence of characters, wherein each keyword and each token is in the form of a corresponding sequence of characters, wherein the pattern code of each keyword is computed as said numerical value by applying said positional formula to the sequence of characters constituting the keyword and the token code of each token is computed as said numerical value by applying said positional formula to the sequence of characters constituting the token.

11. The machine readable medium of claim 10, wherein each character is represented by a
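
The successive-token check of claims 4 to 6 can be sketched as follows, using the positional formula of claim 7. This is an illustration under stated assumptions rather than the patented implementation: the names `Vocabulary`, `build_vocabulary`, and `match_at` are hypothetical, the sample keywords are illustrative, and positions are taken as 0 to n-1 for a sequence of n characters. The sketch also stops at any byte above 127 and confirms the keyword length on a hit, two safeguards that are not recited in the claims but keep equal codes from arising out of non-ASCII or NUL bytes.

```python
from dataclasses import dataclass


@dataclass
class Vocabulary:
    codes: dict    # pattern code -> keyword (inverted form of the mapping)
    lengths: set   # keyword lengths (claim 6)
    lowest: int    # smallest pattern code (claim 5)
    highest: int   # largest pattern code (claim 5)


def positional_code(data: bytes) -> int:
    """Sum of byte * 128**x over positions x = 0 .. len(data) - 1."""
    return sum(byte * 128 ** x for x, byte in enumerate(data))


def build_vocabulary(keywords) -> Vocabulary:
    codes = {positional_code(k.encode("ascii")): k for k in keywords}
    return Vocabulary(codes, {len(k) for k in keywords}, min(codes), max(codes))


def match_at(payload: bytes, start: int, vocab: Vocabulary) -> list:
    """Grow a token one byte at a time from `start` and collect keyword hits."""
    matches = []
    code, weight = 0, 1
    longest = max(vocab.lengths)
    for length, byte in enumerate(payload[start:start + longest], start=1):
        if byte > 127:
            break  # assumption: only 7-bit ASCII can form a keyword
        # Appending one character adds a single term to the positional sum,
        # so the code is updated incrementally instead of recomputed.
        code += byte * weight
        weight *= 128
        if length not in vocab.lengths:
            continue  # claim 6: no keyword has this length
        if not (vocab.lowest <= code <= vocab.highest):
            continue  # claim 5: outside the range of pattern codes
        keyword = vocab.codes.get(code)
        if keyword is not None and len(keyword) == length:
            matches.append(keyword)
    return matches


vocab = build_vocabulary(["boundary", "charset", "name"])
print(match_at(b"charset=utf-8", 0, vocab))  # ['charset']
```

The length and range checks are cheap filters: most successive tokens are rejected before any lookup in the mapping takes place.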

Assignees

Oracle Int Corp

Inventors

Classifications

  • Filtering by information in the payload · CPC title

  • Configuration setting · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9680797B2 cover?
An aspect of the present disclosure provides deep packet inspection (DPI) of network packets for keywords of a vocabulary. In one embodiment, a mapping specifying association of respective keywords to corresponding unique pattern codes is maintained, with each pattern code being shorter in length compared to the corresponding keyword and being computed based on a formula. Upon receiving a netwo…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification H04L63/0245. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Jun 13, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).