Facilitating custom content extraction rule configuration for remote capture agents

US11115505B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11115505-B2
Application number  US-201916404644-A
Country             US
Kind code           B2
Filing date         May 6, 2019
Priority date       Jan 29, 2015
Publication date    Sep 7, 2021
Grant date          Sep 7, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed embodiments provide a system for extracting custom content from network packets. During operation, the system receives a stream of packets. The system then parses packets in the stream to determine a protocol for each packet. Next, the system applies a custom-content-extraction rule to each packet associated with a target protocol to obtain the extracted content. Then, the system stores the extracted content in events in a data store to facilitate subsequent queries involving the extracted content.
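The four steps in the abstract (receive packets, determine each packet's protocol, apply an extraction rule to packets of the target protocol, store the result in events) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the patent: the `Packet` class, the prefix-based `detect_protocol` heuristic, the use of a regular expression as the extraction rule, and the in-memory list standing in for the data store.

```python
import re
import time
from dataclasses import dataclass


@dataclass
class Packet:
    """Hypothetical packet wrapper holding raw payload bytes."""
    payload: bytes


def detect_protocol(packet):
    """Toy protocol detection based on payload prefixes (an assumption;
    a real capture agent would parse the packet structure)."""
    if packet.payload.startswith((b"GET ", b"POST ", b"HTTP/")):
        return "http"
    return "unknown"


def extract(packet, pattern):
    """Apply a regex-based extraction rule; return group 1 or None."""
    match = re.search(pattern, packet.payload.decode("latin-1"))
    return match.group(1) if match else None


def process_stream(packets, target_protocol, pattern, field_name, store):
    """Apply the rule to packets of the target protocol and store
    each extracted value in a timestamped event."""
    for packet in packets:
        if detect_protocol(packet) != target_protocol:
            continue  # rule only applies to the target protocol
        value = extract(packet, pattern)
        if value is not None:
            store.append({
                "timestamp": time.time(),
                "protocol": target_protocol,
                field_name: value,
            })


store = []  # stand-in for the data store
packets = [
    Packet(b"GET /cart?session=abc123 HTTP/1.1\r\nHost: example.com\r\n\r\n"),
    Packet(b"\x16\x03\x01 binary handshake bytes"),
]
process_stream(packets, "http", r"session=(\w+)", "session_id", store)
```

In this sketch only the first packet is recognized as the target protocol, so the store ends up with a single event carrying a `session_id` field that later queries could filter on.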

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: receiving, via a graphical user interface (GUI), input defining a custom content extraction rule, wherein the input specifies: a source field in network packets to be monitored by a remote capture agent, wherein the source field contains structured data, an extraction rule to be used to extract data from the structured data to obtain extracted data, and a field name to be used to identify the extracted data in timestamped events to be generated by the remote capture agent; generating configuration information based on the input; sending the configuration information to the remote capture agent, wherein the configuration information causes the remote capture agent to generate timestamped events, wherein the timestamped events include extracted data obtained by applying the custom content extraction rule to network packets monitored by the remote capture agent, and wherein the extracted data is identified in the timestamped events using the field name; receiving the timestamped events from the remote capture agent, wherein each of the timestamped events includes extracted data identified by the field name; and storing the timestamped events in a data store, wherein storage of the timestamped events in the data store enables execution of queries based on the field name.

2. The computer-implemented method of claim 1, wherein the remote capture agent generates a timestamped event of the timestamped events at least in part by: parsing a network packet of the network packets to identify a structure of the network packet, wherein the structure of the network packet is used to determine a protocol associated with the network packets; applying the extraction rule to the network packet to obtain extracted content, wherein applying the extraction rule includes: identifying the source field in the network packet containing the structured data from which the extracted content is to be obtained, and extracting data from the structured data contained in the source field of the network packet; generating a timestamped event including a field storing the extracted content; and sending the timestamped event including the extracted content to another component on a computer network for storage in a data store, the data store facilitating querying of timestamped event data stored in the data store using late-binding schemas generated from received queries.

3. The computer-implemented method of claim 1, wherein the method further comprises: storing the timestamped events in a data store; receiving a query to be applied to the timestamped events stored in the data store; retrieving timestamped events from the data store satisfying the query; using a late-binding schema generated from the query to retrieve data values from the retrieved timestamped events; and processing the query using the retrieved data values.

4. The computer-implemented method of claim 1, wherein the input further specifies a protocol to be associated with the custom content extraction rule.

5. The computer-implemented method of claim 1, wherein the input further specifies a field-specific regular expression to be applied to the source field in the network packets, and wherein applying the custom content extraction rule to the network packets includes applying the field-specific regular expression to the source field in the network packets.

6. The computer-implemented method of claim 1, wherein the structured data includes eXtensible Markup Language (XML) formatted data, and wherein applying the custom content extraction rule includes extracting data from the XML-formatted data.

7. The computer-implemented method of claim 1, wherein the structured data includes JavaScript Object Notation (JSON) formatted data, and wherein applying the custom content extraction rule includes extracting data from the JSON-formatted data.

8. The computer-implemented method of claim 1, wherein the custom content extraction rule is associated with a protocol, and wherein the remote capture agent uses a deep-packet inspection engine to determine that the network packets are associated with the protocol.

9. The computer-implemented method of claim 1, wherein the input further specifies an extraction rule type that identifies a type of extraction rule to be used to obtain the extracted data.

10. The computer-implemented method of claim 1, wherein the configuration information causes the remote capture agent to send the timestamped events to another component for storage in a data store.

11. A non-transitory computer-readable storage medium storing instructions which, when executed by one or more processors, cause performance of operations comprising: receiving, via a graphical user interface (GUI), input defining a custom content extraction rule, wherein the input specifies: a source field in network packets to be monitored by a remote capture agent, the source field containing structured data, second input specifying an extraction rule to be used to extract data from the structured data to obtain extracted data, and a field name to be used to identify the extracted data in timestamped events to be generated by the remote capture agent; generating configuration information based on the input; sending the configuration information to the remote capture agent, wherein the configuration information causes the remote capture agent to generate timestamped events, wherein the timestamped events include extracted data obtained by applying the custom content extraction rule to network packets monitored by the remote capture agent, and wherein the extracted data is identified in the timestamped events using the field name receiving the timestamped events from the remote capture agent, wherein each of the timestamped events includes extracted data identified by the field name; and storing the timestamped events in a data store, wherein storage of the timestamped events in the data store enables execution of queries based on the field name.

12. The non-transitory computer-readable storage medium of claim 11, wherein the remote capture agent generates the timestamped events at least in part by: parsing a network packet of the network packets to identify a structure of the network packet, wherein the structure of the network packet is used to determine a protocol associated with the network packet; applying the extraction rule to the network packet to obtain extracted content, wherein applying the extraction rule includes: identifying the source field in the network packet containing the structured data from which the extracted content is to be obtained, and extracting data from the structured data contained in the source field of the network packet; generating a timestamped event including a field storing the extracted content; and sending the timestamped event including the extracted content to another component on a computer network for storage in a data store, the data store facilitating querying of timestamped event data stored in the data store using late-binding schemas generated from received queries.

13. The non-transitory computer-readable storage medium of claim 11, wherein the instructions, when executed by the one or more processors, further cause performance of operations comprising: storing the timestamped events in a data store; receiving a query to be applied to the timestamped events stored in the data store; retrieving timestamped events from the data store satisfying the query; using a late-binding schema generated from the query to retrieve data values from the retrieved timestamped events; and proce
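
The flow in claim 1 (user input naming a source field, an extraction rule, and a field name; configuration sent to the agent; timestamped events keyed by the field name; queries on that field) can be sketched roughly as follows. This is a hedged illustration, not the patented implementation: the JSON configuration layout, the dotted-path and ElementTree-path rule syntax, and the function names `build_config`, `apply_rule`, `make_event`, and `query` are all hypothetical.

```python
import json
import time
import xml.etree.ElementTree as ET


def build_config(source_field, rule_type, rule, field_name):
    """Serialize GUI-style input into configuration text for an agent
    (the JSON layout is an assumption made for this sketch)."""
    return json.dumps({
        "source_field": source_field,
        "rule_type": rule_type,   # "json" or "xml" in this sketch
        "rule": rule,
        "field_name": field_name,
    })


def apply_rule(config, packet_fields):
    """Extract a value from the structured data in the source field."""
    raw = packet_fields.get(config["source_field"])
    if raw is None:
        return None
    if config["rule_type"] == "json":
        value = json.loads(raw)
        try:
            for key in config["rule"].split("."):  # dotted path, e.g. "order.id"
                value = value[key]
        except (KeyError, TypeError):
            return None
        return value
    if config["rule_type"] == "xml":
        node = ET.fromstring(raw).find(config["rule"])  # ElementTree path
        return node.text if node is not None else None
    raise ValueError("unsupported rule type: " + config["rule_type"])


def make_event(config_text, packet_fields, now=None):
    """Agent side: build a timestamped event keyed by the configured field name."""
    config = json.loads(config_text)
    value = apply_rule(config, packet_fields)
    if value is None:
        return None
    timestamp = now if now is not None else time.time()
    return {"_time": timestamp, config["field_name"]: value}


def query(store, field_name, wanted):
    """Data-store side: select events by the user-chosen field name."""
    return [event for event in store if event.get(field_name) == wanted]


json_config = build_config("http_body", "json", "order.id", "order_id")
xml_config = build_config("http_body", "xml", "./order/id", "order_id")

store = [
    make_event(json_config, {"http_body": '{"order": {"id": "A-17"}}'}, now=1000.0),
    make_event(xml_config,
               {"http_body": "<root><order><id>B-9</id></order></root>"},
               now=1001.0),
]
matches = query(store, "order_id", "B-9")
```

Both the JSON and the XML rule write their result under the same `order_id` field name, which is what lets a single query match events regardless of which structured format the source field carried.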

Assignees

Splunk Inc

Inventors

Classifications

  • by filtering

  • Network utilisation, e.g. volume of load or congestion level

  • H04L69/22 (Primary): Parsing or analysis of headers

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11115505B2 cover?
The disclosed embodiments provide a system for extracting custom content from network packets. During operation, the system receives a stream of packets. The system then parses packets in the stream to determine a protocol for each packet. Next, the system applies a custom-content-extraction rule to each packet associated with a target protocol to obtain the extracted content. Then, the system …
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification H04L69/22. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Sep 7, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).