Providing extraction results for a particular field

US2021174009A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2021174009-A1
Application number: US-202117169254-A
Country: US
Kind code: A1
Filing date: Feb 5, 2021
Priority date: Sep 7, 2012
Publication date: Jun 10, 2021
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The technology disclosed relates to formulating and refining field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets not organized into relational structures that have not been processed by standard extraction or transformation methods. By using sample events, a focus on primary and secondary example events help formulate either a single extraction rule spanning multiple data formats, or multiple rules directed to distinct formats. Selection tools mark up the example events to indicate positive examples for the extraction rules, and to identify negative examples to avoid mistaken value selection. The extraction rules can be saved for query-time use, and can be incorporated into a data model for sets and subsets of event data.
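To make the idea of a query-time extraction rule concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method disclosed in the patent: the sample events, the regex-based rule format, and the names `RAW_EVENTS`, `STATUS_RULE`, `extract_field`, and `rule_is_consistent` are all hypothetical. The events stay stored as raw text, and the rule is applied only when a query runs, which is the essence of a late-binding schema. The second function checks a candidate rule against positive examples (values it should select) and negative examples (values it must not select).

```python
import re

# Hypothetical raw events, stored unparsed (no schema applied when stored).
RAW_EVENTS = [
    "2012-09-07 10:01:02 status=200 user=alice",
    "2012-09-07 10:01:05 status=404 user=bob",
    "2012-09-07 10:01:09 malformed line",
]

# Hypothetical extraction rule: a field name plus a regex with one capture group.
STATUS_RULE = ("status", re.compile(r"status=(\d{3})"))


def extract_field(rule, events):
    """Apply the rule at query time; events that do not match yield None."""
    name, pattern = rule
    rows = []
    for raw in events:
        match = pattern.search(raw)
        rows.append({"_raw": raw, name: match.group(1) if match else None})
    return rows


def rule_is_consistent(rule, positives, negatives):
    """Check a rule against marked-up examples.

    positives: (event, value) pairs the rule should extract.
    negatives: (event, value) pairs the rule must not extract.
    """
    _, pattern = rule

    def value_of(raw):
        match = pattern.search(raw)
        return match.group(1) if match else None

    return (all(value_of(raw) == v for raw, v in positives)
            and all(value_of(raw) != v for raw, v in negatives))


rows = extract_field(STATUS_RULE, RAW_EVENTS)
```

Because the rule is data rather than part of the stored events, it can be refined and re-run against the same raw text without re-ingesting anything.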

First claim

Opening claim text (preview).

1. A computer-implemented method comprising: identifying a field associated with a set of events; identifying a set of unique values corresponding with the field in the set of events; determining a count of a number of times each unique value corresponds with the field in the set of events; causing concurrent display of the set of unique values corresponding with the field in the set of events and the counts of the number of times the unique value corresponds with the field in the set of events.
2. The method of claim 1, wherein the field is a field for which values are extracted from the set of events.
3. The method of claim 1, wherein each event in the set of events includes a portion of raw data.
4. The method of claim 1, wherein the set of events comprise a sample set of events selected by a user.
5. The method of claim 1, wherein the count comprises the number of times each unique value was extracted from the field in the set of events, the set of events having a plurality of fields.
6. The method of claim 1 further comprising: determining a percent of the number of times each unique value corresponds with the field in the set of events relative to a total number of events in the set of events; and causing concurrent display of the percent with the set of unique values and the counts.
7. The method of claim 1 further comprising: determining a percent of the number of times each unique value corresponds with the field in the set of events relative to a total number of events in the set of events; and causing concurrent display of a graphical illustration representing the percent along with the set of unique values and the counts.
8. The method of claim 1, wherein the displayed set of unique values are presented in rows in a first column and the counts are displayed in a second column in a corresponding row.
9. The method of claim 1, wherein the concurrent display includes row controls to populate a filter control with a selected value.
10. The method of claim 1, wherein each unique value in the concurrent display includes a link to populate a filter control with the corresponding unique value.
11. The method of claim 1 further comprising: identifying a selection of a link of a displayed unique value; and causing presentation of the set of events with a key-value filter set in a filter control that corresponds with the selected displayed unique value.
12. The method of claim 1 further comprising determining a coverage within a source type, the coverage comprising a percent of tokens within events of the source type that have extracted values.
13. The method of claim 1 further comprising determining a confidence rating within a source type, the confidence rating indicating insight into an estimated success of existing extractions.
14. A system comprising: one or more data processors; and one or more computer-readable storage media containing instructions which when executed on the one or more data processors, cause the one or more processors to perform operations including: identifying a field associated with a set of events; identifying a set of unique values corresponding with the field in the set of events; determining a count of a number of times each unique value corresponds with the field in the set of events; causing concurrent display of the set of unique values corresponding with the field in the set of events and the counts of the number of times the unique value corresponds with the field in the set of events.
15. The system of claim 14, wherein the count comprises the number of times each unique value was extracted from the field in the set of events, the set of events having a plurality of fields.
16. The system of claim 14 further comprising: determining a percent of the number of times each unique value corresponds with the field in the set of events relative to a total number of events in the set of events; and causing concurrent display of the percent with the set of unique values and the counts.
17. The system of claim 14 further comprising: determining a percent of the number of times each unique value corresponds with the field in the set of events relative to a total number of events in the set of events; and causing concurrent display of a graphical illustration representing the percent along with the set of unique values and the counts.
18. One or more computer-storage media storing computer-executable instructions that, when executed by a computing device, perform a method for generating an extraction rule, the method comprising: identifying a field associated with a set of events; identifying a set of unique values corresponding with the field in the set of events; determining a count of a number of times each unique value corresponds with the field in the set of events; causing concurrent display of the set of unique values corresponding with the field in the set of events and the counts of the number of times the unique value corresponds with the field in the set of events.
19. The one or more computer-storage media of claim 18 further comprising: identifying a selection of a link of a displayed unique value; and causing presentation of the set of events with a key-value filter set in a filter control that corresponds with the selected displayed unique value.
20. The one or more computer-storage media of claim 18 further comprising: determining a coverage within a source type, the coverage comprising a percent of tokens within events of the source type that have extracted values.
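The counting steps recited in claims 1, 6, 8, and 11 can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the patented implementation: events are assumed to be dictionaries of already-extracted fields, and the names `EVENTS`, `summarize_field`, `format_table`, and `filter_events` are hypothetical. The sketch counts each unique value of a field, computes its share of all events in the set, lays the values and counts out in adjacent columns, and applies a key-value filter for a selected value.

```python
from collections import Counter

# Hypothetical events with fields already extracted; one event lacks "status".
EVENTS = [
    {"status": "200"},
    {"status": "200"},
    {"status": "404"},
    {"user": "alice"},
]


def summarize_field(events, field):
    """Return (value, count, percent of all events) rows, most frequent first."""
    total = len(events)
    counts = Counter(e[field] for e in events if e.get(field) is not None)
    return [(value, n, 100.0 * n / total) for value, n in counts.most_common()]


def format_table(rows):
    """Render values in a first column with counts and percents alongside."""
    lines = [f"{'value':<10}{'count':>6}{'percent':>9}"]
    for value, n, pct in rows:
        lines.append(f"{value:<10}{n:>6}{pct:>8.1f}%")
    return "\n".join(lines)


def filter_events(events, field, value):
    """Key-value filter: keep only events whose field equals the chosen value."""
    return [e for e in events if e.get(field) == value]


summary = summarize_field(EVENTS, "status")
table = format_table(summary)
selected = filter_events(EVENTS, "status", "404")
```

Here the percent is taken relative to the total number of events in the set, following the wording of claim 6, so events that lack the field still count toward the denominator.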

Assignees

Splunk Inc

Inventors

Classifications

  • Temporal data queries · CPC title

  • Recognition of textual entities · CPC title

  • G06F40/174 · Primary

    Form filling; Merging · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021174009A1 cover?
The technology disclosed relates to formulating and refining field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets not organized into relational structures that have not bee…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2477. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 10, 2021 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).