Facilitating modification of an extracted field

US10783318B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10783318-B2
Application number: US-201715417430-A
Country: US
Kind code: B2
Filing date: Jan 27, 2017
Priority date: Sep 7, 2012
Publication date: Sep 22, 2020
Grant date: Sep 22, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The technology disclosed relates to formulating and refining field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets not organized into relational structures that have not been processed by standard extraction or transformation methods. By using sample events, a focus on primary and secondary example events help formulate either a single extraction rule spanning multiple data formats, or multiple rules directed to distinct formats. Selection tools mark up the example events to indicate positive examples for the extraction rules, and to identify negative examples to avoid mistaken value selection. The extraction rules can be saved for query-time use, and can be incorporated into a data model for sets and subsets of event data.
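
As a minimal sketch of the late-binding idea described in the abstract, the following Python applies saved extraction rules to raw event text only when a query runs. The rule format (a field name mapped to a regular expression with one capture group), the sample events, and the names `RULES`, `EVENTS`, `extract_fields`, and `search` are assumptions made for illustration; the patent text shown here does not specify them.

```python
import re

# Hypothetical saved extraction rules: field name -> regex with one capture group.
RULES = {
    "status": re.compile(r"status=(\d{3})"),
    "user": re.compile(r"user=(\w+)"),
}

# Raw events are stored as-is; no schema is applied when they are ingested.
EVENTS = [
    "2017-01-27T10:00:00 user=alice status=200 path=/home",
    "2017-01-27T10:00:05 user=bob status=404 path=/missing",
    "2017-01-27T10:00:09 malformed line without fields",
]


def extract_fields(raw, rules):
    """Apply every rule to one raw event and return the fields that matched."""
    fields = {}
    for name, pattern in rules.items():
        match = pattern.search(raw)
        if match:
            fields[name] = match.group(1)
    return fields


def search(events, rules, **criteria):
    """Bind fields at query time and keep events whose fields equal the criteria."""
    hits = []
    for raw in events:
        fields = extract_fields(raw, rules)
        if all(fields.get(key) == value for key, value in criteria.items()):
            hits.append((raw, fields))
    return hits


results = search(EVENTS, RULES, status="404")
```

Because the rules are applied at search time, changing or adding a rule in `RULES` affects the next query without reprocessing the stored events.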

First claim

Opening claim text (preview).

The invention claimed is:

1. A method, comprising: receiving a first selection associated with an event of a plurality of events, wherein each event in the plurality of events includes a portion of raw data, and wherein the first selection is of a portion of text within the raw data of the event to be extracted as a value of a field; automatically determining an extraction rule that extracts the selected portion of text as the value of the field; and causing display of an interface to allow user modification of a representation of the value.

2. The method of claim 1, wherein the user modification includes a concatenation of the value with additional text within the raw data of the event.

3. The method of claim 1, wherein the user modification includes a concatenation of the value with additional text within the raw data of the event, further comprising: receiving a second selection of the additional text; and updating the extraction rule to combine the value with the additional text.

4. The method of claim 1, wherein the user modification includes a second selection of a sub-portion of the portion of text, the sub-portion to be trimmed from the value.

5. The method of claim 1, wherein the user modification includes a second selection of a sub-portion of the portion of text, further comprising updating the extraction rule to trim the sub-portion from the value.

6. The method of claim 1, wherein the user modification includes a second selection of a sub-portion of the portion of text, the method further comprising updating the extraction rule to extract the sub-portion as the value of the field.

7. The method of claim 1, further comprising: receiving a second selection of a sub-portion of the portion of text; automatically determining a secondary extraction rule to extract the sub-portion; and updating the extraction rule to include the secondary extraction rule as the value of the field.

8. The method of claim 1, wherein the modified representation of the value is associated with a second field.

9. The method of claim 1, wherein each of the plurality of events is associated with a time stamp.

10. The method of claim 1, further comprising: receiving a second selection of a sampling strategy; and causing display of an annotated version of the plurality of events in response to the second selection.

11. The method of claim 1, further comprising: receiving a second selection of events that match the extraction rule; sampling according to the second selection; and causing display of an annotated version of the events that match based on the second selection.

12. The method of claim 1, further comprising: receiving a second selection of one or more examples of text that should not be extracted; and automatically determining an updated extraction rule that does not extract the text that should not be extracted.

13. The method of claim 1, further comprising: in response to a selection of a selected field, causing display of a frequency table of values of the selected field extracted from a sample of the plurality of events.

14. The method of claim 1, further comprising: receiving a second selection to save the extraction rule and a field name of the field for later use in processing events; and incorporating the saved extraction rule and field name in a data model that includes a late-binding schema of extraction rules applied at search time.

15. The method of claim 1, further comprising: receiving a keyword to apply as a filter; resampling according to the keyword; and determining events of the plurality of events to be displayed based on the applied keyword.

16. A computer-implemented system comprising: one or more processors; and one or more non-transitory computer-readable storage media having instructions stored thereon, which, when executed by the one or more processors, cause the computing system to: receive a first selection associated with an event of a plurality of events, wherein each event in the plurality of events includes a portion of raw data, and wherein the first selection is of a portion of text within the raw data of the event to be extracted as a value of a field; automatically determine an extraction rule that extracts the selected portion of text as the value of the field; and cause display of an interface to allow user modification of a representation of the value.

17. The computer-implemented system of claim 16, wherein the user modification includes a concatenation of the value with additional text within the raw data of the event.

18. The computer-implemented system of claim 16, wherein the user modification includes a second selection of a sub-portion of the portion of text, the sub-portion to be trimmed from the value.

19. The computer-implemented system of claim 16, wherein the user modification includes a second selection of a sub-portion of the portion of text, further comprising updating the extraction rule to extract the sub-portion as the value of the field.

20. The computer-implemented system of claim 16, wherein the one or more non-transitory computer-readable storage media having instructions stored thereon, which, when executed by the one or more processors, cause the computing system to further: receive a second selection of events that match the extraction rule; sample according to the second selection; and cause display of an annotated version of the events that match based on the second selection.

21. The computer-implemented system of claim 16, wherein the one or more non-transitory computer-readable storage media having instructions stored thereon, which, when executed by the one or more processors, cause the computing system to further: receive a second selection of one or more examples of text that should not be extracted; and automatically determine an updated extraction rule that does not extract the text that should not be extracted.

22. The computer-implemented system of claim 16, wherein the one or more non-transitory computer-readable storage media having instructions stored thereon, which, when executed by the one or more processors, cause the computing system to further: receive a second selection to save the extraction rule and a field name of the field for later use in processing events; and incorporate the saved extraction rule and field name in a data model that includes a late-binding schema of extraction rules applied at search time.

23. The computer-implemented system of claim 16, wherein the one or more non-transitory computer-readable storage media having instructions stored thereon, which, when executed by the one or more processors, cause the computing system to further: receive a keyword to apply as a filter; resample according to the keyword; and determine events of the plurality of events to be displayed based on the applied keyword.

24. A tangible computer-readable memory having instructions stored in the memory that implement the actions including: receiving a first selection associated with an event of a plurality of events, wherein each event in the plurality of events includes a portion of raw data, and wherein the first selection is of a portion of text within the raw data of the event to be extracted as a value of a field; automatically determining an extraction rule that extracts the selected portion of text as the value of the field; and causing display of an interface to allow user mo
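
The steps recited in claims 1, 12, and 13 can be illustrated with a small Python sketch. It is not the patented method: the heuristic used here (anchor on the non-space text just before the selection, generalize the selected text to a character class, and fall back to a stricter candidate when a negative example is still extracted) is an assumption chosen for brevity, and every name in it (`value_class`, `candidate_rules`, `extract`, `determine_rule`, `frequency_table`) is hypothetical.

```python
import re
from collections import Counter


def value_class(text):
    """Generalize the selected text to a regex character class."""
    if text.isdigit():
        return r"\d+"
    if re.fullmatch(r"\w+", text):
        return r"\w+"
    return r"\S+"


def candidate_rules(event, start, end):
    """Yield candidate regexes, from most general to most specific."""
    # Anchor on the run of non-space characters immediately before the selection.
    prefix = re.search(r"\S*$", event[:start]).group(0)
    body = re.escape(prefix) + "(" + value_class(event[start:end]) + ")"
    yield body
    # Stricter variant: the anchor may not be preceded by a word character.
    yield r"(?<!\w)" + body


def extract(rule, event):
    """Return the value captured by the rule, or None when it does not match."""
    match = re.search(rule, event)
    return match.group(1) if match else None


def determine_rule(event, start, end, samples=(), negatives=()):
    """Pick the first candidate that extracts the selection and no negative example."""
    selected = event[start:end]
    for rule in candidate_rules(event, start, end):
        if extract(rule, event) != selected:
            continue
        if any(extract(rule, sample) in negatives for sample in samples):
            continue
        return rule
    raise ValueError("no candidate rule satisfies the examples")


def frequency_table(rule, samples):
    """Count the values the rule extracts from a sample of events."""
    values = (extract(rule, sample) for sample in samples)
    return Counter(value for value in values if value is not None)


example = "id=17 pid=42 action=login"
samples = [example, "pid=99 id=18 action=logout"]
start = example.index("17")
rule = determine_rule(example, start, start + 2, samples, negatives={"99"})
table = frequency_table(rule, samples)
```

Without the negative example, the most general candidate would match the `id=` inside `pid=99` in the second sample event; marking `99` as text that should not be extracted pushes the sketch to the stricter candidate.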

Assignees

Splunk Inc

Inventors

Classifications

  • Temporal data queries · CPC title

  • Browsing; Visualisation therefor (for navigating the web G06F16/954; browsing optimisation for the web G06F16/957) · CPC title

  • G06F40/166 · Primary

    Editing, e.g. inserting or deleting · CPC title

  • of tables; using ruled lines · CPC title

  • Annotation, e.g. comment data or footnotes · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10783318B2 cover?
The technology disclosed relates to formulating and refining field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets not organized into relational structures that have not bee…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2477. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 22, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).