Conversion of structured queries into unstructured queries for searching unstructured data store including timestamped raw machine data

US9916379B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9916379-B2
Application number: US-201715421429-A
Country: US
Kind code: B2
Filing date: Jan 31, 2017
Priority date: Jul 31, 2013
Publication date: Mar 13, 2018
Grant date: Mar 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Technologies are described herein for executing queries expressed with reference to a structured query language against unstructured data. A user issues a structured query through a traditional structured data management (“SDM”) application. Upon receiving the structured query, an SDM driver analyzes the structured query and extracts a data structure from the unstructured data, if necessary. The structured query is then converted to an unstructured query based on the extracted data structure. The converted unstructured query may then be executed against the unstructured data. Results from the query are reorganized into structured data utilizing the extracted data structure and are then presented to the user through the SDM application.
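The conversion step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the tiny SQL subset, the `convert_query` function name, and the pipeline-style target syntax (`search ... | fields ...`) are all assumptions chosen for illustration.

```python
import re

# Hypothetical, deliberately tiny SQL subset:
#   SELECT cols FROM source [WHERE col = 'value']
_SQL = re.compile(
    r"^\s*SELECT\s+(?P<cols>.+?)\s+FROM\s+(?P<src>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*=\s*'(?P<val>[^']*)')?\s*;?\s*$",
    re.IGNORECASE,
)


def convert_query(sql: str) -> str:
    """Translate one structured query into an illustrative search string."""
    m = _SQL.match(sql)
    if m is None:
        raise ValueError(f"unsupported query: {sql!r}")

    # The FROM target becomes the data source to search (assumed syntax).
    parts = [f'search source="{m["src"]}"']

    # An optional equality predicate becomes a field=value search term.
    if m["col"] is not None:
        parts[0] += f' {m["col"]}="{m["val"]}"'

    # The SELECT list becomes a projection stage, unless it is "*".
    cols = [c.strip() for c in m["cols"].split(",")]
    if cols != ["*"]:
        parts.append("fields " + ", ".join(cols))

    return " | ".join(parts)


print(convert_query("SELECT host, status FROM web_logs WHERE status = '404'"))
# prints: search source="web_logs" status="404" | fields host, status
```

A real converter would need a proper SQL parser and knowledge of the fields extracted from the unstructured data; the regular expression here only keeps the example short.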

First claim

Opening claim text (preview).

We claim:

1. A computer-implemented method comprising: segmenting unstructured raw machine data into a plurality of events; associating a timestamp with each event of the plurality of events; storing the plurality of events as unstructured data in an unstructured data store with associated timestamps, wherein the unstructured data in the unstructured data store includes the unstructured raw machine data that has been segmented and timestamped; receiving, at a query converter, a structured query in a structured query language from an application; generating, by the query converter, a second query in a second query language associated with the unstructured data store, based on the structured query; causing execution of the second query against the unstructured data stored in the unstructured data store; receiving a result of execution of the second query against the unstructured data stored in the unstructured data store; and causing an indication of the result to be provided to the application for output to a user, wherein the indication of the result is provided to the application as a direct response to the structured query, without requiring any additional query from the application.

2. The computer-implemented method of claim 1, further comprising: at the query converter, identifying a first set of fields in the unstructured data to obtain field identification data from the unstructured data, the unstructured data including text records, each of the fields in the first set of fields corresponding to a portion of text extracted from a portion of at least one of the text records; wherein generating the second query in the second query language associated with the unstructured data store includes using the identified first set of fields to generate the second query.

3. The computer-implemented method of claim 2, wherein the second query causes one or more values for one or more fields included in the second query to be extracted as a function of a format of the unstructured data.

4. The computer-implemented method of claim 2, further comprising caching the identified first set of fields.

5. The computer-implemented method of claim 2, wherein identifying a first set of fields performs the query on a subset of the unstructured data, and wherein the subset of the unstructured data is of a definable size.

6. The computer-implemented method of claim 2, wherein identifying a first set of fields automatically identifies fields in the unstructured data as a function of formatting of the unstructured data.

7. The computer-implemented method of claim 2, wherein the first query comprises a Structured Query Language (“SQL”) query.

8. The computer-implemented method of claim 1, further comprising: identifying a value of a field in an event stored in the unstructured data store, based on an extraction rule that specifies where to find a subportion of text within an event.

9. A computer-implemented method as recited in claim 1, further comprising: identifying a field in an event stored in the unstructured data store; wherein generating the second query in the second query language associated with the unstructured data store is based on an identification of the field resulting from said identifying.

10. A computer-readable storage medium having computer-executable instructions stored thereupon which, when executed by a computer, cause the computer to: segment unstructured raw machine data into a plurality of events; associate a timestamp with each event of the plurality of events; store the plurality of events as unstructured data in an unstructured data store with associated timestamps, wherein the unstructured data in the unstructured data store includes the unstructured raw machine data that has been segmented and timestamped; receive, at a query converter, a structured query in a structured query language from an application; generate, by the query converter, a second query in a second query language associated with the unstructured data store, based on the structured query; cause execution of the second query against the unstructured data stored in the unstructured data store; receive a result of execution of the second query against the unstructured data stored in the unstructured data store; and cause an indication of the result to be provided to the application for output to a user, wherein the indication of the result is provided to the application as a direct response to the structured query, without requiring any additional query from the application.

11. The computer-readable storage medium of claim 10, wherein said instructions further comprise instructions that when executed by the processor, cause the processor to: identify a first set of fields in the unstructured data to obtain field identification data from the unstructured data source, the unstructured data including text records, each of the fields in the first set of fields corresponding to a portion of text extracted from a portion of at least one of the text records; wherein generating the second query in the second query language associated with the unstructured data store includes generating the second query by using the identified first set of fields.

12. The computer-readable storage medium of claim 11, wherein the second query causes one or more values for one or more fields included in the second query to be extracted as a function of a format of the unstructured data.

13. The computer-readable storage medium of claim 11, further comprising computer-executable instructions stored thereupon which, when executed by a computer, cause the computer to: cache the identified first set of fields.

14. The computer-readable storage medium of claim 11, wherein querying of the unstructured data to identify the first set of fields is performed on a subset of the unstructured data, and wherein the subset of the unstructured data is of a definable size.

15. The computer-readable storage medium of claim 11, wherein the first set of fields is identified by automatically identifying fields in the unstructured data as a function of formatting of the unstructured data.

16. The computer-readable storage medium of claim 11, wherein the first query comprises a Structured Query Language (“SQL”) query.

17. A system comprising: a processor and instructions in memory coupled to the processor that, when executed by the processor, cause the system to perform operations comprising: segmenting unstructured raw machine data into a plurality of events; associating a timestamp with each event of the plurality of events; storing the plurality of events as unstructured data in an unstructured data store with associated timestamps, wherein the unstructured data in the unstructured data store includes the unstructured raw machine data that has been segmented and timestamped; receiving, at a query converter, a structured query in a structured query language from an application; generating, by the query converter, a second query in a second query language associated with the unstructured data store, based on the structured query; causing execution of the second query against the unstructured data stored in the unstructured data store; receiving a result of execution of the second query against the unstructured data stored in the unstructured data store; and causing an indication of the result to be provided to the application for output to a user, wherein the indication of the result is provided to the application as a direct response to the structured query, without requiring any additional query from th
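The segmenting and timestamping steps of claim 1, and the extraction rule of claim 8, can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the timestamp format, the rule that an event begins at a line starting with a timestamp, and the names `Event`, `segment`, `STATUS_RULE`, and `extract_field` are all hypothetical.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class Event(NamedTuple):
    timestamp: datetime
    raw: str  # the raw text is kept as-is; no schema is imposed at storage time


# Assumption: every event starts on a line that begins with this timestamp format.
_TS = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def segment(raw_data: str) -> list[Event]:
    """Split raw machine data into events and associate a timestamp with each."""
    events: list[Event] = []
    for line in raw_data.splitlines():
        m = _TS.match(line)
        if m:
            ts = datetime.strptime(m.group(0), "%Y-%m-%d %H:%M:%S")
            events.append(Event(ts, line))
        elif events:
            # A line without a timestamp is treated as a continuation
            # of the previous event (for example, a stack trace line).
            prev = events[-1]
            events[-1] = Event(prev.timestamp, prev.raw + "\n" + line)
    return events


# Hypothetical extraction rule: says where the "status" value sits in an event's text.
STATUS_RULE = re.compile(r"status=(?P<status>\d{3})")


def extract_field(event: Event, rule: re.Pattern, name: str) -> Optional[str]:
    """Apply an extraction rule to an event at search time."""
    m = rule.search(event.raw)
    return m.group(name) if m else None


raw = (
    "2013-07-31 10:00:00 GET /index status=200\n"
    "2013-07-31 10:00:05 GET /missing status=404\n"
    "  traceback: handler not found\n"
)
events = segment(raw)
print(len(events))                                        # prints: 2
print(extract_field(events[1], STATUS_RULE, "status"))    # prints: 404
```

Because the field value is pulled out of the raw text only when the rule is applied, the stored events stay unstructured, which matches the storage step recited in the claim.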

Assignees

Splunk Inc

Inventors

Classifications

  • Physics · mapped topic

  • G06F16/80 · Primary

    of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML (content-based retrieval of web data G06F16/95) · CPC title

  • Query processing · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9916379B2 cover?
Technologies are described herein for executing queries expressed with reference to a structured query language against unstructured data. A user issues a structured query through a traditional structured data management (“SDM”) application. Upon receiving the structured query, an SDM driver analyzes the structured query and extracts a data structure from the unstructured data, if necessary. Th…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/3066. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).