Generating a subquery for an external data system using a configuration file

US10956415B2 · US · B2

Patent metadata
Publication number: US-10956415-B2
Application number: US-201816147165-A
Country: US
Kind code: B2
Filing date: Sep 28, 2018
Priority date: Sep 26, 2016
Publication date: Mar 23, 2021
Grant date: Mar 23, 2021

Abstract

Official abstract text for this publication.

Systems and methods are disclosed for receiving, at a data intake and query system, a query that includes an indication to process data managed by a third-party data storage and processing system that supports a different query language than the data intake and query system. The data intake and query system identifies a third-party data storage and processing system that manages the data to be processed and generates a subquery for execution by the third-party data storage and processing system, generates instructions for one or more worker nodes to receive and process results of the subquery from the third-party data storage and processing system, and instructs the worker nodes to provide results of the processing to the data intake and query system.
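The flow in the abstract, combined with the configuration-file lookup named in the title, can be sketched roughly as follows. This is only an illustration under stated assumptions: the INI-style configuration layout, the `external=<name>` query parameter, the `QueryPlan` type, and the functions `find_external_parameter` and `plan_query` are all hypothetical and are not taken from the patent, which does not prescribe a file format or an API.

```python
import configparser
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical configuration: one section per external data system,
# holding the system name and the subquery in that system's own language.
EXAMPLE_CONFIG = """
[orders_db]
system = example-sql-warehouse
subquery = SELECT * FROM orders WHERE region = 'EU'
"""


@dataclass
class QueryPlan:
    external_system: str
    subquery: str
    worker_instructions: List[str]


def find_external_parameter(query: str, keyword: str = "external=") -> Optional[str]:
    """Return the value of the (assumed) `external=<name>` search parameter."""
    for token in query.split():
        if token.startswith(keyword):
            return token[len(keyword):]
    return None


def plan_query(query: str, config_text: str) -> QueryPlan:
    """Look up the external system and its subquery, then draft worker steps."""
    name = find_external_parameter(query)
    if name is None:
        raise ValueError("query does not reference an external data system")

    # Interpolation is disabled so '%' in a subquery is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section(name):
        raise KeyError(f"no configuration entry for {name!r}")

    section = parser[name]
    instructions = [
        f"receive results of the subquery from {section['system']}",
        "process the results in the manner given by the query",
        "provide processed results to the data intake and query system",
    ]
    return QueryPlan(section["system"], section["subquery"], instructions)


plan = plan_query("search external=orders_db | stats count by customer", EXAMPLE_CONFIG)
print(plan.external_system)
print(plan.subquery)
```

Keeping the subquery text in configuration, rather than in the query itself, means the user-facing query only needs a short reference to the external system; that is one plausible reading of the title, not a statement of how the patented system is built.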

First claim

Opening claim text (preview).

What is claimed:

1. A method, comprising: receiving, at a data intake and query system, a query identifying a set of data to be processed and a manner of processing the set of data; identifying a search parameter in the query associated with a search of an external data system; parsing a configuration file based on the search parameter; identifying the external data system and a subquery for the external data system based on said parsing the configuration file, the subquery identifying at least a subset of data of the set of data stored by the external data system; defining, by the first data intake and query system, a query processing scheme for obtaining and processing the set of data based on the query and the subquery, wherein defining the query processing scheme comprises generating instructions for one or more worker nodes to receive and process results of the subquery to form processed results and to provide the processed results to the data intake and query system; and executing the query based on the query processing scheme.

2. The method of claim 1, wherein the query is in a first query processing language and the subquery is in a second query processing language.

3. The method of claim 1, wherein the query is in a first query processing language and the subquery is in a second query processing language, the method further comprising: generating a converted subquery in the first query processing language based on the subquery, wherein said defining the query processing scheme is based on the converted subquery.

4. The method of claim 1, wherein the query is in a first query processing language and the subquery is in a second query processing language, the method further comprising: generating a converted subquery in the first query processing language based on the subquery, wherein said defining the query processing scheme is based on the converted subquery; and generating a modified subquery in the second query processing language based on at least a portion of the query processing scheme, wherein executing the query comprises communicating the modified subquery to the external data system.

5. The method of claim 1, wherein said defining the query processing scheme comprises: determining a data ingest estimate for the subquery, determining a partition size based on resources allocated to the query and one or more search parameters of the subquery, and determining a number of partitions based on the partition size and the data ingest estimate, wherein the instructions for the one or more worker nodes are generated based on the determined number of partitions.

6. The method of claim 1, wherein said defining the query processing scheme comprises: determining a data ingest estimate for the subquery, determining a partition size based on a number of processors and an amount of memory allocated for the query and one or more search parameters of the subquery, and determining a number of partitions based on the partition size and the data ingest estimate, wherein the instructions for the one or more worker nodes are generated based on the determined number of partitions.

7. The method of claim 1, wherein said defining the query processing scheme comprises: determining a data ingest estimate for the subquery, determining a partition size based on resources allocated to the query and a number of fields used to process events from the external data system, and determining a number of partitions based on the partition size and the data ingest estimate, wherein the instructions for the one or more worker nodes are generated based on the determined number of partitions.

8. The method of claim 1, wherein the at least a subset of data is a first subset of data, and the processed results are first processed results, the method further comprising: determining that the set of data includes a second subset of data associated with the data intake and query system, wherein defining the query processing scheme, further comprises: generating a subquery for the data intake and query system, the subquery for the data intake and query system identifying the second subset of data and a manner of processing the second subset of data, and generating instructions for one or more worker nodes to receive and process results of the subquery for the data intake and query system to form second processed results and to provide the second processed results to the data intake and query system.

9. The method of claim 1, wherein the at least a subset of data is a first subset of data, and the processed results are first processed results, the method further comprising: determining that the set of data includes a second subset of data associated with the data intake and query system, wherein defining the query processing scheme, further comprises: generating a subquery for the data intake and query system, the subquery for the data intake and query system identifying the second subset of data and a manner of processing the second subset of data, and generating instructions for one or more worker nodes to receive and process results of the subquery for the data intake and query system to generate second processed results, to combine and process the first processed results and the second processed results to form combined processed results and to provide the combined processed results to the data intake and query system.

10. The method of claim 1, wherein the search parameter is a first search parameter, the at least a subset of data is a first subset of data, the processed results are first processed results, and the external data system is a first external data system, the method further comprising: identifying a second search parameter in the query associated with a search of a second external data system; parsing the configuration file based on the second search parameter; identifying the second external data system and a second subquery for the second external data system based on said parsing the configuration file, the second subquery identifying at least a second subset of data of the set of data stored by the second external data system; wherein defining the query processing scheme, further comprises: generating a subquery for the second external data system, the subquery for the second external data system identifying the second subset of data; and generating instructions for one or more worker nodes to receive and process results of the subquery for the second external data system to form second processed results and to provide the second processed results to the data intake and query system.

11. The method of claim 1, wherein the data intake and query system and the external data system each independently execute queries other than the query.

12. The method of claim 1, further comprising associating a search identifier with the external data system, wherein the one or more worker nodes process results of the subquery based on the search identifier.

13. The method of claim 1, wherein defining the query processing scheme further comprises associating, by the data intake and query system, a first search identifier with the external data system, and executing the query comprises: receiving, by the one or more worker nodes, the results of the subquery, wherein the results of the subquery include a second search identifier assigned to the results of the subquery by the external data system; mapping the first search identifier to the second search identifier; and processing the results of the subquery based on said mapping.

14. The method of claim 1, wherein defining the query processing scheme, fu
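
Claims 5 to 7 describe sizing the work from a data ingest estimate, a partition size, and a resulting number of partitions, but the preview gives no formula. The sketch below shows one way such a calculation could look. The heuristic (memory per processor, reduced by an assumed per-field overhead, followed by a ceiling division) and every name in it (`Resources`, `partition_size_bytes`, `partition_count`, `worker_instructions`, `overhead_pct_per_field`) are assumptions made for illustration, not the patent's method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Resources:
    processors: int      # processors allocated to the query
    memory_bytes: int    # memory allocated to the query


def partition_size_bytes(resources: Resources, field_count: int,
                         overhead_pct_per_field: int = 10) -> int:
    """Assumed heuristic: memory per processor, shrunk as more fields are used."""
    per_processor = resources.memory_bytes // resources.processors
    # Each field adds overhead_pct_per_field percent of assumed processing overhead.
    scaled = per_processor * 100 // (100 + overhead_pct_per_field * field_count)
    return max(1, scaled)


def partition_count(ingest_estimate_bytes: int, size_bytes: int) -> int:
    """Ceiling division, with at least one partition."""
    return max(1, (ingest_estimate_bytes + size_bytes - 1) // size_bytes)


def worker_instructions(count: int) -> List[Dict[str, int]]:
    """One hypothetical instruction record per partition."""
    return [{"partition": i, "of": count} for i in range(count)]


GIB = 1024 ** 3
resources = Resources(processors=4, memory_bytes=8 * GIB)
size = partition_size_bytes(resources, field_count=10)
count = partition_count(10 * GIB, size)
instructions = worker_instructions(count)
print(size, count)
```

With 8 GiB across 4 processors and 10 fields, this heuristic yields a 1 GiB partition size, so a 10 GiB ingest estimate is split into 10 partitions. Integer arithmetic is used throughout so the ceiling division is exact.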

Assignees

Splunk Inc

Classifications

  • Distributed queries

  • Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs

  • of sub-queries or views

  • Iterative querying; Query formulation based on the results of a preceding query

  • Selectivity estimation or determination

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10956415B2 cover?
Systems and methods are disclosed for receiving, at a data intake and query system, a query that includes an indication to process data managed by a third-party data storage and processing system that supports a different query language than the data intake and query system. The data intake and query system identifies a third-party data storage and processing system that manages the data to be …
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2471. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 23, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).