Parallel exporting in a data fabric service system

US10599723B2 · US · B2

Patent metadata

Publication number: US-10599723-B2
Application number: US-201615339835-A
Country: US
Kind code: B2
Filing date: Oct 31, 2016
Priority date: Sep 26, 2016
Publication date: Mar 24, 2020
Grant date: Mar 24, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed embodiments include techniques for exporting partial search results in parallel from peer indexers of a data intake and query system to the worker nodes. In particular, partial search results (e.g., time-indexed events) obtained from peer indexers can be exported in parallel from the peer indexers to worker nodes. Exporting the partial search results from the peer indexers in parallel can improve the rate at which the partial search results are transferred to the worker nodes for subsequent combination with partial search results of the external data systems. As such, the rate at which the search results of a search query can be obtained from the distributed data system can be improved by implementing parallel export techniques.
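
As a rough illustration of the parallel export the abstract describes, the sketch below has several peer indexers push their partial search results to worker nodes at the same time rather than one after another. Every name in it (`Worker`, `export_from_indexer`, `parallel_export`, and the in-memory `receive` call standing in for a network transfer) is a hypothetical stand-in chosen for illustration, not the patented implementation or any Splunk API.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock


class Worker:
    """Hypothetical worker node that collects partial results in memory."""

    def __init__(self, name):
        self.name = name
        self.received = []
        self._lock = Lock()

    def receive(self, events):
        # Stand-in for a network transfer; the lock keeps concurrent appends safe.
        with self._lock:
            self.received.extend(events)


def export_from_indexer(indexer_id, partial_results, workers):
    """Send one indexer's partial results to the workers, round-robin by position."""
    for i, event in enumerate(partial_results):
        workers[i % len(workers)].receive([event])
    return indexer_id, len(partial_results)


def parallel_export(results_by_indexer, workers):
    """Run every indexer's export concurrently and return per-indexer event counts."""
    pool_size = max(1, len(results_by_indexer))
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [
            pool.submit(export_from_indexer, indexer_id, events, workers)
            for indexer_id, events in results_by_indexer.items()
        ]
        return dict(f.result() for f in futures)


# Assumed sample data: two indexers, each holding a few time-indexed events.
workers = [Worker("w1"), Worker("w2")]
results_by_indexer = {
    "indexer-a": [{"_time": 3, "raw": "a3"}, {"_time": 1, "raw": "a1"}],
    "indexer-b": [{"_time": 2, "raw": "b2"}],
}
counts = parallel_export(results_by_indexer, workers)
total_received = sum(len(w.received) for w in workers)
```

The point of the sketch is only that no indexer waits for another to finish exporting; how a real system picks workers, batches events, or handles failures is not stated in the abstract.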

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for processing a search query, the method comprising: receiving a subquery corresponding to a portion of a query, the query received by a data intake and query system; obtaining a plurality of first events based on the subquery, each first event corresponding to at least one second event stored in a subset of internal data sources of the data intake and query system, wherein each second event includes raw machine data associated with a timestamp and reflects activity within an information technology infrastructure; generating a plurality of event chunks from the plurality of first events, wherein each event chunk comprises multiple first events of the plurality of first events; and concurrently transmitting a first event chunk of the plurality of event chunks to a first worker node and a second event chunk of the plurality of event chunks to a second worker node for additional processing.

2. The computer-implemented method of claim 1, wherein the at least one partial search result is plurality of first events based on the subquery are obtained from one or more internal data sources associated with the data intake and query system.

3. The computer-implemented method of claim 1, wherein the obtaining the plurality of first events based on the subquery, the generating the plurality of event chunks from the plurality of first events, and the concurrently transmitting the first event chunk of the plurality of event chunks to the first worker node and the second event chunk of the plurality of event chunks to the second worker node are performed by a first indexer associated with a data intake and query system.

4. The computer-implemented method of claim 1, wherein the obtaining the plurality of first events based on the subquery, the generating the plurality of event chunks from the plurality of first events, and the concurrently transmitting the first event chunk of the plurality of event chunks to the first worker node and the second event chunk of the plurality of event chunks to the second worker node are performed by each indexer included in a plurality of indexers associated with a data intake and query system.

5. The computer-implemented method of claim 1, wherein obtaining the plurality of first events based on the subquery comprises applying at least one policy to the plurality of first events, the at least one policy being based on at least one of one or more hash values and one or more field names.

6. The computer-implemented method of claim 1, further comprising determining that the first worker node and the second worker node are available to process the first event chunk and the second event chunk, respectively.

7. The computer-implemented method of claim 1, wherein instructions are transmitted to the first worker node and the second worker node via a data fabric service master associated with the data intake and query system, the instructions indicating to the first worker node and the second worker node how to process the first event chunk and the second event chunk, respectively.

8. The computer-implemented method of claim 7, wherein, based on the instructions, the first worker node and the second worker node analyze timestamps associated with the first event chunk and second event chunk, respectively, to generate time-ordered event chunks, and stream the time-ordered event chunks to the data fabric service master.

9. The computer-implemented method of claim 7, wherein, based on the instructions, the first worker node aggregates the first event chunk with other event chunks received by the first worker node to generate aggregated partial search results associated with one or more internal data sources included within the data intake and query system.

10. The computer-implemented method of claim 9, wherein, based on the instructions, the first worker node further aggregates the generated aggregated partial search results associated with the one or more internal data sources with partial search results associated with one or more external data sources to generate finalized partial search results, and transmits the finalized partial search results to the data fabric service master.

11. A non-transitory computer-readable medium including instructions that, when executed by a processor included in an indexer, cause the processor to perform the steps of: receiving a subquery corresponding to a portion of a query, the query received by a data intake and query system; obtaining a plurality of first events based on the subquery, each first event corresponding to at least one second event stored in a subset of internal data sources of the data intake and query system, wherein each second event includes raw machine data associated with a timestamp and reflects activity within an information technology infrastructure; generating a plurality of event chunks from the plurality of first events, wherein each event chunk comprises multiple first events of the plurality of first events; and concurrently transmitting a first event chunk of the plurality of event chunks to a first worker node and a second event chunk of the plurality of event chunks to a second worker node for additional processing.

12. The non-transitory computer-readable medium of claim 11, wherein the plurality of first events based on the subquery are obtained from one or more internal data sources associated with the data intake and query system.

13. The non-transitory computer-readable medium of claim 11, wherein the indexer is associated with the data intake and query system.

14. The non-transitory computer-readable medium of claim 13, wherein the steps of obtaining the plurality of first events based on the subquery, generating the plurality of event chunks from the plurality of first events, and concurrently transmitting the first event chunk of the plurality of event chunks to the first worker node and the second event chunk of the plurality of event chunks to the second worker node are also performed by at least one other indexer included in the data intake and query system.

15. The non-transitory computer-readable medium of claim 11, wherein obtaining the plurality of first events based on the subquery comprises applying at least one policy to the plurality of first events, the at least one policy being based on at least one of one or more hash values and one or more field names.

16. The non-transitory computer-readable medium of claim 11, further comprising determining that the first worker node and the second worker node are available to process the respective first or second event chunk of the plurality of event chunks.

17. The non-transitory computer-readable medium of claim 11, wherein instructions are transmitted to the first worker node and the second worker node via a data fabric service master associated with the data intake and query system, the instructions indicating to the first worker node and the second worker node how to process the respective first or second event chunk of the plurality of event chunks.

18. The non-transitory computer-readable medium of claim 17, wherein, based on the instructions, the first worker node and the second worker node analyze timestamps associated with the first or second event chunk, respectively, to generate time-ordered event chunks, and stream the time-ordered event chunks to the data fabric service master.

19. The non-transitory computer-readable medium of claim 17, wherein, based on the instructions, the first worker node aggregates the first event chunk w
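
The chunking and concurrent transmission recited in claim 1, together with the timestamp ordering recited in claim 8, can be sketched roughly as follows. This is one possible reading under stated assumptions, not the claimed implementation: the names `make_chunks`, `worker_process`, and `export_chunks`, the `_time` field, the chunk size, and the use of thread-pool tasks in place of real worker nodes are all hypothetical.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def make_chunks(events, chunk_size):
    """Split events into consecutive chunks of up to chunk_size events each."""
    return [events[i:i + chunk_size] for i in range(0, len(events), chunk_size)]


def worker_process(chunk):
    """Hypothetical worker step: order one chunk's events by timestamp."""
    return sorted(chunk, key=lambda e: e["_time"])


def export_chunks(events, chunk_size=2):
    """Hand each chunk to its own worker task concurrently, then merge the results."""
    chunks = make_chunks(events, chunk_size)
    with ThreadPoolExecutor(max_workers=max(1, len(chunks))) as pool:
        # pool.map keeps the chunks in order while the tasks run concurrently.
        ordered_chunks = list(pool.map(worker_process, chunks))
    # Stand-in for a master combining the time-ordered chunks into one stream.
    return list(heapq.merge(*ordered_chunks, key=lambda e: e["_time"]))


# Assumed sample events: raw machine data with a timestamp, as in claim 1.
events = [
    {"_time": 40, "raw": "login failed"},
    {"_time": 10, "raw": "service start"},
    {"_time": 30, "raw": "disk warning"},
    {"_time": 20, "raw": "login ok"},
]
merged = export_chunks(events)
times = [e["_time"] for e in merged]
```

Because each chunk is already time-ordered when it leaves its worker, the final step only needs a merge rather than a full sort. The claims do not say how the master combines the streamed chunks, so that merge is an assumption of this sketch.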

Assignees

Splunk Inc

Classifications

  • Browsing; Visualisation therefor (for navigating the web G06F16/954; browsing optimisation for the web G06F16/957)

  • Presentation of query results

  • with details for data modelling support

  • between a Database Management System and a front-end application

  • Distributed queries

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10599723B2 cover?
Techniques for exporting partial search results (e.g., time-indexed events) in parallel from the peer indexers of a data intake and query system to worker nodes. Parallel export raises the rate at which those results reach the worker nodes, where they are combined with partial search results from external data systems.
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/951. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 24, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).