Determining records generated by a processing task of a query

US11599541B2 · US · B2

Patent metadata
Publication number: US-11599541-B2
Application number: US-201916398044-A
Country: US
Kind code: B2
Filing date: Apr 29, 2019
Priority date: Sep 26, 2016
Publication date: Mar 7, 2023
Grant date: Mar 7, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.
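
The estimate described in the abstract can be illustrated with a minimal Python sketch. It assumes the record generation estimate is a simple ratio that is multiplied by the input record count, which is one option the claims recite. The function names, the per-node capacity, the node cap, and the throughput figure are hypothetical values chosen for illustration; they do not come from the patent.

```python
import math


def estimate_records_generated(records_in: int, generation_ratio: float) -> int:
    """Estimate output records as input records times a generation ratio."""
    return math.ceil(records_in * generation_ratio)


def allocate_worker_nodes(records_out: int,
                          records_per_node: int = 1_000_000,
                          max_nodes: int = 16) -> int:
    """Pick a worker-node count from the estimated output size.

    records_per_node and max_nodes are assumed tuning values, not patent data.
    """
    nodes = math.ceil(records_out / records_per_node)
    return max(1, min(nodes, max_nodes))


def estimate_processing_seconds(records_out: int,
                                nodes: int,
                                records_per_node_per_second: int = 50_000) -> float:
    """Rough processing time, assuming each node handles a fixed record rate."""
    return records_out / (nodes * records_per_node_per_second)


# Example: 2,000,000 input records and a task that emits 2.5 records per input.
generated = estimate_records_generated(2_000_000, 2.5)
nodes = allocate_worker_nodes(generated)
seconds = estimate_processing_seconds(generated, nodes)
print(generated, nodes, seconds)
```

A ratio above 1 models a task that generates more records than it receives; a ratio below 1 models a task that reduces the record count.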

First claim

Opening claim text (preview).

What is claimed:
1. A method, comprising: receiving a query, wherein the query identifies a set of data and a manner of processing the set of data; identifying a processing task of the query and a quantity of records to be processed according to the processing task; determining a quantity of records generated by the processing task based on the quantity of records to be processed and a record generation estimate; and at least one of: allocating compute resources for at least a portion of the query based on the determined quantity of records generated by the processing task; or estimating a processing time for the at least a portion of the query based on the determined quantity of records generated by the processing task.
2. The method of claim 1, wherein the records to be processed are based on events stored in a data store, each event storing a portion of raw machine data associated with a timestamp.
3. The method of claim 1, wherein the processing task is an extraction rule.
4. The method of claim 1, wherein the processing task is a data transform.
5. The method of claim 1, wherein the processing task is configured for execution by one or more worker nodes.
6. The method of claim 1, wherein identifying the processing task comprises parsing the query to identify a command.
7. The method of claim 1, wherein identifying the processing task comprises parsing the query to identify a command that generates more records than received.
8. The method of claim 1, wherein the records to be processed correspond to records received from one or more indexers of a data intake and query system.
9. The method of claim 1, wherein the records to be processed correspond to records generated by a preceding processing task.
10. The method of claim 1, wherein determining the quantity of records generated comprises multiplying the quantity of records to be processed by the record generation estimate.
11. The method of claim 1, wherein determining the quantity of records generated comprises identifying the record generation estimate from a plurality of record generation estimates based on a time range associated with the query and an identification of the processing task.
12. The method of claim 1, wherein the record generation estimate is obtained from a lookup table that stores a plurality of record generation estimates, and wherein the lookup table stores a time range, processing task identifier, data sourcetype, and record generation ratio for each record generation estimate of the plurality of record generation estimates.
13. The method of claim 1, wherein the at least a portion of the query corresponds to the processing task.
14. The method of claim 1, wherein the at least a portion of the query corresponds to the portion of the query that is configured to be executed by one or more worker nodes of a data intake and query system.
15. The method of claim 1, further comprising allocating the compute resources for the query based on the determined quantity of records generated by the processing task.
16. The method of claim 1, further comprising allocating the compute resources for the at least a portion of the query, wherein allocating the compute resources comprises allocating the compute resources based the determined quantity of records generated by the processing task and a priority level assigned to the query.
17. The method of claim 1, further comprising allocating the compute resources for the at least a portion of the query, wherein allocating the compute resources comprises allocating one or more worker nodes to execute a portion of the query based on the determined quantity of records generated by the processing task.
18. The method of claim 1, further comprising allocating the compute resources for the at least a portion of the query, wherein allocating the compute resources comprises allocating one or more processors to one or more worker nodes to execute a portion of the query based on the determined quantity of records generated by the processing task.
19. The method of claim 1, wherein the processing task is a first processing task, the quantity of records to be processed is a first quantity of records to be processed, the quantity of records generated is a first quantity of records generated, and the record generation estimate is a first record generation estimate, the method further comprising: identifying a second processing task of the query and a second quantity of records to be processed according to the second processing task; determining a second quantity of records generated by the second processing task based on the second quantity of records to be processed and a second record generation estimate; and allocating compute resources for the query based on the first quantity of records generated and the second quantity of records generated.
20. The method of claim 1, wherein the processing task is a first processing task, the quantity of records to be processed is a first quantity of records to be processed, the quantity of records generated is a first quantity of records generated, and the record generation estimate is a first record generation estimate, the method further comprising: identifying a second processing task of the query and a second quantity of records to be processed according to the second processing task; determining a second quantity of records generated by the second processing task based on the second quantity of records to be processed and a second record generation estimate; and allocating compute resources for the query based on a larger of the first quantity of records generated and the second quantity of records generated.
21. The method of claim 1, wherein the processing task is a first processing task, the quantity of records generated is a first quantity of records generated, and the record generation estimate is a first record generation estimate, the method further comprising: identifying a second processing task of the query; determining a second quantity of records generated by the second processing task based on the first quantity of records generated and a second record generation estimate; and allocating compute resources for the query based on the first quantity of records generated and the second quantity of records generated.
22. The method of claim 1, further comprising estimating the processing time for the query.
23. The method of claim 1, further comprising allocating compute resources for the at least a portion of the query and estimating the processing time for the query based on the determined quantity of records generated by the processing task and the compute resources allocated for the at least a portion of the query.
24. A computing system of a data intake and query system, the computing system comprising: memory; and one or more processing devices coupled to the memory and configured to: receive a query, wherein the query identifies a set of data and a manner of processing the set of data; identify a processing task of the query and a quantity of records to be processed according to the processing task; determine a quantity of records generated by the processing task based on the quantity of records to be processed and a record generation estimate; and at least one of: allocate compute resources for at least a portion of the query based on the determined quantity of records generated by the processing task; or estimate a processing time for
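
The dependent claims mention a lookup table of record generation estimates (time range, processing task identifier, data sourcetype, ratio), feeding one task's estimated output into the next task, and sizing resources from the larger of two estimates. The Python sketch below shows one way these ideas might fit together. It is an illustration under assumptions, not the patented implementation: the table rows, the task names `expand` and `aggregate`, the representation of a time range as a length in seconds, the row-selection rule, and the default ratio of 1.0 are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordGenerationEstimate:
    time_range_seconds: int  # assumed: query time-range length the row covers
    task_id: str             # processing task identifier
    sourcetype: str          # data sourcetype
    ratio: float             # records generated per record processed


# Hypothetical rows; real values would come from observed query executions.
LOOKUP_TABLE = [
    RecordGenerationEstimate(3_600, "expand", "access_log", 3.0),
    RecordGenerationEstimate(86_400, "expand", "access_log", 4.0),
    RecordGenerationEstimate(3_600, "aggregate", "access_log", 0.5),
]


def find_ratio(task_id: str, sourcetype: str, time_range_seconds: int,
               default: float = 1.0) -> float:
    """Return the ratio from the tightest row covering the query time range."""
    candidates = [
        row for row in LOOKUP_TABLE
        if row.task_id == task_id
        and row.sourcetype == sourcetype
        and row.time_range_seconds >= time_range_seconds
    ]
    if not candidates:
        return default  # assumed fallback: task neither adds nor removes records
    return min(candidates, key=lambda row: row.time_range_seconds).ratio


def chain_estimates(records_in: int, task_ids: list[str],
                    sourcetype: str, time_range_seconds: int) -> list[int]:
    """Estimate each task's output, using the previous output as the next input."""
    outputs = []
    current = records_in
    for task_id in task_ids:
        ratio = find_ratio(task_id, sourcetype, time_range_seconds)
        current = math.ceil(current * ratio)
        outputs.append(current)
    return outputs


outputs = chain_estimates(1_000_000, ["expand", "aggregate"], "access_log", 3_600)
peak = max(outputs)  # size resources for the largest intermediate result
print(outputs, peak)
```

Sizing from the peak rather than the final count reflects the point that an intermediate task can generate more records than the query ultimately returns.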

Assignees

Splunk Inc

Inventors

Classifications

  • Iterative querying; Query formulation based on the results of a preceding query · CPC title

  • using directory or table look-up (use of a directory or look-up table in file systems G06F16/13) · CPC title

  • Management thereof · CPC title

  • Distributed queries · CPC title

  • of sub-queries or views · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11599541B2 cover?
Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the number of records generated by the processing task based on the number of record…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2272. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 7, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).