Stream data multiprocessing method
US-2015149507-A1 · May 28, 2015 · US
US11599541B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11599541-B2 |
| Application number | US-201916398044-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 29, 2019 |
| Priority date | Sep 26, 2016 |
| Publication date | Mar 7, 2023 |
| Grant date | Mar 7, 2023 |
Official abstract text for this publication.
Systems and methods are described for determining a quantity of records generated by a processing task of a query executed in a data intake and query system. The system receives a query and identifies a processing task of the query and a quantity of records to be processed according to the query. The system determines the quantity of records generated by the processing task based on the quantity of records to be processed and a record generation estimate. The system can allocate compute resources or determine a query execution time for at least a portion of the query based on the determined quantity of records generated.
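The estimate described in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the function names, the per-worker capacity, and the throughput figure are all hypothetical values chosen for the example. The only part taken from the text is the idea of scaling the input record count by a record generation estimate and then using the result for resource allocation or time estimation.

```python
import math


def estimate_records_generated(records_in: int, generation_ratio: float) -> int:
    """Scale the input record count by a record generation estimate (a ratio)."""
    return math.ceil(records_in * generation_ratio)


def workers_needed(records_out: int, records_per_worker: int = 1_000_000) -> int:
    """Hypothetical sizing rule: one worker per `records_per_worker` records."""
    return max(1, math.ceil(records_out / records_per_worker))


def estimated_seconds(records_out: int, workers: int,
                      records_per_worker_per_second: float = 50_000.0) -> float:
    """Hypothetical time model: records divided by aggregate throughput."""
    return records_out / (workers * records_per_worker_per_second)


# Example: a task that is assumed to emit 2.5 records per record received.
records_out = estimate_records_generated(2_000_000, 2.5)
workers = workers_needed(records_out)
seconds = estimated_seconds(records_out, workers)
print(records_out, workers, seconds)
```

A ratio above 1 models a task that emits more records than it receives, while a ratio below 1 models a task that reduces the record count.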
Opening claim text (preview).
What is claimed:
1. A method, comprising: receiving a query, wherein the query identifies a set of data and a manner of processing the set of data; identifying a processing task of the query and a quantity of records to be processed according to the processing task; determining a quantity of records generated by the processing task based on the quantity of records to be processed and a record generation estimate; and at least one of: allocating compute resources for at least a portion of the query based on the determined quantity of records generated by the processing task; or estimating a processing time for the at least a portion of the query based on the determined quantity of records generated by the processing task.
2. The method of claim 1, wherein the records to be processed are based on events stored in a data store, each event storing a portion of raw machine data associated with a timestamp.
3. The method of claim 1, wherein the processing task is an extraction rule.
4. The method of claim 1, wherein the processing task is a data transform.
5. The method of claim 1, wherein the processing task is configured for execution by one or more worker nodes.
6. The method of claim 1, wherein identifying the processing task comprises parsing the query to identify a command.
7. The method of claim 1, wherein identifying the processing task comprises parsing the query to identify a command that generates more records than received.
8. The method of claim 1, wherein the records to be processed correspond to records received from one or more indexers of a data intake and query system.
9. The method of claim 1, wherein the records to be processed correspond to records generated by a preceding processing task.
10. The method of claim 1, wherein determining the quantity of records generated comprises multiplying the quantity of records to be processed by the record generation estimate.
11. The method of claim 1, wherein determining the quantity of records generated comprises identifying the record generation estimate from a plurality of record generation estimates based on a time range associated with the query and an identification of the processing task.
12. The method of claim 1, wherein the record generation estimate is obtained from a lookup table that stores a plurality of record generation estimates, and wherein the lookup table stores a time range, processing task identifier, data sourcetype, and record generation ratio for each record generation estimate of the plurality of record generation estimates.
13. The method of claim 1, wherein the at least a portion of the query corresponds to the processing task.
14. The method of claim 1, wherein the at least a portion of the query corresponds to the portion of the query that is configured to be executed by one or more worker nodes of a data intake and query system.
15. The method of claim 1, further comprising allocating the compute resources for the query based on the determined quantity of records generated by the processing task.
16. The method of claim 1, further comprising allocating the compute resources for the at least a portion of the query, wherein allocating the compute resources comprises allocating the compute resources based the determined quantity of records generated by the processing task and a priority level assigned to the query.
17. The method of claim 1, further comprising allocating the compute resources for the at least a portion of the query, wherein allocating the compute resources comprises allocating one or more worker nodes to execute a portion of the query based on the determined quantity of records generated by the processing task.
18. The method of claim 1, further comprising allocating the compute resources for the at least a portion of the query, wherein allocating the compute resources comprises allocating one or more processors to one or more worker nodes to execute a portion of the query based on the determined quantity of records generated by the processing task.
19. The method of claim 1, wherein the processing task is a first processing task, the quantity of records to be processed is a first quantity of records to be processed, the quantity of records generated is a first quantity of records generated, and the record generation estimate is a first record generation estimate, the method further comprising: identifying a second processing task of the query and a second quantity of records to be processed according to the second processing task; determining a second quantity of records generated by the second processing task based on the second quantity of records to be processed and a second record generation estimate; and allocating compute resources for the query based on the first quantity of records generated and the second quantity of records generated.
20. The method of claim 1, wherein the processing task is a first processing task, the quantity of records to be processed is a first quantity of records to be processed, the quantity of records generated is a first quantity of records generated, and the record generation estimate is a first record generation estimate, the method further comprising: identifying a second processing task of the query and a second quantity of records to be processed according to the second processing task; determining a second quantity of records generated by the second processing task based on the second quantity of records to be processed and a second record generation estimate; and allocating compute resources for the query based on a larger of the first quantity of records generated and the second quantity of records generated.
21. The method of claim 1, wherein the processing task is a first processing task, the quantity of records generated is a first quantity of records generated, and the record generation estimate is a first record generation estimate, the method further comprising: identifying a second processing task of the query; determining a second quantity of records generated by the second processing task based on the first quantity of records generated and a second record generation estimate; and allocating compute resources for the query based on the first quantity of records generated and the second quantity of records generated.
22. The method of claim 1, further comprising estimating the processing time for the query.
23. The method of claim 1, further comprising allocating compute resources for the at least a portion of the query and estimating the processing time for the query based on the determined quantity of records generated by the processing task and the compute resources allocated for the at least a portion of the query.
24. A computing system of a data intake and query system, the computing system comprising: memory; and one or more processing devices coupled to the memory and configured to: receive a query, wherein the query identifies a set of data and a manner of processing the set of data; identify a processing task of the query and a quantity of records to be processed according to the processing task; determine a quantity of records generated by the processing task based on the quantity of records to be processed and a record generation estimate; and at least one of: allocate compute resources for at least a portion of the query based on the determined quantity of records generated by the processing task; or estimate a processing time for
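Several dependent claims combine naturally: the lookup table of claim 12, the chaining of one task's output into the next task's input in claim 21, and sizing by the larger of two estimates in claim 20. The sketch below illustrates how these might fit together. The table rows, task identifiers (`expand`, `dedup`), sourcetype, ratios, and the default ratio of 1.0 for a missing entry are all assumptions made for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GenerationEstimate:
    """One lookup-table row with the four fields listed in claim 12."""
    time_range: str   # e.g. "24h" (hypothetical encoding)
    task_id: str      # processing task identifier
    sourcetype: str   # data sourcetype
    ratio: float      # record generation ratio (records out per record in)


# Hypothetical table contents, for illustration only.
LOOKUP = [
    GenerationEstimate("24h", "expand", "access_log", 3.0),
    GenerationEstimate("24h", "dedup", "access_log", 0.5),
    GenerationEstimate("7d", "expand", "access_log", 4.0),
]


def find_ratio(table: List[GenerationEstimate], time_range: str,
               task_id: str, sourcetype: str, default: float = 1.0) -> float:
    """Return the matching ratio, or `default` (an assumption) if none is stored."""
    for row in table:
        if (row.time_range, row.task_id, row.sourcetype) == (time_range, task_id, sourcetype):
            return row.ratio
    return default


def chain_estimates(records_in: int, task_ids: List[str],
                    table: List[GenerationEstimate],
                    time_range: str, sourcetype: str) -> List[int]:
    """Feed each task's estimated output into the next task (as in claim 21)."""
    outputs = []
    current = records_in
    for task_id in task_ids:
        current = round(current * find_ratio(table, time_range, task_id, sourcetype))
        outputs.append(current)
    return outputs


def sizing_basis(outputs: List[int]) -> int:
    """Size resources on the largest per-task estimate (as in claim 20)."""
    return max(outputs)


per_task = chain_estimates(1000, ["expand", "dedup"], LOOKUP, "24h", "access_log")
print(per_task, sizing_basis(per_task))
```

Sizing on the largest intermediate estimate rather than the final one matters when an early task expands the data and a later task shrinks it again, since the peak load falls in the middle of the pipeline.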
Iterative querying; Query formulation based on the results of a preceding query · CPC title
using directory or table look-up (use of a directory or look-up table in file systems G06F16/13) · CPC title
Management thereof · CPC title
Distributed queries · CPC title
of sub-queries or views · CPC title