US11144550B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11144550-B2 |
| Application number | US-202016857817-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 24, 2020 |
| Priority date | Sep 25, 2019 |
| Publication date | Oct 12, 2021 |
| Grant date | Oct 12, 2021 |
Official abstract text for this publication.
The subject technology receives a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side. The subject technology inserts an aggregation operator below the probe side of the join operation. The subject technology causes the build side of the join operation to generate a hash table. The subject technology causes the build side of the join operation to generate a bloom filter based at least in part on the hash table and provide information, corresponding to properties of the build side, to a bloom filter. Based at least in part on the information, the subject technology determines at least one property of the join operation to determine whether to switch the aggregation operator to a pass through mode.
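The runtime switch described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes the aggregation is a duplicate-removal step, takes the reduction rate to be the ratio of removed to ingested records as recited in the claims, and uses hypothetical names (`AdaptiveDistinct`, `min_reduction_rate`, `sample_size`) and a hypothetical warm-up window that the source does not specify.

```python
from typing import Hashable, Iterable, Iterator


class AdaptiveDistinct:
    """Duplicate-removal operator that may switch itself to pass-through mode.

    All names and default values are illustrative assumptions.
    """

    def __init__(self, min_reduction_rate: float = 0.2, sample_size: int = 1000):
        self.min_reduction_rate = min_reduction_rate  # assumed threshold
        self.sample_size = sample_size                # assumed warm-up window
        self.seen = set()          # keys already emitted while aggregating
        self.ingested = 0          # records examined while aggregating
        self.removed = 0           # duplicates dropped while aggregating
        self.pass_through = False

    def reduction_rate(self) -> float:
        """Ratio of removed records to ingested records."""
        return self.removed / self.ingested if self.ingested else 0.0

    def process(self, rows: Iterable[Hashable]) -> Iterator[Hashable]:
        for row in rows:
            if self.pass_through:
                # Pass-through mode: output stream equals input stream.
                yield row
                continue
            self.ingested += 1
            if row in self.seen:
                self.removed += 1
            else:
                self.seen.add(row)
                yield row
            # After the warm-up window, give up if too few duplicates were removed.
            if (self.ingested == self.sample_size
                    and self.reduction_rate() < self.min_reduction_rate):
                self.pass_through = True
                self.seen.clear()  # release memory held by the operator


# Mostly distinct input: the operator switches off and later duplicates flow through.
op = AdaptiveDistinct(min_reduction_rate=0.5, sample_size=4)
mostly_distinct = list(op.process([1, 2, 3, 4, 4, 4]))

# Mostly duplicate input: the operator keeps removing duplicates.
op2 = AdaptiveDistinct(min_reduction_rate=0.5, sample_size=4)
mostly_duplicate = list(op2.process([1, 1, 1, 1, 2, 2]))
```

Because duplicates can reach the output once the operator is in pass-through mode, this sketch relies on a later operator in the plan to produce the final aggregate, which is consistent with the claims placing a second aggregation operator above the first.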
Opening claim text (preview).
What is claimed is:
1. A method comprising: receiving a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side, each of the query operation in the set of query operations included as different nodes in a tree structure corresponding to the query plan, the build side of the join operation providing information comprising a build side join key cardinality and a number of distinct values of a join key; providing information, corresponding to properties of the build side, to a bloom filter, the bloom filter utilized to pass the information to an aggregation operator; based at least in part on the information from the bloom filter, determining, during executing of the query plan, at least one property of the join operation to determine whether to switch the aggregation operator to a pass through mode, the at least one property comprising at least a reduction rate, the reduction rate being based on a ratio of a first number of records that a duplicate-removal operation has removed to a second number of records that the duplicate-removal operation has ingested; switching, in response to the reduction rate being below a threshold value, the aggregation operator to the pass through mode during runtime of the query plan, the switching facilitating a reduction in utilization of computing resources by forgoing performing the aggregation operator; and, while the aggregation operator is in the pass through mode, an input stream of data goes through the aggregation operator without being analyzed, and the input stream of data matches an output stream of data flowing out of the aggregation operator; and after switching the aggregation operator to the pass through mode, sending new data received from the bloom filter to a second query operation included in the query plan, the new data being pass through the aggregation operator without performing the aggregation operation operator, the second query operation utilizing the new data to perform a particular operation of the query plan, the second query operation being above the aggregation operator in the tree structure corresponding to the query plan, the second query operation corresponding to a first node and the aggregation operator corresponding to a second node, the second query operation comprising a second aggregation operator different than the aggregation operator.
2. The method of claim 1, wherein determining the at least one property of the join operation comprises: causing the aggregation operator to determine an explosiveness of the join operation based on the information from the build side of the join operation.
3. The method of claim 2, wherein the explosiveness of the join operation is based at least in part on a number of distinct values of a join key as indicated by the bloom filter.
4. The method of claim 2, further comprising: defining a threshold explosiveness for the aggregation operator such that the aggregation operator turns off at runtime in response to the join operation exceeding the threshold explosiveness.
5. The method of claim 1, further comprising: inserting multiple aggregation operators within the query plan, wherein each of the multiple aggregation operators is configured to automatically turn off at runtime in response to determining an explosiveness of the join operation exceeds a threshold explosiveness or fails to meet a threshold reduction rate for removing duplicate values.
6. A system comprising: at least one processor; and a memory device including instructions, which when executed by the at least one processor, cause the at least one processor to perform operations comprising: receiving a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side, each of the query operation in the set of query operations included as different nodes in a tree structure corresponding to the query plan, the build side of the join operation providing information comprising a build side join key cardinality and a number of distinct values of a join key; providing information, corresponding to properties of the build side, to a bloom filter, the bloom filter utilized to pass the information to an aggregation operator; based at least in part on the information from the bloom filter, determining, during executing of the query plan, at least one property of the join operation to determine whether to switch the aggregation operator to a pass through mode, the at least one property comprising at least a reduction rate, the reduction rate being based on a ratio of a first number of records that a duplicate-removal operation has removed to a second number of records that the duplicate-removal operation has ingested; switching, in response to the reduction rate being below a threshold value, the aggregation operator to the pass through mode during runtime of the query plan, the switching facilitating a reduction in utilization of computing resources by forgoing performing the aggregation operator; and, while the aggregation operator is in the pass through mode, an input stream of data goes through the aggregation operator without being analyzed, and the input stream of data matches an output stream of data flowing out of the aggregation operator; and after switching the aggregation operator to the pass through mode, sending new data received from the bloom filter to a second query operation included in the query plan, the new data being pass through the aggregation operator without performing the aggregation operation operator, the second query operation utilizing the new data to perform a particular operation of the query plan, the second query operation being above the aggregation operator in the tree structure corresponding to the query plan, the second query operation corresponding to a first node and the aggregation operator corresponding to a second node, the second query operation comprising a second aggregation operator different than the aggregation operator.
7. The system of claim 6, wherein determining the at least one property of the join operation further causes the at least one processor to perform operations further comprising: inserting an aggregation operator below the probe side of the join operation; causing the build side of the join operation to generate a hash table; causing the build side of the join operation to generate a bloom filter based at least in part on the hash table; and causing the aggregation operator to determine an explosiveness of the join operation based on the information from the build side of the join operation.
8. The system of claim 7, wherein the explosiveness of the join operation is based at least in part on a number of distinct values of a join key as indicated by the bloom filter.
9. The system of claim 7, wherein the memory device includes further instructions, which when executed by the at least one processor, further cause the at least one processor to perform operations comprising: defining a threshold explosiveness for the aggregation operator such that the aggregation operator turns off at runtime in response to the join operation exceeding the threshold explosiveness.
10. The system of claim 7, wherein the memory device includes further instructions, which when executed by the at least one processor, further cause the at least one processor to perform operations comprising: causing the aggregation operator to determine a locally observed reduction rate of the aggregation operator.
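The dependent claims combine two signals: an explosiveness of the join derived from build-side information, and a locally observed reduction rate. The claims do not give a formula for explosiveness, so the sketch below assumes one plausible reading, the average number of build-side rows per distinct join key. The container `BuildSideInfo`, the function names, and the threshold parameters are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildSideInfo:
    """Build-side statistics passed to the aggregation operator (hypothetical container)."""
    join_key_cardinality: int  # assumed meaning: build-side rows carrying a join key
    distinct_join_keys: int    # number of distinct join-key values


def explosiveness(info: BuildSideInfo) -> float:
    """Assumed definition: average build-side rows per distinct join key."""
    if info.distinct_join_keys == 0:
        return 0.0
    return info.join_key_cardinality / info.distinct_join_keys


def should_pass_through(info: BuildSideInfo, removed: int, ingested: int,
                        max_explosiveness: float, min_reduction_rate: float) -> bool:
    """Turn the operator off if the join exceeds the explosiveness threshold
    or the observed reduction rate falls below its threshold."""
    if explosiveness(info) > max_explosiveness:
        return True
    if ingested == 0:
        return False  # no local observations yet, so keep aggregating
    return removed / ingested < min_reduction_rate


# Example with made-up numbers: 1000 build rows over 100 distinct keys.
info = BuildSideInfo(join_key_cardinality=1000, distinct_join_keys=100)
decision = should_pass_through(info, removed=50, ingested=100,
                               max_explosiveness=5.0, min_reduction_rate=0.2)
```

In a plan with several inserted aggregation operators, each operator would evaluate such a check independently against its own counters, which matches the per-operator wording of claim 5.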
Join operations · CPC title
Aggregation; Duplicate elimination · CPC title
Selectivity estimation or determination · CPC title
Plan optimisation · CPC title
of operators · CPC title