What technology area does this patent fall under?

Primary CPC classification G06F16/24542. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Oct 12 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Pipeline level optimization of aggregation operators in a query plan during runtime

US11144550B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11144550-B2
Application number	US-202016857817-A
Country	US
Kind code	B2
Filing date	Apr 24, 2020
Priority date	Sep 25, 2019
Publication date	Oct 12, 2021
Grant date	Oct 12, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The subject technology receives a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side. The subject technology inserts an aggregation operator below the probe side of the join operation. The subject technology causes the build side of the join operation to generate a hash table. The subject technology causes the build side of the join operation to generate a bloom filter based at least in part on the hash table and provide information, corresponding to properties of the build side, to a bloom filter. Based at least in part on the information, the subject technology determines at least one property of the join operation to determine whether to switch the aggregation operator to a pass through mode.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: receiving a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side, each of the query operation in the set of query operations included as different nodes in a tree structure corresponding to the query plan, the build side of the join operation providing information comprising a build side join key cardinality and a number of distinct values of a join key; providing information, corresponding to properties of the build side, to a bloom filter, the bloom filter utilized to pass the information to an aggregation operator; based at least in part on the information from the bloom filter, determining, during executing of the query plan, at least one property of the join operation to determine whether to switch the aggregation operator to a pass through mode, the at least one property comprising at least a reduction rate, the reduction rate being based on a ratio of a first number of records that a duplicate-removal operation has removed to a second number of records that the duplicate-removal operation has ingested; switching, in response to the reduction rate being below a threshold value, the aggregation operator to the pass through mode during runtime of the query plan, the switching facilitating a reduction in utilization of computing resources by forgoing performing the aggregation operator; and, while the aggregation operator is in the pass through mode, an input stream of data goes through the aggregation operator without being analyzed, and the input stream of data matches an output stream of data flowing out of the aggregation operator; and after switching the aggregation operator to the pass through mode, sending new data received from the bloom filter to a second query operation included in the query plan, the new data being pass through the aggregation operator without performing the aggregation operation operator, the second query operation utilizing the new data to perform a particular operation of the query plan, the second query operation being above the aggregation operator in the tree structure corresponding to the query plan, the second query operation corresponding to a first node and the aggregation operator corresponding to a second node, the second query operation comprising a second aggregation operator different than the aggregation operator. 2. The method of claim 1 , wherein determining the at least one property of the join operation comprises: causing the aggregation operator to determine an explosiveness of the join operation based on the information from the build side of the join operation. 3. The method of claim 2 , wherein the explosiveness of the join operation is based at least in part on a number of distinct values of a join key as indicated by the bloom filter. 4. The method of claim 2 , further comprising: defining a threshold explosiveness for the aggregation operator such that the aggregation operator turns off at runtime in response to the join operation exceeding the threshold explosiveness. 5. The method of claim 1 , further comprising: inserting multiple aggregation operators within the query plan, wherein each of the multiple aggregation operators is configured to automatically turn off at runtime in response to determining an explosiveness of the join operation exceeds a threshold explosiveness or fails to meet a threshold reduction rate for removing duplicate values. 6. A system comprising: at least one processor; and a memory device including instructions, which when executed by the at least one processor, cause the at least one processor to perform operations comprising: receiving a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side, each of the query operation in the set of query operations included as different nodes in a tree structure corresponding to the query plan, the build side of the join operation providing information comprising a build side join key cardinality and a number of distinct values of a loin key; providing information, corresponding to properties of the build side, to a bloom filter, the bloom filter utilized to pass the information to an aggregation operator; based at least in part on the information from the bloom filter, determining, during executing of the query plan, at least one property of the join operation to determine whether to switch the aggregation operator to a pass through mode, the at least one property comprising at least a reduction rate, the reduction rate being based on a ratio of a first number of records that a duplicate-removal operation has removed to a second number of records that the duplicate-removal operation has ingested; switching, in response to the reduction rate being below a threshold value, the aggregation operator to the pass through mode during runtime of the query plan, the switching facilitating a reduction in utilization of computing resources by forgoing performing the aggregation operator; and, while the aggregation operator is in the pass through mode, an input stream of data goes through the aggregation operator without being analyzed, and the input stream of data matches an output stream of data flowing out of the aggregation operator; and after switching the aggregation operator to the pass through mode, sending new data received from the bloom filter to a second query operation included in the query plan, the new data being pass through the aggregation operator without performing the aggregation operation operator, the second query operation utilizing the new data to perform a particular operation of the query plan, the second query operation being above the aggregation operator in the tree structure corresponding to the query plan, the second query operation corresponding to a first node and the aggregation operator corresponding to a second node, the second query operation comprising a second aggregation operator different than the aggregation operator. 7. The system of claim 6 , wherein determining the at least one property of the join operation further causes the at least one processor to perform operations further comprising: inserting an aggregation operator below the probe side of the join operation; causing the build side of the join operation to generate a hash table; causing the build side of the join operation to generate a bloom filter based at least in part on the hash table; and causing the aggregation operator to determine an explosiveness of the join operation based on the information from the build side of the join operation. 8. The system of claim 7 , wherein the explosiveness of the join operation is based at least in part on a number of distinct values of a join key as indicated by the bloom filter. 9. The system of claim 7 , wherein the memory device includes further instructions, which when executed by the at least one processor, further cause the at least one processor to perform operations comprising: defining a threshold explosiveness for the aggregation operator such that the aggregation operator turns off at runtime in response to the join operation exceeding the threshold explosiveness. 10. The system of claim 7 , wherein the memory device includes further instructions, which when executed by the at least one processor, further cause the at least one processor to perform operations comprising: causing the aggregation operator to determine a locally observed reduction rate of the aggregation operator.

Assignees

Snowflake Inc

Inventors

Classifications

G06F16/2456
Join operations · CPC title
G06F16/24556
Aggregation; Duplicate elimination · CPC title
G06F16/24545
Selectivity estimation or determination · CPC title
G06F16/24542Primary
Plan optimisation · CPC title
G06F16/24537
of operators · CPC title

Patent family

Related publications grouped by family.

View patent family 74881891

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11144550B2 cover?: The subject technology receives a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and a join operation, the join operation including a build side and a probe side. The subject technology inserts an aggregation operator below the probe side of the join operation. The subject technology causes the build side of the jo…
Who is the assignee on this patent?: Snowflake Inc
What technology area does this patent fall under?: Primary CPC classification G06F16/24542. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Oct 12 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).