Efficient database query evaluation

US11971856B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11971856-B2
Application number  US-202016779366-A
Country             US
Kind code           B2
Filing date         Jan 31, 2020
Priority date       Jan 31, 2020
Publication date    Apr 30, 2024
Grant date          Apr 30, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Data in a micro-partition of a table is stored in a compressed form. In response to a database query on the table comprising a filter, the portion of the data on which the filter operates is decompressed, without decompressing other portions of the data. Using the filter on the decompressed portion of the data, the portions of the data that are responsive to the filter are determined and decompressed. The responsive data is returned in response to the database query. When a query is run on a table that is compressed using dictionary compression, the uncompressed data may be returned along with the dictionary look-up values. The recipient of the data may use the dictionary look-up values for memoization, reducing the amount of computation required to process the returned data.
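
The flow described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `DictColumn`, `filtered_scan`, `memoized_apply`, and the `_code` field are hypothetical, every column is assumed to be dictionary-compressed, and a micro-partition is modeled as a plain dict of columns.

```python
from dataclasses import dataclass


@dataclass
class DictColumn:
    """Hypothetical dictionary-compressed column: row codes plus a code -> value list."""
    codes: list
    dictionary: list

    def decompress(self):
        # Expand every row of this one column to its uncompressed value.
        return [self.dictionary[c] for c in self.codes]


def filtered_scan(partition, filter_col, predicate):
    """Decompress the filter column, then only the rows that pass the filter."""
    col = partition[filter_col]
    values = col.decompress()  # other columns stay compressed at this point
    hits = [i for i, v in enumerate(values) if predicate(v)]
    result = []
    for i in hits:
        # Only responsive rows are expanded across the remaining columns.
        row = {name: c.dictionary[c.codes[i]] for name, c in partition.items()}
        row["_code"] = col.codes[i]  # dictionary look-up value returned with the data
        result.append(row)
    return result


def memoized_apply(rows, column, fn):
    """Recipient side: compute fn once per distinct dictionary code."""
    cache = {}
    calls = 0
    out = []
    for row in rows:
        code = row["_code"]
        if code not in cache:
            cache[code] = fn(row[column])
            calls += 1
        out.append(cache[code])
    return out, calls


partition = {
    "city": DictColumn([0, 1, 0, 2, 0], ["Oslo", "Lima", "Pune"]),
    "item": DictColumn([0, 1, 1, 0, 2], ["pen", "ink", "pad"]),
}
rows = filtered_scan(partition, "city", lambda v: v != "Lima")
upper, calls = memoized_apply(rows, "city", str.upper)
print(upper, calls)
```

In this toy data four rows pass the filter but only two distinct codes appear among them, so `str.upper` runs twice rather than four times. The saving grows with the cost of the function and the number of repeated values.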

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a memory that stores instructions; and one or more processors configured by the instructions to perform operations comprising: accessing a first operation for a table of a database the first operation for the table comprising a filter on a first column of the table, the table being stored in a plurality of micro-partitions, a first micro-partition of the plurality of micro-partitions of the table being compressed according to a first compression algorithm and a second micro-partition of the plurality of micro-partitions of the table being compressed according to a second compression algorithm; receiving a query comprising multiple filters comprising the filter; selecting whether to apply the filter within the database or by a central server, the selecting whether to apply the filter being performed on a filter-by-filter basis for each of the multiple filters, wherein a first filter is applied by the database and a second filter is applied by the central server; in response to selecting to apply the filter within the database, transmitting the query comprising the filter to the database; in response to selecting to apply the filter by the central server, transmitting a request for unfiltered data from the database and applying the filter to the unfiltered data received from the database, the request excluding the filter; decompressing, in the first micro-partition, the first column of the table excluding decompressing other columns of the table in the first micro-partition; decompressing, based on the filter on the first column, rows of the first micro-partition of the table that contains data responsive to the filter excluding decompressing other rows of the first micro-partition of the table that contains data not responsive to the filter; providing, in response to the first operation for the table, the decompressed rows of the first micro-partition; accessing a second operation for the table, the second operation comprising determining a second computation result on the first column of the table; computing, for a first entry in the rows of the first micro-partition, a first computation result on a first value of the first column of the first entry; storing the first computation result for the first entry in conjunction with a first compressed value for the first entry; and based on a second compressed value for a second entry of the table being identical to the first compressed value for the first entry, storing the first computation result as the second computation result instead of computing the second computation result using the second compressed value for the second entry.

2. The system of claim 1, wherein each micro-partition of the plurality of micro-partitions is a file on a file system, wherein the first micro-partition is compressed using dictionary compression and the second micro-partition is compressed using run-length encoding.

3. The system of claim 1, wherein the filter comprises a value for the first column, wherein a first set of columns of the first micro-partition is compressed according to the first compression algorithm and a second set of columns of the first micro-partition is compressed according to a third compression algorithm.

4. The system of claim 1, wherein decompressing the first column comprises: accessing a compressed value for each entry in the first micro-partition for the first column; accessing a dictionary that maps compressed values to uncompressed values; and using the dictionary, determining an uncompressed value for each compressed value of the entries in the first micro-partition.

5. The system of claim 4, wherein the operations further comprise: providing, in response to the operation for the table, the compressed value for the first column for each entry in the decompressed rows of the first micro-partition.

6. The system of claim 1, wherein the operations further comprise: performing an aggregation operation on the table by performing operations comprising: aggregating entries in the table to create a first aggregated data structure comprising aggregated entries; based on a predetermined threshold and a number of entries in the first aggregated data structure: transferring the aggregated entries from the first aggregated data structure to a second aggregated data structure; and clearing the aggregated entries in the first aggregated data structure; and resuming aggregating the entries in the table in the first aggregated data structure.

7. The system of claim 6, wherein decompressing the first column comprises: accessing a compressed value for each entry in the first micro-partition for the first column; accessing a dictionary that maps compressed values to uncompressed values; and using the dictionary, determining the uncompressed value for each compressed value of the entries in the first micro-partition; the operations further comprise: providing, in response to the operation for the table, the compressed value for the first column for each entry in the decompressed rows of the first micro-partition; and wherein the aggregating of the entries in the table to create the first aggregated data structure determines to combine a first entry with a second entry based on the compressed value of the first entry being identical to the compressed value of the second entry.

8. The system of claim 1, wherein: the operations further comprise: decompressing a first column in the second micro-partition excluding decompressing other columns of the table in the second micro-partition; and decompressing, based on the filter on the first column, rows of the second micro-partition containing data responsive to the filter excluding decompressing other rows of the second micro-partition containing data not responsive to the filter; and combining the decompressed rows of the micro-partition with the decompressed rows of the second micro-partition for provision in response to the operation for the table.

9. A non-transitory machine-readable medium that stores instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising: accessing a first operation for a table of a database the first operation for the table comprising a filter on a first column of the table, the table being stored in a plurality of micro-partitions, a first micro-partition of the plurality of micro-partitions of the table being compressed according to a first compression algorithm and a second micro-partition of the plurality of micro-partitions of the table being compressed according to a second compression algorithm; receiving a query comprising multiple filters comprising the filter; selecting whether to apply the filter within the database or by a central server, the selecting whether to apply the filter being performed on a filter-by-filter basis for each of the multiple filters, wherein a first filter is applied by the database and a second filter is applied by the central server; in response to selecting to apply the filter within the database, transmitting the query comprising the filter to the database; in response to selecting to apply the filter by the central server, transmitting a request for unfiltered data from the database and applying the filter to the unfiltered data received from the database, the request excluding the filter; decompressing, in the first micro-partition, the first column of the table excluding decompressing other columns of the table in the first micro-partition; decompressing, based on the filter on the first column, rows of the first micro-partition of the table that contains data responsive to the filter excluding decompressing other rows o
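
The two-structure aggregation recited in claim 6 can be sketched as follows. This is a rough illustration only: `aggregate_with_spill` is a hypothetical name, counting is used as a stand-in aggregate, the "greater than or equal to the threshold" test is one possible reading of "based on a predetermined threshold and a number of entries", and the final merge of leftover entries is an assumption of this sketch. The keys could be dictionary codes, in line with claim 7, where entries are combined when their compressed values are identical.

```python
from collections import Counter


def aggregate_with_spill(keys, threshold):
    """Count keys in a small first structure, spilling to a second one when it fills."""
    first = Counter()   # first aggregated data structure (kept small)
    second = Counter()  # second aggregated data structure (receives transfers)
    transfers = 0
    for key in keys:
        first[key] += 1
        # Assumed trigger: the number of distinct entries reaches the threshold.
        if len(first) >= threshold:
            second.update(first)  # transfer the aggregated entries
            first.clear()         # clear the first structure
            transfers += 1
        # Aggregation then resumes in the (now empty) first structure.
    second.update(first)  # assumption: merge whatever remains at the end
    return second, transfers


# Keys stand in for compressed (dictionary) values of the grouping column.
totals, transfers = aggregate_with_spill(["a", "b", "a", "c", "b", "a"], threshold=2)
print(dict(totals), transfers)
```

Keeping the first structure small is the design point here: it plausibly stays cheap to probe, while the second structure absorbs the accumulated totals. The claim text itself does not state the motivation.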

Assignees

Snowflake Inc

Inventors

Classifications

  • using compression, e.g. sparse files · CPC title

  • G06F16/221 · Primary

    Column-oriented storage; Management thereof · CPC title

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Data partitioning, e.g. horizontal or vertical partitioning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11971856B2 cover?
Data in a micro-partition of a table is stored in a compressed form. In response to a database query on the table comprising a filter, the portion of the data on which the filter operates is decompressed, without decompressing other portions of the data. Using the filter on the decompressed portion of the data, the portions of the data that are responsive to the filter are determined and decomp…
Who is the assignee on this patent?
Snowflake Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/1744. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 30, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).