In-memory column-level multi-versioned global dictionary for in-memory databases
US-2017109406-A1 · Apr 20, 2017 · US
US11294816B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11294816-B2 |
| Application number | US-201715397714-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 3, 2017 |
| Priority date | Oct 15, 2015 |
| Publication date | Apr 5, 2022 |
| Grant date | Apr 5, 2022 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Techniques are described herein for reducing the number of redundant evaluations that occur when an expression is evaluated against an encoded column vector by caching results of expression evaluations. When executing a query that includes an expression that references columns for which dictionary-encoded column vectors exist, the database server performs a cost-based analysis to determine which expressions (or sub-expressions) would benefit from caching the expression's evaluation result. For each such expression, the database server performs the necessary computations and caches the results for each of the possible distinct input values. When evaluating an expression for a row with a particular set of input codes, a look-up is performed based on the input code combination to retrieve the pre-computed results of that evaluation from the cache.
Opening claim text (preview).
What is claimed is: 1. A method comprising: maintaining a table on persistent storage; wherein the table includes a first column; wherein the table comprises a particular row that includes a first value for the first column; based on a dictionary, creating, in volatile memory, a dictionary-encoded column vector that contains codes that correspond to values from a range of rows of the first column; wherein the dictionary-encoded column vector stores a first code in a particular entry that corresponds to the particular row; receiving a query that includes a first expression that references the first column; creating in volatile memory, for the first expression, a data structure; evaluating the first expression for the first code by: reading the first code from the particular entry; using the dictionary to determine that the first code corresponds to the first value; and computing a first computed result of the first expression by evaluating the first expression based, at least in part, on the first value; using the first code to index into the data structure to locate a first index entry, in the data structure, that is associated with the first code; storing the first computed result of the first expression in the first index entry; during execution of the query, evaluating the first expression for a second row of the table that is associated with a second entry, in the dictionary-encoded column vector, that stores a second instance of the first code by performing the steps of: reading the second instance of the first code from the second entry, of the dictionary-encoded column vector, that corresponds to the second row of the table; using the first code obtained from the second entry of the dictionary-encoded column vector to index into the data structure to locate the first index entry that stores the first computed result; and reading the first computed result from the first index entry; during execution of the query and after evaluating the first expression for the second row of the table, evaluating the first expression for a third row of the table that is associated with a third entry, in the dictionary-encoded column vector, that stores a second code, by performing the steps of: determining that the second code represents an input for which pre-computed results have not yet been generated for the first expression; in response to determining that the second code represents an input for which pre-computed results have not yet been generated for the first expression: evaluating the first expression for the second code by: using the dictionary to determine that the second code corresponds to a second value, and computing a second computed result of the first expression by evaluating the first expression based, at least in part, on the second value; and storing the second computed result of the first expression in a second index entry associated with the second code; after execution of the query, returning results of the query; wherein the method is performed by one or more computing devices. 2. The method of claim 1 wherein: the first expression references N columns; and the data structure is an N-dimensional array. 3. The method of claim 1 further comprising: performing a test on each expression, of a plurality of expressions contained in the query, to determine which expressions of the plurality of expressions qualify to be cached-result expressions; and creating the data structure in response to determining that the first expression qualifies to be a cached-result expression. 4. The method of claim 3 wherein, for the first expression, the test is based, at least in part, on cardinality of the first column for the range of rows. 5. The method of claim 3 , wherein, for the first expression, the test is based, at least in part, on whether the first expression is a sub-expression that occurs more than once in the query. 6. The method of claim 3 wherein: the first column is one of a plurality of columns referenced in the first expression; and each column, of the plurality of columns, has a corresponding dictionary with which values of said each column have been encoded; and for the first expression, the test is based, at least in part, on a number of all possible combinations of codes from the dictionaries that correspond to the plurality of columns. 7. The method of claim 3 wherein: the first column is one of a plurality of columns referenced in the first expression; and for the first expression, the test is based, at least in part, on a number of distinct input combinations, of the plurality of columns referenced in the first expression, actually present in the table for the range of rows. 8. The method of claim 3 wherein: the first column is one of a plurality of columns referenced in the first expression; the query has a WHERE condition; and for the first expression, the test is based, at least in part, on a number of distinct input combinations, for the first expression, present in the actual rows, within the range of rows, that satisfy the WHERE condition. 9. The method of claim 1 wherein the first expression references a second column of the table that has a corresponding column vector that is not dictionary encoded. 10. The method of claim 1 wherein the first expression references a second column of the table that is not mirrored in volatile memory. 11. A method comprising: maintaining a table on persistent storage; wherein the table includes a first column; wherein the table comprises a particular row that includes a first value for the first column; based on a dictionary, creating, in volatile memory, a dictionary-encoded column vector that contains codes that correspond to values from a range of rows of the first column; wherein the dictionary-encoded column vector stores a first code in a particular entry that corresponds to the particular row; receiving a query that includes a first expression that references the first column; creating in volatile memory, for the first expression, a first structure; evaluating the first expression for the first code by: using the dictionary to determine that the first code in the particular entry corresponds to the first value; and computing a first computed result of the first expression by evaluating the first expression based, at least in part, on the first value; using the first code to index into the first structure to locate a first index entry, in the first structure, that is associated with the first code; storing the first computed result of the first expression in the first index entry; during execution of the query, evaluating the first expression for a second row of the table that is associated with a second entry, in the dictionary-encoded column vector, that stores a second instance of the first code by performing the steps of: reading the second instance of the first code from the second entry, of the dictionary-encoded column vector, that corresponds to the second row of the table, using the first code obtained from the second entry of the dictionary-encoded column vector to index into the first structure to locate the first index entry that stores the first computed result, and reading the first computed result from the first index entry; wherein the first expression is a subexpression of a second expression of the query; creating in volatile memory, for the second expression, a second structure; evaluating the second expression for the first code and one or more input values by: using the first code to index into the first structure to locate the first index entry; and reading the first computed result from the first index entry in
Query translation · CPC title
Details of cache memory · CPC title
with dedicated cache, e.g. instruction or stack · CPC title
Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers {sorting methods in general}(G06F7/36 takes precedence) · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.