Tracking and maintaining expression statistics across database queries
US-2017031967-A1 · Feb 2, 2017 · US
US12579110B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12579110-B2 |
| Application number | US-202117324874-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 19, 2021 |
| Priority date | Feb 24, 2021 |
| Publication date | Mar 17, 2026 |
| Grant date | Mar 17, 2026 |
Official abstract text for this publication.
The present disclosure involves systems, software, and computer implemented methods for improved design and implementation of data access metrics for automated physical database design. An example method includes identifying a database workload for which index advisor access counters are to be tracked. Each SQL statement in the database workload is executed. For each SQL statement, attribute sets are determined for which a selection predicate filters a result for an SQL statement. An output cardinality of each selection predicate is determined. A logarithmic counter for an attribute set corresponding to the selection predicate is determined based on the output cardinality of the selection predicate. The determined logarithmic counter is incremented. Respective values for logarithmic counters of the determined attributes are provided to an index advisor. The index advisor determines attribute sets for which to propose an index based on the logarithmic counters of the respective attribute sets.
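The logarithmic counters described in the abstract can be illustrated with a minimal sketch. It assumes that each attribute set keeps one counter per power-of-two range of output cardinality; the base, the bucket count, and the names `bucket_index`, `LogCounters`, and `propose_indexes` are hypothetical and do not come from the patent. The proposal rule shown (enough executions with a small output cardinality) is likewise only one plausible way an index advisor might read the counters.

```python
from collections import defaultdict

NUM_BUCKETS = 32  # assumption: the abstract does not state a bucket count


def bucket_index(output_cardinality):
    """Map an output cardinality to a bucket using floor(log2(n)).

    Base 2 is an assumption; int.bit_length() avoids floating-point error.
    """
    if output_cardinality < 1:
        return 0
    return min(output_cardinality.bit_length() - 1, NUM_BUCKETS - 1)


class LogCounters:
    """One array of logarithmic counters per attribute set."""

    def __init__(self):
        self._counters = defaultdict(lambda: [0] * NUM_BUCKETS)

    def record(self, attributes, output_cardinality):
        # Increment the counter whose bucket matches the predicate's output size.
        key = frozenset(attributes)
        self._counters[key][bucket_index(output_cardinality)] += 1

    def items(self):
        return self._counters.items()


def propose_indexes(counters, max_bucket=4, min_hits=2):
    """Hypothetical advisor rule: propose an index for an attribute set when
    at least `min_hits` predicate executions returned fewer than
    2 ** (max_bucket + 1) rows."""
    return [
        attrs
        for attrs, buckets in counters.items()
        if sum(buckets[: max_bucket + 1]) >= min_hits
    ]


counters = LogCounters()
counters.record({"customer_id"}, 3)       # selective predicate
counters.record({"customer_id"}, 10)      # selective predicate
counters.record({"status"}, 500_000)      # unselective predicate
proposals = propose_indexes(counters)
```

Under these assumptions only the `customer_id` attribute set is proposed, because both of its recorded predicates fall into low-cardinality buckets.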
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method comprising: identifying a database workload comprising at least one query for which database column access counters are to be maintained; determining a set of all database columns referenced in the database workload by identifying all database column references that are included in the at least one query of the database workload; creating, based on the set of all database columns referenced in the database workload, database column access counters for the database workload that track counts of read accesses of data in respective database columns, including: creating a sequential access counter for each database column referenced in the database workload; and creating a random access counter for each database column referenced in the database workload; tracking, for each database column referenced in the database workload, memory access of data in the database column during execution of the database workload, including: incrementing the sequential access counter for a respective database column in response to determining that a database row that includes data of the database column is sequentially read; and incrementing the random access counter for a respective database column in response to determining that a database row that includes data of the database column is randomly read; providing respective values for the database column access counters for each database column referenced in the database workload to a data compression advisor; identifying, by the data compression advisor, at least one compression proposal rule for determining whether to propose data compression for a given database column, wherein each compression proposal rule is based on values of both the random access counter for the database column, the sequential access counter for the database column, and a rule threshold; determining, by the data compression advisor, for each database column referenced in the database workload, whether to propose data compression for the database column based on evaluating each compression proposal rule based on the respective values for the database column access counters for the database column and the rule threshold for the compression proposal rule; generating, by the data compression advisor, a recommendation to compress a first database column referenced in the database workload based on determining, during evaluation of a first compression proposal rule, that the rule threshold for the first compression proposal rule is satisfied; providing the recommendation to an automatic database tuner; and automatically implementing, by the automatic database tuner, data compression for the first database column based on the recommendation.

2. The computer-implemented method of claim 1, wherein determining whether to propose data compression for a first database column comprises: determining an estimated execution time for the database workload if the first database column is compressed; determining an estimated execution time for the database workload if the first database column is not compressed; and determining to propose data compression for the first database column in response to determining that the estimated execution time for the database workload if the first database column is compressed is less than the estimated execution time for the database workload if the first database column is not compressed.

3. The computer-implemented method of claim 1, wherein evaluating the first compression proposal rule when determining whether to propose data compression for the first database column comprises: comparing the value of the sequential access counter for the first database column to the value of the random access counter for the first database column; and determining whether to propose data compression for the first database column based on comparing the value of the sequential access counter for the first database column to the value of the random access counter for the first database column.

4. The computer-implemented method of claim 3, wherein evaluating the first compression proposal rule when determining whether to propose data compression for the first database column based on comparing the value of the sequential access counter for the first database column to the value of the random access counter for the first database column comprises determining whether the value of the sequential access counter for the first database column is substantially larger than the value of the random access counter for the first database column.

5. The computer-implemented method of claim 1, wherein the first compression proposal rule specifies that a value of a sequential access counter for a database column is substantially larger than a value of a corresponding random access counter for the database column when a ratio of the value of the sequential access counter for the database column to the value of the corresponding random access counter for the database column is more than a predetermined threshold.

6. A system comprising: one or more computers; and a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising: identifying a database workload comprising at least one query for which database column access counters are to be maintained; determining a set of all database columns referenced in the database workload by identifying all database column references that are included in the at least one query of the database workload; creating, based on the set of all database columns referenced in the database workload, database column access counters for the database workload that track counts of read accesses of data in respective database columns, including: creating a sequential access counter for each database column referenced in the database workload; and creating a random access counter for each database column referenced in the database workload; tracking, for each database column referenced in the database workload, memory access of data in the database column during execution of the database workload, including: incrementing the sequential access counter for a respective database column in response to determining that a database row that includes data of the database column is sequentially read; and incrementing the random access counter for a respective database column in response to determining that a database row that includes data of the database column is randomly read; providing respective values for the database column access counters for each database column referenced in the database workload to a data compression advisor; identifying, by the data compression advisor, at least one compression proposal rule for determining whether to propose data compression for a given database column, wherein each compression proposal rule is based on values of both the random access counter for the database column, the sequential access counter for the database column, and a rule threshold; determining, by the data compression advisor, for each database column referenced in the database workload, whether to propose data compression for the database column based on evaluating each compression proposal rule based on the respective values for the database column access counters for the database column and the rule threshold for the compression proposal rule; generating, by the data compression advisor, a recommendation to compress a first database column referenced in the database workload based on determining, during evaluation of a first compression proposal rule, that the rule threshold for the first compression proposal rule is satisfied;
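The per-column sequential and random access counters of claim 1, together with the ratio rule of claims 3 to 5, can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the names `ColumnAccessCounters`, `AccessTracker`, `ratio_rule_satisfied`, and `recommend_compression` are hypothetical, the default threshold of 10 is an arbitrary example value, and the comparison is written as a multiplication so that a column with zero random reads needs no special case.

```python
from dataclasses import dataclass


@dataclass
class ColumnAccessCounters:
    sequential: int = 0  # rows containing the column that were read sequentially
    random: int = 0      # rows containing the column that were read randomly


class AccessTracker:
    """Creates both counters for every column referenced in the workload."""

    def __init__(self, columns):
        self.counters = {column: ColumnAccessCounters() for column in columns}

    def record_read(self, column, sequential):
        counters = self.counters[column]
        if sequential:
            counters.sequential += 1
        else:
            counters.random += 1


def ratio_rule_satisfied(counters, threshold):
    """Rule in the spirit of claim 5: sequential reads are 'substantially
    larger' when sequential / random exceeds the threshold. Written as
    sequential > threshold * random (an assumption) to avoid dividing by zero."""
    return counters.sequential > threshold * counters.random


def recommend_compression(tracker, threshold=10.0):
    """Return the columns for which the hypothetical rule is satisfied."""
    return [
        column
        for column, counters in tracker.counters.items()
        if ratio_rule_satisfied(counters, threshold)
    ]


tracker = AccessTracker(["price", "id"])
for _ in range(100):
    tracker.record_read("price", sequential=True)   # scan-heavy column
for _ in range(2):
    tracker.record_read("price", sequential=False)
for _ in range(5):
    tracker.record_read("id", sequential=True)
for _ in range(50):
    tracker.record_read("id", sequential=False)     # point-lookup-heavy column
recommended = recommend_compression(tracker)
```

Here `price` is read sequentially far more often than randomly and is recommended, while `id` is not. In the claimed method such a recommendation would then be passed to an automatic database tuner, a step this sketch leaves out.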
Database tuning (G06F16/2282 takes precedence; database performance monitoring G06F11/3409) · CPC title
Selection of Compressor · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
with details for schema evolution support · CPC title
Query execution · CPC title