Reclustering of database tables using level information
US-10963443-B2 · Mar 30, 2021 · US
US11544244B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11544244-B2 |
| Application number | US-202217654296-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 10, 2022 |
| Priority date | Jul 17, 2018 |
| Publication date | Jan 3, 2023 |
| Grant date | Jan 3, 2023 |
Official abstract text for this publication.
Disclosed herein are embodiments of systems and methods for selecting partitions for reclustering based on distribution of overlapping partitions. In an example, a database platform makes a determination to at least partially recluster a database table that includes data stored across a plurality of partitions. The database platform responsively selects a subset of the partitions. The selecting of the subset includes identifying a point on a domain of a clustering key that corresponds to a local maximum of overlapping partitions, and also includes selecting the subset from among a group of overlapping partitions. The group includes at least one partition that overlaps the identified point on the domain of the clustering key. Each partition in the selected subset is above a reduction goal of overlapping partitions. The database platform at least partially reclusters the selected subset based on the clustering key.
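The selection step in the abstract can be illustrated with a short sketch. It is an illustration under stated assumptions, not the patented implementation. The names `Partition`, `peak_overlap_point`, and `select_for_reclustering` are hypothetical, and key ranges are assumed to be inclusive integer intervals. The sweep finds the global maximum of overlap depth, which is one possible local maximum. "Above the reduction goal" is modeled here by treating the overlapping group as a stack in input order and taking the entries beyond the goal height. That stacking order is an assumption the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    lo: int  # smallest clustering-key value stored in the partition
    hi: int  # largest clustering-key value stored in the partition (inclusive)


def peak_overlap_point(partitions):
    """Sweep the key domain and return (point, depth) at the deepest overlap."""
    events = []
    for p in partitions:
        events.append((p.lo, 0, +1))  # range opens; sorts before a close at the same key
        events.append((p.hi, 1, -1))  # range closes
    events.sort()

    depth = best_depth = 0
    best_point = None
    for key, _, delta in events:
        depth += delta
        if depth > best_depth:
            best_depth, best_point = depth, key
    return best_point, best_depth


def select_for_reclustering(partitions, reduction_goal):
    """Return (point, subset): the peak and the partitions stacked above the goal."""
    point, depth = peak_overlap_point(partitions)
    if point is None or depth <= reduction_goal:
        return point, []  # overlap is already at or below the goal
    # Group of partitions whose key range covers the identified point.
    group = [p for p in partitions if p.lo <= point <= p.hi]
    # Assumption: input order is the stacking order; keep entries above the goal.
    return point, group[reduction_goal:]
```

With partitions A [0, 10], B [5, 15], C [8, 12], and D [20, 30], the deepest overlap is three partitions at key 8, and a reduction goal of 1 selects B and C in this model.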
Opening claim text (preview).
What is claimed is:
1. A method performed by a database platform executing instructions on at least one hardware processor, the method comprising: making an incremental-reclustering determination to at least partially recluster a database table, the database table comprising table data stored across a plurality of partitions, the database table further comprising a clustering key; selecting, in response to making the incremental-reclustering determination, a subset of the plurality of partitions, the selecting of the subset comprising: identifying a point on a domain of the clustering key that corresponds to a local maximum number of overlapping partitions; and selecting the subset from among a group of overlapping partitions, the group of overlapping partitions including at least one partition that overlaps the identified point on the domain of the clustering key, each partition in the selected subset being above a reduction goal that is measured in number of overlapping partitions; and at least partially reclustering the selected subset based on the clustering key.
2. The method of claim 1, wherein making the incremental-reclustering determination comprises making the incremental-reclustering determination based on a budget of one or more available computing resources.
3. The method of claim 1, wherein the local maximum number of overlapping partitions is also a global maximum number of overlapping partitions on the domain of the clustering key.
4. The method of claim 1, wherein the local maximum number of overlapping partitions is not a global maximum number of overlapping partitions on the domain of the clustering key.
5. The method of claim 1, wherein selecting the subset comprises selecting partitions from among an uppermost subgroup of the group of overlapping partitions that are above the reduction goal.
6. The method of claim 1, wherein selecting the subset comprises selecting partitions from among the group of overlapping partitions that are both above the reduction goal and that have the greatest widths among the partitions in the group of overlapping partitions that are above the reduction goal.
7. The method of claim 1, wherein at least partially reclustering the selected subset based on the clustering key comprises reclustering the entire selected subset based on the clustering key.
8. The method of claim 7, wherein reclustering the entire selected subset comprises distributing the entire selected subset among a plurality of workers to be reclustered.
9. The method of claim 1, wherein the making of the incremental-reclustering determination to at least partially recluster the database table is based on determining that at least a threshold number of modifications have been made to the database table since a previous reclustering operation.
10. The method of claim 1, wherein the making of the incremental-reclustering determination to at least partially recluster the database table is based on one or more clustering metrics of the database table.
11. A database platform comprising: at least one hardware processor; and one or more non-transitory computer readable storage media containing instructions that, when executed by the at least one hardware processor, cause the database platform to perform operations comprising: making an incremental-reclustering determination to at least partially recluster a database table, the database table comprising table data stored across a plurality of partitions, the database table further comprising a clustering key; selecting, in response to making the incremental-reclustering determination, a subset of the plurality of partitions, the selecting of the subset comprising: identifying a point on a domain of the clustering key that corresponds to a local maximum number of overlapping partitions; and selecting the subset from among a group of overlapping partitions, the group of overlapping partitions including at least one partition that overlaps the identified point on the domain of the clustering key, each partition in the selected subset being above a reduction goal that is measured in number of overlapping partitions; and at least partially reclustering the selected subset based on the clustering key.
12. The database platform of claim 11, wherein making the incremental-reclustering determination comprises making the incremental-reclustering determination based on a budget of one or more available computing resources.
13. The database platform of claim 11, wherein the local maximum number of overlapping partitions is also a global maximum number of overlapping partitions on the domain of the clustering key.
14. The database platform of claim 11, wherein the local maximum number of overlapping partitions is not a global maximum number of overlapping partitions on the domain of the clustering key.
15. The database platform of claim 11, wherein selecting the subset comprises selecting partitions from among an uppermost subgroup of the group of overlapping partitions that are above the reduction goal.
16. The database platform of claim 11, wherein selecting the subset comprises selecting partitions from among the group of overlapping partitions that are both above the reduction goal and that have the greatest widths among the partitions in the group of overlapping partitions that are above the reduction goal.
17. The database platform of claim 11, wherein at least partially reclustering the selected subset based on the clustering key comprises reclustering the entire selected subset based on the clustering key.
18. The database platform of claim 17, wherein reclustering the entire selected subset comprises distributing the entire selected subset among a plurality of workers to be reclustered.
19. The database platform of claim 11, wherein the making of the incremental-reclustering determination to at least partially recluster the database table is based on determining that at least a threshold number of modifications have been made to the database table since a previous reclustering operation.
20. The database platform of claim 11, wherein the making of the incremental-reclustering determination to at least partially recluster the database table is based on one or more clustering metrics of the database table.
21. One or more non-transitory computer readable storage media containing instructions that, when executed by at least one hardware processor of a database platform, cause the database platform to perform operations comprising: making an incremental-reclustering determination to at least partially recluster a database table, the database table comprising table data stored across a plurality of partitions, the database table further comprising a clustering key; selecting, in response to making the incremental-reclustering determination, a subset of the plurality of partitions, the selecting of the subset comprising: identifying a point on a domain of the clustering key that corresponds to a local maximum number of overlapping partitions; and selecting the subset from among a group of overlapping partitions, the group of overlapping partitions including at least one partition that overlaps the identified point on the domain of the clustering key, each partition in the selected subset being above a reduction goal that is measured in number of overlapping partitions; and at least partially reclustering the selected subset based on the clustering key.
22. The one or more non-transitory computer readabl
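Some of the dependent claims describe variations that are easy to picture in code: a modification-count trigger (claims 9 and 19), a preference for the widest partitions above the reduction goal (claims 6 and 16), and distribution of the selected subset among workers (claims 8 and 18). The following is a minimal sketch of those three ideas, not the claimed implementation. Every name in it (`Partition`, `should_recluster`, `widest_above_goal`, `distribute`) is hypothetical. The `limit` parameter is an assumed stand-in for a resource budget, and round-robin assignment is only one possible way to spread work across workers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    name: str
    lo: int  # smallest clustering-key value in the partition
    hi: int  # largest clustering-key value in the partition

    @property
    def width(self):
        # Width of the partition's range on the clustering-key domain.
        return self.hi - self.lo


def should_recluster(modifications_since_last, threshold):
    """Trigger when at least `threshold` modifications occurred since the last run."""
    return modifications_since_last >= threshold


def widest_above_goal(above_goal, limit):
    """Pick up to `limit` partitions with the greatest widths (assumed budget cap)."""
    return sorted(above_goal, key=lambda p: p.width, reverse=True)[:limit]


def distribute(selected, worker_count):
    """Assign every selected partition to a worker, round-robin (an assumption)."""
    batches = [[] for _ in range(worker_count)]
    for i, partition in enumerate(selected):
        batches[i % worker_count].append(partition)
    return batches
```

Favoring wide partitions is plausible because a wide key range tends to overlap many neighbors, so rewriting it can lower overlap depth across more of the domain. The claims themselves do not state this rationale.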
Tablespace storage structures; Management thereof · CPC title
Indexing; Data structures therefor; Storage structures · CPC title
Clustering or classification · CPC title
Sorting, i.e. grouping record carriers in numerical or other ordered sequence according to the classification of at least some of the information they carry (by merging two or more sets of carriers in ordered sequence G06F7/16) · CPC title
Database tuning (G06F16/2282 takes precedence; database performance monitoring G06F11/3409) · CPC title