Value range synopsis in column-organized analytical databases
US-2017323003-A1 · Nov 9, 2017 · US
US2023394009A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023394009-A1 |
| Application number | US-202318448512-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 11, 2023 |
| Priority date | Jul 14, 2016 |
| Publication date | Dec 7, 2023 |
| Grant date | — |
Official abstract text for this publication.
A system and method for pruning data based on metadata. The method may include receiving a query that includes a plurality of predicates and identifying one or more applicable files including database data satisfying at least one of the plurality of predicates. Identifying the one or more applicable files includes reading metadata stored in a metadata store that is separate from the database data. The method further includes pruning inapplicable files including database data that does not satisfy at least one of the plurality of predicates to create a reduced set of files, and reading the reduced set of files to execute the query.
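The flow in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the publication: the `MetadataStore` layout (file name mapped to per-column minimum and maximum), the `Predicate` type (a closed interval on one column), and the `read_file` callback. Following the abstract's wording, a file is treated as applicable if it may satisfy at least one predicate; a strictly conjunctive query would require all of them instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical metadata store, kept separate from the data files:
# file name -> {column name: (min value, max value)}.
MetadataStore = Dict[str, Dict[str, Tuple[int, int]]]


@dataclass(frozen=True)
class Predicate:
    """Illustrative predicate: low <= column <= high."""
    column: str
    low: int
    high: int


def may_satisfy(file_meta: Dict[str, Tuple[int, int]], pred: Predicate) -> bool:
    """Return True if the file could contain a row matching the predicate."""
    if pred.column not in file_meta:
        return True  # no metadata for this column, so the file cannot be ruled out
    col_min, col_max = file_meta[pred.column]
    # Two closed intervals overlap when each starts before the other ends.
    return col_min <= pred.high and pred.low <= col_max


def prune_files(store: MetadataStore, predicates: List[Predicate]) -> List[str]:
    """Build the reduced set of files using metadata only (no file is opened)."""
    return [
        name
        for name, meta in store.items()
        if any(may_satisfy(meta, p) for p in predicates)
    ]


def read_reduced_set(
    store: MetadataStore,
    predicates: List[Predicate],
    read_file: Callable[[str], object],
) -> List[object]:
    """Read only the files that survive pruning."""
    return [read_file(name) for name in prune_files(store, predicates)]


# Example with made-up file names and ranges.
store: MetadataStore = {
    "file_a": {"ts": (0, 99)},
    "file_b": {"ts": (100, 199)},
    "file_c": {"ts": (200, 299)},
}
opened: List[str] = []
read_reduced_set(store, [Predicate("ts", 150, 160)], lambda n: opened.append(n))
```

In this example only `file_b` is read, because the ranges recorded for the other two files cannot contain a `ts` value between 150 and 160.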
Opening claim text (preview).
What is claimed is:

1. A method comprising: managing database data into a plurality of partitions of subsets of the database data, wherein the subsets of the database data correspond to sets of files; receiving a query comprising a plurality of clauses; comparing the plurality of clauses in the query to metadata of the set of files of the database data without accessing the set of files; creating a pruned set of partitions by removing, from the plurality of partitions, part of the sets of files in one or more of the subsets of the database data based on the comparison; and using the pruned set of partitions to execute the query.
2. The method of claim 1, wherein, for each of the plurality of partitions, metadata stored in the metadata includes metadata corresponding to multiple columns of the database data and a plurality different types of partition ranges that characterize at least one of the plurality of partitions.
3. The method of claim 1, wherein creating the pruned set of partitions comprises: identifying the partition to be pruned without accessing the data comprising the partition.
4. The method of claim 3, wherein identifying the partitions to be pruned without accessing the data comprising the partition comprises: reading the metadata pertaining to each of the partitions, the metadata stored in the metadata store.
5. The method of claim 2, wherein the metadata includes a minimum and maximum value for each of the multiple columns of the database data and each of the plurality of partitions.
6. The method of claim 1, wherein each partition of the plurality of partitions comprises a discrete selection of the database data.
7. The method of claim 6, further comprising: generating the metadata comprising information for each partition; and storing the metadata in the metadata store separate from the partition.
8. The method of claim 1, further comprising: identifying, from the plurality of partitions, one or more of the subset of the database that do not satisfy a condition of at least one of the plurality of clauses to create a pruned set of partitions, and wherein reading the metadata comprises: determining, based on the metadata, a range of one or more values of database data stored in one of the partitions; and determining whether any data within the range of one or more values of database data stored in the one partition satisfies the condition of at least one of the plurality of clauses.
9. The method of claim 8, wherein the determining whether each of the plurality of partitions comprises: determining that a certain partition satisfies the condition of at least one of the plurality of clauses is based on at least a determination that some data within the range of one or more values stored in the certain partition satisfies the condition of at least one of the plurality of clauses.
10. The method of claim 9, wherein identifying the one or more of the subset of the database that do not satisfy the condition comprises identifying a certain file as being an applicable file in response to a determination that some data within the range of one or more values stored in the certain file satisfies the condition of at least one of the plurality of clauses.
11. A system comprising: a metadata store to store metadata; and a processor, operatively coupled with the metadata store, configured to: manage database data into a plurality of partitions of subsets of the database data, wherein the subsets of the database data correspond to sets of files; receive a query comprising a plurality of clauses; compare the plurality of clauses in the query to metadata of the set of files of the database data without accessing the set of files; create a pruned set of partitions by removing, from the plurality of partitions, part of the sets of files in one or more of the subsets of the database data based on the comparison; and use the pruned set of partitions to execute the query.
12. The system of claim 11, wherein, for each of the plurality of partitions, metadata stored in the metadata includes metadata corresponding to multiple columns of the database data and a plurality different types of partition ranges that characterize at least one of the plurality of partitions.
13. The system of claim 11, wherein the processor is configured to create the pruned set of partitions by: identifying the partition to be pruned without accessing the data comprising the partition.
14. The system of claim 13, wherein identifying the partitions to be pruned without accessing the data comprising the partition comprises: reading the metadata pertaining to each of the partitions, the metadata stored in the metadata store.
15. The system of claim 12, wherein the metadata includes a minimum and maximum value for each of the multiple columns of the database data and each of the plurality of partitions.
16. The system of claim 11, wherein each partition of the plurality of partitions comprises a discrete selection of the database data.
17. The system of claim 16, wherein the processor is further configured to: generate the metadata comprising information for each partition; and store the metadata in the metadata store separate from the partition.
18. The system of claim 11, wherein the processor is further configured to: identify, from the plurality of partitions, one or more of the subset of the database that do not satisfy a condition of at least one of the plurality of clauses to create a pruned set of partitions, and wherein reading the metadata comprises: determine, based on the metadata, a range of one or more values of database data stored in one of the partitions; and determine whether any data within the range of one or more values of database data stored in the one partition satisfies the condition of at least one of the plurality of clauses.
19. A non-transitory computer readable storage media, programmable to execute instructions that, when executed by a processor, cause the processor to: manage database data into a plurality of partitions of subsets of the database data, wherein the subsets of the database data correspond to sets of files; receive a query comprising a plurality of clauses; compare the plurality of clauses in the query to metadata of the set of files of the database data without accessing the set of files; create a pruned set of partitions by removing, from the plurality of partitions, part of the sets of files in one or more of the subsets of the database data based on the comparison; and use the pruned set of partitions to execute the query.
20. The non-transitory computer readable storage media of claim 19, wherein, for each of the plurality of partitions, metadata stored in the metadata includes metadata corresponding to multiple columns of the database data and a plurality different types of partition ranges that characterize at least one of the plurality of partitions.
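The range test described in claims 5, 8, and 9 (a per-column minimum and maximum, and a check of whether any value in that range could satisfy a clause) might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the claimed implementation: the clause shape `(column, operator, value)`, the nested `partitions` layout (partition, then file, then column range), and the decision to combine clauses with AND are all hypothetical choices made here.

```python
from typing import Dict, List, Tuple

Range = Tuple[float, float]        # (minimum, maximum) recorded for one column
Clause = Tuple[str, str, float]    # hypothetical clause shape: (column, operator, value)
FileMeta = Dict[str, Range]        # column name -> Range
# Hypothetical layout: partition name -> {file name -> FileMeta}.
Partitions = Dict[str, Dict[str, FileMeta]]


def range_may_match(op: str, value: float, rng: Range) -> bool:
    """Return True if some value inside [min, max] could satisfy `col op value`."""
    lo, hi = rng
    if op == "=":
        return lo <= value <= hi
    if op == "<":
        return lo < value
    if op == "<=":
        return lo <= value
    if op == ">":
        return hi > value
    if op == ">=":
        return hi >= value
    raise ValueError(f"unsupported operator: {op}")


def file_may_match(meta: FileMeta, clauses: List[Clause]) -> bool:
    """Assumption: clauses are ANDed, so one impossible clause prunes the file."""
    for column, op, value in clauses:
        rng = meta.get(column)
        if rng is not None and not range_may_match(op, value, rng):
            return False
    return True  # a column without metadata cannot be used to prune


def prune_partitions(partitions: Partitions, clauses: List[Clause]) -> Dict[str, List[str]]:
    """Remove non-matching files from each partition using metadata only."""
    pruned: Dict[str, List[str]] = {}
    for part_name, files in partitions.items():
        kept = [f for f, meta in files.items() if file_may_match(meta, clauses)]
        if kept:  # a partition whose files were all removed is dropped
            pruned[part_name] = kept
    return pruned


# Example with made-up partitions, files, and ranges.
partitions: Partitions = {
    "p1": {"f1": {"price": (0, 10)}, "f2": {"price": (11, 20)}},
    "p2": {"f3": {"price": (21, 30)}},
}
clauses: List[Clause] = [("price", ">", 10), ("price", "<=", 20)]
result = prune_partitions(partitions, clauses)
```

The check is conservative by design: a range that overlaps the condition only shows that the file may hold matching rows, so surviving files still have to be scanned, while a range that cannot overlap proves the file can be skipped.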
Delete operations (erasing in storage systems G06F3/0652) · CPC title
Indexing; Data structures therefor; Storage structures · CPC title
Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title
Search customisation based on user profiles and personalisation · CPC title
Join order optimisation · CPC title