Unified data management for database systems
US-9892150-B2 · Feb 13, 2018 · US
US10169361B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10169361-B2 |
| Application number | US-201514942772-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 16, 2015 |
| Priority date | Nov 16, 2015 |
| Publication date | Jan 1, 2019 |
| Grant date | Jan 1, 2019 |
Official abstract text for this publication.
Disclosed is a computer-implemented method of compressing data in a columnar database comprising at least one column partitioned into a plurality of partitions including at least one empty partition and a plurality of filled partitions each comprising data entries associated with a set of parameters having parameter values relevant to the recurrence frequency of the data entry in the partition, the data entries being compressed in accordance with a compression dictionary based on the respective recurrence frequencies of the data entries in the filled partition. The computer-implemented method comprises receiving forecasted parameter values for the set of parameters for an expected set of data entries to be stored in an empty partition of the column; predicting a recurrence frequency of the data entries in the expected set using the forecasted parameter values by evaluating the respective compression dictionaries of the filled partitions with a machine learning algorithm; generating a predictive compression dictionary for the expected set of data entries based on the predicted recurrence frequency of the data entries in the expected set; receiving the expected set of data entries; and compressing at least part of the received expected set of data entries using the predictive compression dictionary. A computer program product and a computer system for implementing such a method are also disclosed.
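The flow described in the abstract can be illustrated with a small sketch. This is not the patented implementation: the nearest-neighbour averaging stands in for the unspecified machine learning algorithm, and the partition layout, the parameter names (`temp_c`, `rain_mm`), the sample entries, and all function names are hypothetical choices made for illustration.

```python
from collections import Counter

def build_dictionary(frequencies):
    """Give the smallest codes to the entries with the highest frequency."""
    ranked = sorted(frequencies, key=lambda e: (-frequencies[e], e))
    return {entry: code for code, entry in enumerate(ranked)}

def predict_frequencies(filled, forecast, k=2):
    """Average the relative entry frequencies of the k filled partitions
    whose parameter values lie closest to the forecast (assumed stand-in
    for the machine learning step)."""
    def distance(params):
        return sum((params[name] - forecast[name]) ** 2 for name in forecast)
    nearest = sorted(filled, key=lambda p: distance(p["params"]))[:k]
    predicted = Counter()
    for part in nearest:
        total = sum(part["freq"].values())
        for entry, count in part["freq"].items():
            predicted[entry] += count / total / len(nearest)
    return dict(predicted)

def compress(entries, dictionary):
    """Replace known entries with their codes; keep unknown entries as literals."""
    return [dictionary.get(e, e) for e in entries]

# Hypothetical filled partitions: parameter values plus observed entry counts.
filled = [
    {"params": {"temp_c": 30.0, "rain_mm": 0.0},
     "freq": {"ice cream": 60, "sunscreen": 30, "umbrella": 10}},
    {"params": {"temp_c": 28.0, "rain_mm": 1.0},
     "freq": {"ice cream": 50, "sunscreen": 40, "umbrella": 10}},
    {"params": {"temp_c": 12.0, "rain_mm": 20.0},
     "freq": {"umbrella": 70, "ice cream": 20, "sunscreen": 10}},
]
forecast = {"temp_c": 29.0, "rain_mm": 0.5}  # forecasted parameter values

# Build the dictionary before any data for the empty partition arrives.
predictive_dictionary = build_dictionary(predict_frequencies(filled, forecast))

# Later, the expected set arrives and is compressed with that dictionary.
incoming = ["ice cream", "umbrella", "ice cream", "scarf"]
compressed = compress(incoming, predictive_dictionary)
```

Because the dictionary exists before the data arrives, the incoming entries can be encoded in a single pass. Entries the prediction missed (`"scarf"` here) pass through uncompressed in this sketch.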
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method of compressing data in a columnar database comprising at least one column partitioned into a plurality of partitions including at least one empty partition and a plurality of filled partitions each comprising data entries associated with a set of parameters having parameter values relevant to a recurrence frequency of the data entry in the partition, the data entries being compressed in accordance with a compression dictionary based on respective recurrence frequencies of the data entries in the filled partition, the computer-implemented method comprising: receiving forecasted parameter values for the set of parameters for an expected set of data entries to be stored in an empty partition of the column; predicting a recurrence frequency of the data entries in the expected set using the forecasted parameter values by evaluating data entry ranking histories associated with the respective compression dictionaries of the filled partitions with a machine learning algorithm, wherein: evaluating the data entry ranking histories associated with the respective compression dictionaries of the filled partitions with the machine learning algorithm comprises detecting a correlation between patterns in the parameter values associated with the respective compression dictionaries and the associated data entry ranking histories; predicting the recurrence frequency of the data entries in the expected set further comprises detecting further patterns in the forecasted parameter values and comparing the detected further patterns with the patterns detected in the parameter values associated with the respective compression dictionaries; generating a predictive compression dictionary for the expected set of data entries based on the predicted recurrence frequency of the data entries in the expected set; receiving the expected set of data entries; and compressing at least part of the received expected set of data entries using the predictive compression dictionary.

2. The computer-implemented method of claim 1, in which a parameter value of each parameter associated with a data entry is stored in a separate column of the columnar database.

3. The computer-implemented method of claim 1, further comprising: compressing a defined fraction of the received expected set of data entries using the predictive compression dictionary; calculating a compression ratio for the compressed defined fraction of the received expected set of data entries; comparing the compression ratio with a target value; and, if a difference between the target value and the compression ratio is within a defined range: compressing the received expected set of data entries using the predictive compression dictionary; and storing the compressed received expected set of data entries in the empty partition.

4. The computer-implemented method of claim 3, further comprising, if a difference between the target value and the compression ratio is outside the defined range: determining respective recurrence frequencies of the data entries in the defined fraction of the received expected set; generating an actual compression dictionary for the defined fraction of the received expected set based on the determined respective recurrence frequencies of the data entries in the defined fraction of the received expected set; augmenting the predictive compression dictionary for the expected set of data entries based on an evaluation of the actual compression dictionary; compressing the defined fraction of the received expected set of data entries using the augmented predictive compression dictionary; calculating a further compression ratio for the defined fraction of the received expected set of data entries compressed using the augmented predictive compression dictionary; comparing the further compression ratio with the target value; and, if a difference between the target value and the further compression ratio is within the defined range: compressing the received expected set of data entries using the augmented predictive compression dictionary; and storing the compressed received expected set of data entries in the empty partition.

5. The computer-implemented method of claim 3, further comprising locking the columnar database during storing the compressed received expected set of data entries in the empty partition.

6. The computer-implemented method of claim 1, in which the set of parameters includes at least one of meteorological parameters, economic parameters and temporal parameters.

7. A computer program product comprising a computer readable storage medium having computer readable program instructions embodied therewith for, when executed on a computer system for managing a columnar database comprising at least one column partitioned into a plurality of partitions including at least one empty partition and a plurality of filled partitions each comprising data entries associated with a set of parameters having parameter values relevant to a recurrence frequency of the data entry in the partition, the data entries being compressed in accordance with a compression dictionary based on respective recurrence frequencies of the data entries in the filled partition and comprising a processor arrangement adapted to execute the computer readable program instructions, cause the processor arrangement to: receive forecasted parameter values for the set of parameters for an expected set of data entries to be stored in an empty partition of the column; predict a recurrence frequency of the data entries in the expected set using the forecasted parameter values by evaluating data entry ranking histories associated with the respective compression dictionaries of the filled partitions with a machine learning algorithm, wherein: evaluating the data entry ranking histories associated with the respective compression dictionaries of the filled partitions with the machine learning algorithm comprises detecting a correlation between patterns in the parameter values associated with the respective compression dictionaries and the associated data entry ranking histories; predicting the recurrence frequency of the data entries in the expected set further comprises detecting further patterns in the forecasted parameter values and comparing the detected further patterns with the patterns detected in the parameter values associated with the respective compression dictionaries; generate a predictive compression dictionary for the expected set of data entries based on the predicted recurrence frequency of the data entries in the expected set; receive the expected set of data entries; and compress at least part of the received expected set of data entries using the predictive compression dictionary.

8. The computer program product of claim 7, in which the computer readable program instructions further cause the processor arrangement to: compress a defined fraction of the received expected set of data entries using the predictive compression dictionary; calculate a compression ratio for the compressed defined fraction of the received expected set of data entries; compare the compression ratio with a target value; and, if a difference between the target value and the compression ratio is within a defined range: compress the received expected set of data entries using the predictive compression dictionary; and store the compressed received expected set of data entries in the empty partition.

9. The computer program product of claim 8, in which the computer readable program instructions further cause the processor arrangement to, if a difference between the target value and the compression ratio is outside the defined range: determine respective r
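The sample-based check in claims 3 and 4 (mirrored in claims 8 and 9) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the one-byte code size, the absolute-difference tolerance test, the "append missing entries" augmentation strategy, and every function and parameter name are hypothetical. The sketch also assumes the predictive dictionary uses contiguous codes starting at 0.

```python
from collections import Counter

def build_dictionary(entries):
    """Actual dictionary: rank entries by their observed frequency."""
    counts = Counter(entries)
    ranked = sorted(counts, key=lambda e: (-counts[e], e))
    return {entry: code for code, entry in enumerate(ranked)}

def compression_ratio(entries, dictionary, code_bytes=1):
    """Raw size divided by encoded size (assumed fixed-width codes;
    entries missing from the dictionary are stored at full length)."""
    raw = sum(len(e.encode("utf-8")) for e in entries)
    encoded = sum(code_bytes if e in dictionary else len(e.encode("utf-8"))
                  for e in entries)
    return raw / encoded

def choose_dictionary(entries, predictive, target, tolerance, fraction=0.25):
    """Test the predictive dictionary on a defined fraction of the data.
    Return (dictionary, ratio on the sample)."""
    sample = entries[: max(1, int(len(entries) * fraction))]
    ratio = compression_ratio(sample, predictive)
    if abs(target - ratio) <= tolerance:
        return predictive, ratio  # prediction is good enough
    # Otherwise augment: keep the predicted codes and append codes for
    # sample entries the prediction missed (one possible augmentation).
    actual = build_dictionary(sample)
    augmented = dict(predictive)
    for entry in sorted(actual, key=actual.get):
        if entry not in augmented:
            augmented[entry] = len(augmented)
    return augmented, compression_ratio(sample, augmented)

predictive = {"ice cream": 0, "sunscreen": 1}

# Case 1: the prediction matches the incoming data.
good, good_ratio = choose_dictionary(
    ["ice cream"] * 4, predictive, target=8.0, tolerance=1.0, fraction=0.5)

# Case 2: the prediction misses the dominant entry and is augmented.
fixed, fixed_ratio = choose_dictionary(
    ["umbrella"] * 6 + ["ice cream"] * 2, predictive,
    target=8.0, tolerance=1.0, fraction=0.5)
```

The caller would compare the returned ratio with the target again before compressing and storing the full set, as claim 4 describes for the further compression ratio.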
Physics · mapped topic