Method of reusing existing statistics to load database tables

US9916318B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9916318-B2
Application number: US-201514982159-A
Country: US
Kind code: B2
Filing date: Dec 29, 2015
Priority date: Dec 29, 2015
Publication date: Mar 13, 2018
Grant date: Mar 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An indication to load data into a database table is received. A determination is made whether an existing set of frequency distribution statistics is available. In response to determining that an existing set of frequency distribution statistics is available, the data is loaded into the database table using the existing set of frequency distribution statistics.
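
The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that frequency distribution statistics are a mapping from value to count and that "loading using the statistics" means building a dictionary of integer codes from them, as the claims suggest. The names `build_dictionary` and `load_column` are hypothetical, as is the fallback of appending codes for values the existing statistics do not cover.

```python
from collections import Counter


def build_dictionary(stats):
    """Assign integer codes to values, most frequent value first."""
    # Ties are broken by the string form of the value to keep the order stable.
    ordered = sorted(stats.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return {value: code for code, (value, _count) in enumerate(ordered)}


def load_column(values, existing_stats=None):
    """Encode values, reusing existing statistics when they are available.

    Returns (encoded values, dictionary, source of the statistics).
    """
    if existing_stats is not None:
        # Statistics were produced elsewhere, before the load: reuse them.
        stats, source = existing_stats, "existing"
    else:
        # No statistics available: generate them from the data itself.
        stats, source = Counter(values), "generated"
    dictionary = build_dictionary(stats)
    # Assumption: a value missing from the statistics gets the next free code.
    encoded = [dictionary.setdefault(v, len(dictionary)) for v in values]
    return encoded, dictionary, source


values = ["NY", "CA", "NY", "TX", "NY", "CA"]
print(load_column(values, {"NY": 3, "CA": 2, "TX": 1}))
print(load_column(values))
```

Both calls produce the same codes here because the supplied statistics match the data exactly; only the reported source differs.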

First claim

Opening claim text (preview).

What is claimed is:

1. A method for using existing frequency distribution statistics to load data into database tables, the method comprising: receiving, by one or more computer processors, an indication to load data into a database table included in a database management system, wherein the data consists of a first portion of data and a second portion of data; determining, by one or more computer processors, whether an existing set of frequency distribution statistics for the data is available, wherein the existing set of frequency distribution statistics are generated independently of the database management system prior to loading the data into the database management system, and wherein the existing set of frequency distribution statistics is generated using substantially all of the data; responsive to determining that an existing set of frequency distribution statistics is not available for the first portion of data but an existing set of frequency distribution statistics is available for the second portion of data, generating, by one or more computer processors, a set of frequency distribution statistics for the first portion of data, wherein the generated set of frequency distribution statistics is generated by the database management system using a sample of the first portion of data; responsive to generating a set of frequency distribution statistics for the first portion of data that does not have the existing set of frequency distribution statistics available, creating, by one or more computer processors, a first compression dictionary for the first portion of data using the generated set of frequency distribution statistics; creating, by one or more computer processors, a second compression dictionary for the second portion of data using the existing set of frequency distribution statistics; and loading, by one or more computer processors, the data into the database table using the first compression dictionary and the second compression dictionary.

2. The method of claim 1, further comprising: responsive to determining that an existing set of frequency distribution statistics for the data is not available, generating, by one or more computer processors, a set of frequency distribution statistics for the data; creating, by one or more computer processors, a third compression dictionary using the generated set of frequency distribution statistics; and loading, by one or more computer processors, the data into the database table using the third compression dictionary.

3. The method of claim 2, wherein the first compression dictionary, the second compression dictionary, and the third compression dictionary are each a simple lookup table which uses fewer bits to store data when compared to the data itself.

4. A computer program product for using existing frequency distribution statistics to load data into a database table, the computer program product comprising: one or more computer readable storage media; and program instruction stored on the one or more computer readable storage media, the program instructions comprising: program instructions to receive an indication to load data into a database table included in a database management system, wherein the data consists of a first portion of data and a second portion of data; program instructions to determine whether an existing set of frequency distribution statistics for the data is available, wherein the existing set of frequency distribution statistics are generated independently of the database management system prior to loading the data into the database management system, and wherein the existing set of frequency distribution statistics is generated using substantially all of the data; responsive to determining that an existing set of frequency distribution statistics is not available for the first portion of data but an existing set of frequency distribution statistics is available for the second portion of data, program instructions to generate a set of frequency distribution statistics for the first portion of data, wherein the generated set of frequency distribution statistics is generated by the database management system using a sample of the first portion of data; responsive to generating a set of frequency distribution statistics for the first portion of data that does not have the existing set of frequency distribution statistics available, program instructions to create a first compression dictionary for the first portion of data using the generated set of frequency distribution statistics; program instructions to create a second compression dictionary for the second portion of data using the existing set of frequency distribution statistics for the second portion of data; and program instructions to load the data into the database table using the first compression dictionary and the second compression dictionary.

5. The computer program product of claim 4, further comprising program instructions stored on the one or more computer readable storage media, to: responsive to determining that an existing set of frequency distribution statistics for the data is not available, generate a set of frequency distribution statistics for the data; create a third compression dictionary using the generated set of frequency distribution statistics; and load the data into the database table using the third compression dictionary.

6. The computer program product of claim 5, wherein the first compression dictionary the second compression dictionary, and the third compression dictionary are each a simple lookup table which uses fewer bits to store data when compared to the data itself.

7. A computer system for using existing frequency distribution statistics to load data into a database table, the computer system comprising: one or more computer processors; one or more computer readable storage media; and program instruction stored on the one or more computer readable storage media for execution by at least one of the one or more computer processors, the program instructions comprising: program instructions to receive an indication to load data into a database table included in a database management system, wherein the data consists of a first portion of data and a second portion of data; program instructions to determine whether an existing set of frequency distribution statistics for the data is available, wherein the existing set of frequency distribution statistics are generated independently of the database management system prior to loading the data into the database management system, and wherein the existing set of frequency distribution statistics is generated using substantially all of the data; responsive to determining that an existing set of frequency distribution statistics is not available for the first portion of data but an existing set of frequency distribution statistics is available for the second portion of data, program instructions to generate a set of frequency distribution statistics for the first portion of data, wherein the generated set of frequency distribution statistics is generated by the database management system using a sample of the first portion of data; responsive to generating a set of frequency distribution statistics for the first portion of data that does not have the existing set of frequency distribution statistics available, program instructions to create a first compression dictionary for the first portion of data using the generated set of frequency distribution statistics; program instructions to create a second compression dictionary for the second portion of data using the existing set of frequency distribution statistics for the second portion of data; and program instructions to load the data into the database table using the first compressio
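
The mixed case in claim 1, where statistics exist for one portion of the data but not the other, might look roughly like the sketch below. It is an illustration under stated assumptions, not the claimed implementation: each "portion" is modeled as a named column, statistics are value-to-count mappings, and the names `sample_stats`, `code_bits`, and `load_portions`, the `sample_size` parameter, and the fixed seed are all hypothetical. The `bits_per_value` field gives a rough feel for the "fewer bits" property of the lookup table described in claim 3.

```python
import random
from collections import Counter


def sample_stats(values, sample_size, seed=0):
    """Estimate value frequencies from a random sample of one portion."""
    if len(values) <= sample_size:
        return Counter(values)
    return Counter(random.Random(seed).sample(values, sample_size))


def build_dictionary(stats):
    """Lookup table: most frequent value gets the smallest integer code."""
    ordered = sorted(stats.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return {value: code for code, (value, _count) in enumerate(ordered)}


def code_bits(dictionary):
    """Bits needed for a fixed-width code that covers every entry."""
    return max(1, (len(dictionary) - 1).bit_length())


def load_portions(portions, existing_stats, sample_size=1000):
    """portions: name -> list of values; existing_stats: name -> {value: count}."""
    loaded = {}
    for name, values in portions.items():
        stats = existing_stats.get(name)
        source = "existing"
        if stats is None:
            # No pre-computed statistics for this portion: sample it instead.
            stats = sample_stats(values, sample_size)
            source = "sampled"
        dictionary = build_dictionary(stats)
        # Assumption: values the statistics missed get the next free code.
        codes = [dictionary.setdefault(v, len(dictionary)) for v in values]
        loaded[name] = {
            "codes": codes,
            "dictionary": dictionary,
            "source": source,
            "bits_per_value": code_bits(dictionary),
        }
    return loaded


portions = {
    "city": ["NY", "CA", "NY", "TX"],       # first portion: no statistics yet
    "status": ["ok", "ok", "fail", "ok"],   # second portion: statistics exist
}
result = load_portions(portions, {"status": {"ok": 3, "fail": 1}})
for name, info in result.items():
    print(name, info["source"], info["codes"], info["bits_per_value"])
```

Sampling trades accuracy for speed, which is why this sketch appends codes for values the sample missed rather than failing on them.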

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9916318B2 cover?
An indication to load data into a database table is received. A determination is made whether an existing set of frequency distribution statistics is available. In response to determining that an existing set of frequency distribution statistics is available, the data is loaded into the database table using the existing set of frequency distribution statistics.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/30153. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).