Intra-block partitioning for database management

US9535939B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9535939-B2
Application numberUS-201213485707-A
CountryUS
Kind codeB2
Filing dateMay 31, 2012
Priority dateMay 31, 2012
Publication dateJan 3, 2017
Grant dateJan 3, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for storing database information includes storing a table having data values in a column major order. The data values are stored in a list of blocks. The method also includes assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table. The data values that correspond to each other across a plurality of columns of the table have equivalent TSNs. The method also includes assigning each data value to a partition based on a representation of the data value. The method also includes assigning a tuple map value to each data value. The tuple map value identifies the partition in which each data value is located.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer program product, comprising: a non-transitory computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed by a processor within a computer, causes the computer to perform operations for storing database information, the operations comprising: storing a table comprising data values in a column major order, wherein the data values comprise encoded and unencoded data values, wherein the data values within each column are stored in a list of blocks, wherein each block comprises a predetermined block size determined according to a configuration of a database in which the table is stored, wherein each block comprises an array; assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table, wherein data values that correspond to each other across a plurality of columns of the table have equivalent TSNs; identifying a bit length of each encoded data value and an encoding representation of each encoded data value; assigning each encoded data value in a block to one of a plurality of partitions within the block based on the bit length of each encoded data value and the encoding representation of each encoded data value, and assigning the unencoded data values to a separate partition from the encoded data values, the unencoded data values stored in an unencoded format and each other partition comprising data values stored in a distinct encoding format, representing each partition in the array within the block, wherein the array is an index separate from the plurality of partitions; indexing the data values in the block in an array contained in the block by assigning a tuple map value to each data value, wherein the tuple map value identifies the partition in which each data value is located; and storing the tuple map value in the array within the block. 2. The computer program product of claim 1 , wherein the computer readable program, when executed on the computer, causes the computer to perform additional operations, comprising: receiving a new record comprising related data values for insertion into the plurality of columns in the table; inserting the new record into a buffer; ordering records within the buffer based on the representation of the data values in each record to preserve a TSN order of the records among values comprising identical representation; appending the data values of the records from the buffer to partitions in a trailing block of each corresponding column; and assigning a new tuple map value to each data value of the records for the corresponding partitions. 3. The computer program product of claim 1 , wherein the computer readable program, when executed on the computer, causes the computer to perform additional operations, comprising: receiving a query that performs operations on a subset of the table identified by one or more predicates; evaluating the predicates within each partition independently; producing one bitmap per partition, wherein each bitmap indicates only data values which pass the corresponding predicates; and merging all bitmaps using the tuple map values to form a combined bitmap for the corresponding block. 4. The computer program product of claim 3 , wherein the computer readable program, when executed on the computer, causes the computer to perform additional operations, comprising: loading the data values that pass the predicates from the combined bitmap for a region, wherein the region comprises one partition in the corresponding block; forming an array of source indexes in each region and destinations indexes in an output vector; and moving the data values from a first location specified by the source indexes to a second location specified by the destination indexes. 5. The computer program product of claim 1 , wherein the computer readable program, when executed on the computer, causes the computer to perform additional operations, comprising: sequentially scanning a tuple map comprising the tuple map values; maintaining a cursor in each partition in the tuple map; and accessing a given data value from the cursor on an indicated partition in response to encountering the indicated partition while scanning the tuple map; and incrementing the cursor. 6. The computer program product of claim 1 , wherein assigning each data value to a partition further comprises assigning the data values within blocks of the table. 7. A database management system, comprising: a processor configured to: store a table comprising data values in a column major order on a data storage device, wherein the data values comprise encoded and unencoded data values, wherein the data values within each column are stored in a list of blocks, wherein each block comprises a predetermined block size determined according to a configuration of a database in which the table is stored, wherein each block comprises an array; and a table management engine, executed by the processor, configured to: assign a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table, wherein data values that correspond to each other across a plurality of columns of the table have equivalent TSNs; identify a bit length of each encoded data value and an encoding representation of each encoded data value; assign each encoded data value in a block to one of a plurality of partitions within the block based on the bit length of each encoded data value and the encoding representation of each encoded data value, and assign the unencoded data values to a separate partition from the encoded data values, the unencoded data values stored in an unencoded format and each other partition comprising data values stored in a distinct encoding format, represent each partition in the array within the block, wherein the array is an index separate from the plurality of partitions; index the data values in the block in an array contained in the block by assigning a tuple map value to each data value, wherein the tuple map value identifies the partition in which each data value is located; and store the tuple map values in the array within the block. 8. The system of claim 7 , wherein the table management engine is further configured to: receive a new record comprising related data values for insertion into the plurality of columns in the table; insert the new record into a buffer; order records within the buffer based on the representation of the data values in each record to preserve a TSN order of the records among values comprising identical representation; append the data values of the records from the buffer to partitions in a trailing block of each corresponding column; and assign a new tuple map value to each data value of the records for the corresponding partitions. 9. The system of claim 7 , wherein the table management engine is further configured to: receive a query that performs operations on a subset of the table identified by one or more predicates; evaluate the predicates within each partition independently; produce one bitmap per partition, wherein each bitmap indicates only data values which pass the corresponding predicates; and merge all bitmaps using the tuple map values to form a combined bitmap for the corresponding block. 10. The system of claim 9 , wherein the table management engine is further configured to: load the data values that pass the predicates from the combined bitmap for a region, wherein the region comprises one partition in the corresponding block; form an array of source indexes in each region and destinations indexes in an output vector; and mov

Assignees

Inventors

Classifications

  • Data partitioning, e.g. horizontal or vertical partitioning · CPC title

  • G06F16/221Primary

    Column-oriented storage; Management thereof · CPC title

  • Physics · mapped topic

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9535939B2 cover?
A method for storing database information includes storing a table having data values in a column major order. The data values are stored in a list of blocks. The method also includes assigning a tuple sequence number (TSN) to each data value in each column of the table according to a sequence order in the table. The data values that correspond to each other across a plurality of columns of the…
Who is the assignee on this patent?
Barber Ronald J, Kim Min-Soo, Lightstone Sam S, and 6 more
What technology area does this patent fall under?
Primary CPC classification G06F16/221. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 03 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).