Generating row-based and column-based chunks
US-2015381647-A1 · Dec 31, 2015 · US
US11520743B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11520743-B2 |
| Application number | US-201314079507-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 13, 2013 |
| Priority date | Apr 30, 2009 |
| Publication date | Dec 6, 2022 |
| Grant date | Dec 6, 2022 |
Official abstract text for this publication.
A database server stores compressed units in data blocks of a database. A table (or data from a plurality of rows thereof) is first compressed into a “compression unit” using any of a wide variety of compression techniques. The compression unit is then stored in one or more data block rows across one or more data blocks. As a result, a single data block row may comprise compressed data for a plurality of table rows, as encoded within the compression unit. Storage of compression units in data blocks maintains compatibility with existing data block-based databases, thus allowing the use of compression units in preexisting databases without modification to the underlying format of the database. The compression units may, for example, co-exist with uncompressed tables. Various techniques allow a database server to optimize access to data in the compression unit, so that the compression is virtually transparent to the user.
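The storage idea in the abstract can be illustrated with a minimal sketch. It assumes zlib as the compression technique and a toy 64-byte payload per data block row. The names `BLOCK_PAYLOAD`, `build_compression_unit`, `split_into_block_rows`, and `read_table_rows` are hypothetical and do not come from the patent, and a real database block also carries headers and row directories that are omitted here.

```python
import json
import zlib

BLOCK_PAYLOAD = 64  # assumed toy payload size per data block row, in bytes


def build_compression_unit(table_rows):
    """Compress many table rows into one byte string (the 'compression unit')."""
    raw = json.dumps(table_rows).encode("utf-8")
    return zlib.compress(raw)


def split_into_block_rows(unit, payload=BLOCK_PAYLOAD):
    """Cut the unit into pieces, one per data block row.

    The ordered list stands in for a row chain spanning several data blocks.
    """
    return [unit[i:i + payload] for i in range(0, len(unit), payload)]


def read_table_rows(block_rows):
    """Reassemble the chain, decompress it, and return the original table rows."""
    unit = b"".join(block_rows)
    return json.loads(zlib.decompress(unit).decode("utf-8"))


# 200 table rows end up inside a handful of data block rows.
table_rows = [[i, f"customer-{i}", i * 17 % 101] for i in range(200)]
unit = build_compression_unit(table_rows)
chain = split_into_block_rows(unit)
restored = read_table_rows(chain)
```

Because each piece is an ordinary block-sized payload, the sketch mirrors the compatibility point in the abstract: the block layer does not need to know that a single data block row encodes many table rows.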
Opening claim text (preview).
What is claimed is:
1. A method comprising: generating a plurality of compression units in which to store a database table, each particular compression unit of the plurality of compression units storing respective separate table rows from said database table, wherein generating each particular compression unit comprises: compressing at least a first column of the respective separate table rows of said each particular compression unit in a column-major format in a first subunit of said each particular compression unit; compressing at least a second column of the respective separate table rows of said each particular compression unit in a column-major format in a second subunit of said each particular compression unit; storing the plurality of compression units in a plurality of data blocks by, for each particular compression unit of the plurality of compression units, storing said particular compression unit in a data block row chain that spans multiple data blocks of said plurality of data blocks, wherein said data block row chain contains said first column and said second column; wherein the method is performed by one or more computing devices.
2. The method of claim 1, wherein the database table is a first database table in a database, wherein the plurality of data blocks are a first set of data blocks within another plurality of data blocks that store the database, wherein the other plurality of data blocks include a second set of data blocks that store, in row major format, uncompressed data for a second database table in the database.
3. The method of claim 1, further comprising: receiving a request that requires access to at least a first table row of the database table; based on an index entry corresponding to the first table row, determining that the first table row is stored in at least a first data block; retrieving the first data block; determining that the first data block stores a portion of a first compression unit; retrieving a second set of data blocks that store other portions of the first compression unit; decompressing the first compression unit; further based on the index entry corresponding to the first table row, identifying the first table row within the first compression unit.
4. The method of claim 1, wherein generating each particular compression unit further comprises generating a compression unit header, the compression unit header comprising metadata that indicates how the first column and the second column are compressed.
5. The method of claim 1, wherein the first column and the second column are compressed with different compression schemes.
6. The method of claim 1, further comprising: for each particular compression unit of the plurality of compression units: generating a header that at least indicates a number of data blocks in the data block row chain that stores said each particular compression unit; storing the header in front of said each particular compression unit in the data block row chain; and prefetching one or more data blocks in which said each particular compression unit is stored based on the header.
7. The method of claim 1, further comprising: for each particular compression unit of the plurality of compression units, generating metadata that at least indicates a location within said each particular compression unit at which compressed data for the second column begins; and responsive to a request for which the first column is not needed, decompressing a particular second subunit of a first compression unit of said plurality of compression units without decompressing a particular first subunit of the first compression unit based on the metadata generated for the first compression unit that at least indicates a location within said first compression unit at which compressed data for the second column begins.
8. The method of claim 1, further comprising: for each particular compression unit of the plurality of compression units, generating metadata that at least indicates a location within the particular compression unit at which compressed data for the second column begins; responsive to a request for which the first column is not needed, retrieving second data blocks in which a particular second subunit of a first compression unit of said plurality of compression units is stored without retrieving one or more first data blocks in which a particular first subunit of said first compression unit is stored based on the metadata in the first compression unit that at least indicates a location within the first compression unit at which compressed data for the second column begins.
9. The method of claim 1, further comprising: responsive to a first database request: retrieving a first set of data blocks in which a first compression unit of the plurality of compression units is stored; decompressing the first compression unit; after decompressing the first compression unit: temporarily storing the first compression unit in a buffer; servicing a second database request using the first compression unit in the buffer, without re-retrieving the first set of data blocks.
10. The method of claim 1, further comprising: dividing the database table into groups of rows, each of the groups of rows corresponding to a different one of the plurality of compression units; dividing each particular compression unit of the plurality of compression units into portions based on a default data block size, so that each portion of said each particular compression unit fits into a different one of the data blocks.
11. One or more non-transitory computer-readable media storing instructions that, when executed by one or more computing devices, cause: generating a plurality of compression units in which to store a database table, each particular compression unit of the plurality of compression units storing respective separate table rows from said database table, wherein generating each particular compression unit comprises: compressing at least a first column of the respective separate table rows of said each particular compression unit in a column-major format in a first subunit of said each particular compression unit; compressing at least a second column of the respective separate table rows of said each particular compression unit in a column-major format in a second subunit of said each particular compression unit; storing the plurality of compression units in a plurality of data blocks by, for each particular compression unit of the plurality of compression units, storing said particular compression unit in a data block row chain that spans multiple data blocks of said plurality of data blocks, wherein said data block row chain contains said first column and said second column.
12. The one or more non-transitory computer-readable media of claim 11, wherein the database table is a first database table in a database, wherein the plurality of data blocks are a first set of data blocks within another plurality of data blocks that store the database, wherein the other plurality of data blocks include a second set of data blocks that store, in row major format, uncompressed data for a second database table in the database.
13. The one or more non-transitory computer-readable media of claim 11, wherein the instructions, when executed by the one or more computing devices, further cause: receiving a request that requires access to at least a first table row of the database table; based on an index entry corresponding to the first table row, determining that the first table row is stored in at least a first data block; retrieving the first data block; determining that the first data bl
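The column-major subunits, the compression unit header, and the selective decompression recited in claims 1, 4, 5, and 7 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the JSON header layout, the 4-byte length prefix, the choice of zlib and bz2 as the two schemes, and the names `CODECS`, `build_unit`, and `read_column` are all hypothetical.

```python
import bz2
import json
import struct
import zlib

# Hypothetical per-column codecs; the claims only say the schemes may differ.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
}


def build_unit(rows, schemes):
    """Compress each column of `rows` into its own subunit and prepend a header."""
    columns = list(zip(*rows))  # row-major input -> one tuple per column
    subunits, directory, offset = [], [], 0
    for values, scheme in zip(columns, schemes):
        compress, _ = CODECS[scheme]
        blob = compress(json.dumps(values).encode("utf-8"))
        # The header records how each column is compressed and where it begins.
        directory.append({"scheme": scheme, "offset": offset, "length": len(blob)})
        subunits.append(blob)
        offset += len(blob)
    header = json.dumps(directory).encode("utf-8")
    # Layout: 4-byte big-endian header length, header, then subunits back to back.
    return struct.pack(">I", len(header)) + header + b"".join(subunits)


def read_column(unit, column_index):
    """Decompress one column's subunit without touching the other subunits."""
    (header_len,) = struct.unpack(">I", unit[:4])
    directory = json.loads(unit[4:4 + header_len].decode("utf-8"))
    entry = directory[column_index]
    start = 4 + header_len + entry["offset"]
    blob = unit[start:start + entry["length"]]
    _, decompress = CODECS[entry["scheme"]]
    return json.loads(decompress(blob).decode("utf-8"))


rows = [(i, "open" if i % 3 else "closed") for i in range(100)]
unit = build_unit(rows, schemes=["zlib", "bz2"])
status_column = read_column(unit, 1)  # the first column is never decompressed
```

Recording a start offset per column is what lets a reader skip the first subunit when a request needs only the second column. If the unit were then split across data blocks, the same offsets would indicate which blocks hold the needed subunit, which is the situation claim 8 describes.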
using more than one table in sequence, i.e. systems with three or more layers · CPC title
Intermediate data storage techniques for performance improvement · CPC title
using compression, e.g. sparse files · CPC title