Who is the assignee on this patent?

Faerber Franz, Radestock Guenter, Ross Andrew, and 1 more

What technology area does this patent fall under?

Primary CPC classification H03M7/3084. Mapped technology areas include Electricity.

When was this patent published?

Publication date Tue Sep 20 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Block compression of tables with repeated values

US9450605B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9450605-B2
Application number	US-201213674477-A
Country	US
Kind code	B2
Filing date	Nov 12, 2012
Priority date	May 21, 2007
Publication date	Sep 20, 2016
Grant date	Sep 20, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of block identifiers, where each block identifier is associated with a value identifier and there is a block identifier for each unique value in a block. Blocks may have standard sizes and block dictionaries may be reused for multiple blocks.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer program product, embodied in a non-transitory computer-readable storage medium, the computer program product being operable to cause at least one data processing apparatus to perform operations comprising: performing, in parallel, block compression for each of a plurality of columns of data; storing, for each column of data, changes to the column of data after the block compression is performed in a different one of a plurality of delta buffers, wherein the delta buffers are separate from the corresponding columns of data; and asynchronously integrating, for each column of data, changes stored in the corresponding delta buffer into the corresponding column of data, wherein the columns of data and the delta buffers are searchable prior to the integration, and wherein search results from the columns of data and the delta buffers are merged to form a composite result. 2. A computer program product as in claim 1 , wherein performing block compression comprises, for each column: compressing the column of data with dictionary-based compression, the compressing comprising generating a column of value identifiers, each of the value identifiers representing a unique value in the column of data; sorting values represented in the one or more columns of data, the sorting including ordering the one or more columns of data such that the one or more columns of data are ordered in order of frequency of most frequently occurring value in different columns of the one or more columns of data; generating a bit vector representation of each column representing whether a value in rows of each column occurs frequently; generating a number representing frequency of the most frequently occurring value for each column; removing bits corresponding to the most frequently occurring value from the bit vector to obtain a shortened bit vector, the shortened bit vector obtained for each column; and storing the generated number and the shortened bit vector for each column, the stored number and the shortened bit vector for each column representing the compressed data, the compressed data being decompressed when a search is performed, the decompressed data being compressed again after the search is performed. 3. A computer program product as in claim 1 , wherein the compressing is initiated when a column of data comprises one value that has a repetition frequency significantly higher than a repetition frequency of other values in the column. 4. A computer program product as in claim 1 , wherein the value identifiers are values representing structured business data having data dependencies across a same row of a table. 5. A computer program product as in claim 4 , wherein the business data comprises business objects modeled as sets of joined tables. 6. A computer program product as in claim 5 , wherein the operations of the product are performed in parallel on a plurality of hardware servers. 7. A computer program product as in claim 1 , wherein the search comprises searching for data in the one or more columns of data. 8. A computer-implemented method comprising: performing, in parallel, block compression for each of a plurality of columns of data; storing, for each column, changes to the column of data after the block compression is performed in a different one of a plurality of delta buffers, wherein the delta buffers are separate from the corresponding columns of data; and asynchronously integrating, for each column, changes stored in the delta buffer into the column of data, wherein, for each column, both of the column of data and the delta buffer are searchable prior to the integration, wherein search results from the columns of data and the delta buffers are merged to form a composite result. 9. A method as in claim 8 , wherein performing block compression comprises, for each column: compressing the column of data with dictionary-based compression, the compressing comprising generating a column of value identifiers, each of the value identifiers representing a unique value in the column of data; sorting values represented in the one or more columns of data, the sorting including ordering the one or more columns of data such that the one or more columns of data are ordered in order of frequency of most frequently occurring value in different columns of the one or more columns of data; generating a bit vector representation of each column representing whether a value in rows of each column occurs frequently; generating a number representing frequency of the most frequently occurring value for each column; removing bits corresponding to the most frequently occurring value from the bit vector to obtain a shortened bit vector, the shortened bit vector obtained for each column; and storing the generated number and the shortened bit vector for each column, the stored number and the shortened bit vector for each column representing the compressed data, the compressed data being decompressed when a search is performed, the decompressed data being compressed again after the search is performed. 10. A method as in claim 8 , wherein the compressing is initiated when a column of data comprises one value that has a repetition frequency significantly higher than a repetition frequency of other values in the column. 11. A method as in claim 8 , wherein the value identifiers are values representing structured business data having data dependencies across a same row of a table. 12. A method as in claim 8 , wherein the business data comprises business objects modeled as sets of joined tables. 13. A method as in claim 12 , wherein the operations of the product are performed in parallel on a plurality of hardware servers. 14. A method as in claim 8 , wherein the search comprises searching for data in the one or more columns of data. 15. A method as in claim 8 , wherein the performing, storing, and integrating are performed by at least one data processor forming part of at least one computing system. 16. A system comprising: at least one programmable processor; and memory storing instructions that, when executed by the at least one programmable processor, cause the at least one programmable processor to perform operations comprising: performing, in parallel, block compression for each of a plurality of columns of data, the block compression comprising generating, for each column, a block offset column such that each value of the block offset column indicates an offset at which a block starts in the one or more columns of data; storing, for each column, changes to the column of data after the block compression is performed in a different one of a plurality of delta buffers, wherein the delta buffers are separate from the columns of data; asynchronously integrating, for each column, changes stored in the corresponding delta buffer into the corresponding column of data; receiving a query; and searching both the compressed column of data and the delta buffer for data responsive to the query. 17. A system as in claim 16 , wherein performing block compression comprises, for each column: compressing the column of data with dictionary-based compression, the compressing comprising generating a column of value identifiers, each of the value identifiers representing a unique value in the column of data; sorting values represented in the one or more columns of data, the sorting including ordering the one or more columns of data such that the one or more columns of data are ordered in order of frequency of most frequently occurring value in different columns of the one or mor

Assignees

Inventors

Classifications

G06F17/30315
Physics · mapped topic
H03M7/3084Primary
using adaptive string matching, e.g. the Lempel-Ziv method · CPC title
H03M7/3088
employing the use of a dictionary, e.g. LZ78 · CPC title
G06F16/2228
Indexing structures · CPC title
G06F16/1744Primary
using compression, e.g. sparse files · CPC title

Patent family

Related publications grouped by family.

View patent family 39777031

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9450605B2 cover?: Methods and apparatus, including computer program products, for block compression of tables with repeated values. In general, value identifiers representing a compressed column of data may be sorted to render repeated values contiguous, and block dictionaries may be generated. A block dictionary may be generated for each block of value identifiers. Each block dictionary may include a list of bl…
Who is the assignee on this patent?: Faerber Franz, Radestock Guenter, Ross Andrew, and 1 more
What technology area does this patent fall under?: Primary CPC classification H03M7/3084. Mapped technology areas include Electricity.
When was this patent published?: Publication date Tue Sep 20 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).