Hierarchical data compression and computation

US9350384B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9350384-B2
Application numberUS-201414501790-A
CountryUS
Kind codeB2
Filing dateSep 30, 2014
Priority dateSep 30, 2014
Publication dateMay 24, 2016
Grant dateMay 24, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to embodiments of the present invention, machines, systems, methods and computer program products for hierarchical compression of data are presented comprising creating a compression hierarchy of compression nodes, wherein each compression node is associated with a compression operation to produce compressed data. An output of any of the compression nodes may be compressed by another compression node or the same compression node. A path of one or more compression nodes is determined through said compression hierarchy based upon compression statistics to compress data, and the data is compressed by the compression nodes of the path. Various computational techniques are presented herein for manipulating the compression hierarchy to defer or reduce computation during query evaluation.

First claim

Opening claim text (preview).

What is claimed is: 1. A system for hierarchically compressing data comprising: one or more processors configured to: create a compression hierarchy of compression nodes, wherein each compression node is associated with a compression operation to produce compressed data, and wherein creating a compression hierarchy comprises: creating a dictionary compression node in the compression hierarchy of compression nodes; determine a path of one or more compression nodes through said compression hierarchy, based upon compression statistics, to compress data; and compress the data by the compression nodes of the path, wherein compressing the data comprises: performing a computation on the data using the dictionary compression node, wherein the computation is performed by applying bitshaved data to the dictionary compression node to generate a dictionary having values and keys; and performing the computation for each generated value. 2. The system of claim 1 , wherein the one or more processors are configured to select a path of the compression hierarchy having a measure of compression higher than any of the other paths. 3. The system of claim 1 , wherein the compression nodes are configured to compress one or more data types selected from a group consisting of: integer, character and double. 4. The system of claim 1 , wherein each compression node is selected from a group consisting of: a bitshaved compression node, a dictionary compression node, a run length encoding compression node, a character-based compression node and a delta compression node. 5. The system of claim 1 , wherein the one or more processors are configured to: create a run length encoded compression node in the compression hierarchy of compression nodes; and perform a filtering operation on the data using the run length encoded compression node, wherein the filtering operation is performed in part by applying a bitmask to a values field of the run length encoded compression node. 6. The system of claim 1 , wherein the one or more processors are configured to: create a run length encoded compression node in the compression hierarchy of compression nodes; and perform part of a join operation on the data using the run length encoded compression node, wherein the join operation is performed by applying input data to a values field of the run length encoded compression node and by applying repeat counts to a lengths field of the run length encoded compression node. 7. A computer program product for hierarchically compressing data comprising a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code, when executed by a processor, causes the processor to: create a compression hierarchy of compression nodes, wherein each compression node is associated with a compression operation to produce compressed data, and wherein creating a compression hierarchy comprises: creating a dictionary compression node in the compression hierarchy of compression nodes; determine a path of one or more compression nodes through said compression hierarchy, based upon compression statistics, to compress data; and compress the data by the compression nodes of the path, wherein compressing the data comprises: performing a computation on the data using the dictionary compression node, wherein the computation is performed by applying bitshaved data to the dictionary compression node to generate a dictionary having values and keys; and performing the computation for each generated value. 8. The computer program product of claim 7 , wherein the computer readable program code is configured to cause the processor to select a path of the compression hierarchy having a measure of compression higher than any of the other paths. 9. The computer program product of claim 7 , wherein each compression node is selected from a group consisting of: a bitshaved compression node, a dictionary compression node, a run length encoding compression node, a character-based compression node and a delta compression node. 10. The computer program product of claim 7 , wherein the computer readable program code is configured to cause the processor to: create a run length encoded compression node in the compression hierarchy of compression nodes; and perform a filtering operation on the data using the run length encoded compression node, wherein the filtering operation is performed in part by applying a bitmask to a values field of the run length encoded compression node. 11. The computer program product of claim 7 , wherein the computer readable program code is configured to cause the processor to: create a run length encoded compression node in the compression hierarchy of compression nodes; and perform part of a join operation on the data using the run length encoded compression node, wherein the join operation is performed by applying input data to a values field of the run length encoded compression node and by applying repeat counts to a lengths field of the run length encoded compression node.

Assignees

Inventors

Classifications

  • H03M7/46Primary

    Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind · CPC title

  • H03M7/6035Primary

    Handling of unkown probabilities · CPC title

  • Search customisation based on user profiles and personalisation · CPC title

  • Encoder aspects · CPC title

  • employing the use of a dictionary, e.g. LZ78 · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9350384B2 cover?
According to embodiments of the present invention, machines, systems, methods and computer program products for hierarchical compression of data are presented comprising creating a compression hierarchy of compression nodes, wherein each compression node is associated with a compression operation to produce compressed data. An output of any of the compression nodes may be compressed by another …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification H03M7/46. Mapped technology areas include Electricity.
When was this patent published?
Publication date Tue May 24 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).