Column domain dictionary compression

US10756759B2 · US · B2

Patent metadata
Publication number: US-10756759-B2
Application number: US-201113224327-A
Country: US
Kind code: B2
Filing date: Sep 2, 2011
Priority date: Sep 2, 2011
Publication date: Aug 25, 2020
Grant date: Aug 25, 2020

Abstract

Official abstract text for this publication.

In column domain dictionary compression, column values in one or more columns are tokenized by a single dictionary. The domain of the dictionary is the entire set of columns. A dictionary may not only map a token to a tokenized value, but also to a count (“token count”) of the number of occurrences of the token and corresponding tokenized value in the dictionary's domain. Such information may be used to compute queries on the base table.
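The idea in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation: the class name `DomainDictionary`, its methods, and the sample columns `ship_city` and `bill_city` are all hypothetical, and tokens are assumed to be simple list indexes. It shows one dictionary whose domain spans two columns, and a token count that answers an occurrence query without scanning the tokenized columns.

```python
class DomainDictionary:
    """Hypothetical sketch: one dictionary shared by every column in its domain."""

    def __init__(self):
        self.token_of = {}   # actual value -> token
        self.value_of = []   # token -> actual value (the token is the list index)
        self.count_of = []   # token -> occurrences across the whole domain

    def encode(self, value):
        """Return the token for value, creating an entry on first sight."""
        token = self.token_of.get(value)
        if token is None:
            token = len(self.value_of)
            self.token_of[value] = token
            self.value_of.append(value)
            self.count_of.append(0)
        self.count_of[token] += 1  # maintain the token count on every store
        return token

    def decode(self, token):
        """Recover the actual column value that a token stands for."""
        return self.value_of[token]

    def occurrences(self, value):
        """Answer a count query from the dictionary alone."""
        token = self.token_of.get(value)
        return 0 if token is None else self.count_of[token]


# Two columns of an assumed base table are tokenized by the same dictionary.
ship_city = ["Austin", "Boston", "Austin"]
bill_city = ["Boston", "Austin", "Denver"]

d = DomainDictionary()
ship_tokens = [d.encode(v) for v in ship_city]
bill_tokens = [d.encode(v) for v in bill_city]

print(ship_tokens, bill_tokens)
print(d.occurrences("Austin"))
```

Because the count is updated at encode time, a query such as "how many times does Austin appear in the domain" reads a single dictionary entry. A real system would also have to decrement counts on delete and update, which this sketch omits.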

First claim

Opening claim text (preview).

What is claimed is:

1. A method, the method comprising steps of: storing in an in-memory database a dictionary table that has one or more entries, wherein each entry of said one or more entries maps a single token to an actual column value contained in a corresponding column of a base table that is stored in said in-memory database, said dictionary table being stored and maintained as a data structure separate from said base table, said single token comprising an offset to a memory address to indicate a storage location of a row containing said entry mapping said single token to said actual column value; storing in said corresponding column of said base table a particular single token in lieu of a particular actual column value mapped by a particular entry of said one or more entries to said particular single token; and computing a query that conforms to a database language, wherein computing a query includes decoding the particular single token to generate the particular actual column value mapped by said particular entry to said respective single token.

2. The method of claim 1, the steps further including rewriting another query that references said base table but not said dictionary table to create a transformed query that references said dictionary table.

3. The method of claim 1, the steps further including receiving a DDL statement that defines a compression group on said base table.

4. A method, the method comprising steps of: storing a dictionary having one or more entries, wherein each entry of said one or more entries maps a single token to a combination of actual column values contained in at least two corresponding columns of a base table, said at least two corresponding columns comprising a compression group on said base table, said dictionary being stored and maintained as a data structure separate from said base table; for a particular entry of said dictionary table, storing in said compression group of said base table the respective single token of said particular entry in lieu of the respective combination of actual column values that is mapped by said particular entry to said respective single token; and computing a query that conforms to a database language, wherein computing a query includes decoding the respective single token of said particular entry to generate the respective combination of actual column values that is mapped by said particular entry to said respective single token.

5. The method of claim 4, wherein said dictionary is a dictionary table; and the steps further including rewriting another query, that references said base table but not said dictionary table, into a transformed query that references the dictionary table.

6. The method of claim 5, wherein said other query references one of said at least two columns but does not reference another of said at least two columns.

7. A method, the method comprising steps of: storing a dictionary table having one or more rows, wherein each row of said one or more rows maps a single token to an actual column value contained in a column of a base table, said dictionary table being stored and maintained as a data structure separate from said base table; storing in said column of said base table a particular single token in lieu of a particular actual column value mapped by a particular row of said one or more rows to said particular single token; and rewriting a query, that references said base table but not said dictionary table, into a transformed query that references the dictionary table.

8. The method of claim 7, the steps further including receiving a DDL statement that defines a compression group on said base table that includes said column.

9. A non-transitory computer-readable storage medium storing one or more sequences of instructions, said one or more sequences of instructions, which, when executed by one or more processors, causes the one or more processors to perform steps of: storing in an in-memory database a dictionary table that has one or more entries, wherein each entry of said one or more entries maps a single token to an actual column value contained in a corresponding column of a base table that is stored in said in-memory database, said dictionary table being stored and maintained as a data structure separate from said base table, said single token comprising an offset to a memory address to indicate a storage location of a row containing said entry mapping said single token to said actual column value; storing in said corresponding column of said base table a particular single token in lieu of a particular actual column value mapped by a particular entry of said one or more entries to said particular single token; and computing a query that conforms to a database language, wherein computing a query includes decoding the particular single token to generate the particular actual column value mapped by said particular entry to said respective single token.

10. The non-transitory computer-readable storage medium of claim 9, the steps further including rewriting another query that references said base table but not said dictionary table to create a transformed query that references said dictionary table.

11. The non-transitory computer-readable storage medium of claim 9, the steps further including receiving a DDL statement that defines a compression group on said base table.

12. A non-transitory computer-readable storage medium storing one or more sequences of instructions, said one or more sequences of instructions, which, when executed by one or more processors, causes the one or more processors to perform steps of: storing a dictionary having one or more entries, wherein each entry of said one or more entries maps a single token to a combination of actual column values contained in at least two corresponding columns of a base table, said at least two corresponding columns comprising a compression group on said base table, said dictionary being stored and maintained as a data structure separate from said base table; for a particular entry of said dictionary table, storing in said compression group of said base table the respective single token of said particular entry in lieu of the respective combination of actual column values that is mapped by said particular entry to said respective single token; and computing a query that conforms to a database language, wherein computing a query includes decoding the respective single token of said particular entry to generate the respective combination of actual column values that is mapped by said particular entry to said respective single token.

13. The non-transitory computer-readable storage medium of claim 12, wherein said dictionary is a dictionary table; and the steps further including rewriting another query, that references said base table but not said dictionary table, into a transformed query that references the dictionary table.

14. The non-transitory computer-readable storage medium of claim 13, wherein said other query references one of said at least two columns but does not reference another of said at least two columns.

15. A non-transitory computer-readable storage medium storing one or more sequences of instructions, said one or more sequences of instructions, which, when executed by one or more processors, causes the one or more processors to perform steps of: storing a dictionary table having one or more rows, wherein each row of said one or more rows maps a single token to an actual column value contained in a column of a base table, said dictionary table being stored and maintained as a data structure separate from said base table;
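The query rewrite in claims 5 to 7 and the compression group in claim 4 can be illustrated with a small sketch using Python's standard `sqlite3` module. Everything specific here is an assumption made for illustration: the table names `orders` and `geo_dict`, the columns, the helper `tokenize`, and the use of an integer key as the token (claim 1 instead recites a token comprising an offset to a memory address). The transformed query is written by hand; the claims do not say how a database would produce it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dictionary table, kept separate from the base table. Each row maps one
# token to a combination of values for the compression group (city, state).
cur.execute(
    "CREATE TABLE geo_dict ("
    " token INTEGER PRIMARY KEY, city TEXT, state TEXT, UNIQUE (city, state))"
)
# Base table: a single token is stored in lieu of the two actual values.
cur.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, geo_token INTEGER)")


def tokenize(city, state):
    """Hypothetical helper: return the token for (city, state), adding it if new."""
    cur.execute(
        "INSERT OR IGNORE INTO geo_dict (city, state) VALUES (?, ?)", (city, state)
    )
    cur.execute(
        "SELECT token FROM geo_dict WHERE city = ? AND state = ?", (city, state)
    )
    return cur.fetchone()[0]


source_rows = [(1, "Austin", "TX"), (2, "Boston", "MA"), (3, "Austin", "TX")]
for order_id, city, state in source_rows:
    cur.execute("INSERT INTO orders VALUES (?, ?)", (order_id, tokenize(city, state)))

# A query written against the logical, uncompressed table might read:
#   SELECT order_id, city FROM orders WHERE city = 'Austin'
# It references one column of the group (city) but not the other (state).
# The transformed query below references the dictionary table to decode tokens.
rewritten = """
    SELECT o.order_id, d.city
    FROM orders AS o
    JOIN geo_dict AS d ON d.token = o.geo_token
    WHERE d.city = ?
    ORDER BY o.order_id
"""
rows = cur.execute(rewritten, ("Austin",)).fetchall()
dict_size = cur.execute("SELECT COUNT(*) FROM geo_dict").fetchone()[0]

print(rows)
print(dict_size)
```

Three base-table rows share two dictionary entries, so the repeated (Austin, TX) pair is stored once. The predicate on `city` is evaluated against the dictionary rows rather than against decoded base-table values, which is one plausible reason to rewrite the query this way.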

Assignees

Inventors

Classifications

  • Column-oriented storage; Management thereof · CPC title

  • employing the use of a dictionary, e.g. LZ78 · CPC title

  • H03M7/42 (Primary)

    using table look-up for the coding or decoding process, e.g. using read-only memory {(H03M7/4006 takes precedence)} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Lahiri Tirthankar, Hoang Chi-Kim, Thomas Dina, and 6 more
What technology area does this patent fall under?
Primary CPC classification H03M7/42. Mapped technology areas include Electricity.
When was this patent published?
Publication date Aug 25, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).