Derived data dictionary for optimizing transformations of encoded data

US11042544B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11042544-B2
Application number: US-201816210049-A
Country: US
Kind code: B2
Filing date: Dec 5, 2018
Priority date: Dec 5, 2018
Publication date: Jun 22, 2021
Grant date: Jun 22, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A database-management system evaluates a query that retrieves and transforms encoded symbols stored in a database. If the stored symbols assume a relatively small set of distinct values, the system initially performs the transformation on every value in the set. During execution of subsequent queries, rather than performing the transformation upon every stored symbol fetched from the database, the system merely returns the previously derived encoded transformation results that correspond to the decoded value of each fetched symbol. If the symbols stored in the database span a relatively large set of distinct values, the system does not initially perform the transformation upon every value in the set. Instead, the first time the system fetches a symbol that has a particular value, it saves that symbol's encoded transformation result and reuses that result the next time it fetches an encoded symbol with the same value.
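The two population strategies in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the names `populate_eagerly`, `scan_column`, `dictionary`, and `upper_of` are hypothetical, the column is modeled as a list of integer codes, and the transformation is assumed to be an upper-casing of the value each code stands for.

```python
from typing import Callable, Dict, Iterable, List

def populate_eagerly(distinct_codes: Iterable[int],
                     transform: Callable[[int], str]) -> Dict[int, str]:
    """Small-cardinality case: transform every distinct value up front."""
    return {code: transform(code) for code in distinct_codes}

def scan_column(column: Iterable[int],
                transform: Callable[[int], str],
                ddd: Dict[int, str]) -> List[str]:
    """Return one result per stored symbol, reusing derived results.

    If `ddd` is already fully populated, no transformation runs here.
    If `ddd` starts empty (large-cardinality case), a value is transformed
    only the first time it is fetched, then reused.
    """
    results = []
    for code in column:
        if code not in ddd:
            ddd[code] = transform(code)  # first sighting of this value
        results.append(ddd[code])
    return results

# Hypothetical dictionary-encoded column: code -> unencoded value.
dictionary = {0: "red", 1: "green", 2: "blue"}
column = [0, 1, 0, 2, 1, 0]

calls = []  # records every time the transformation actually runs

def upper_of(code: int) -> str:
    calls.append(code)
    return dictionary[code].upper()

# Strategy A: populate the table before fetching any symbol.
eager_ddd = populate_eagerly(dictionary.keys(), upper_of)
eager_results = scan_column(column, upper_of, eager_ddd)
eager_calls = len(calls)

# Strategy B: populate the table incrementally while scanning.
calls.clear()
lazy_ddd: Dict[int, str] = {}
lazy_results = scan_column(column, upper_of, lazy_ddd)
lazy_calls = len(calls)
```

In this toy column both strategies run the transformation three times for six stored symbols. They differ in when the work happens: the eager table also pays for values that are never fetched, while the incremental table pays only for values that actually occur during the scan.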

First claim

Opening claim text (preview).

What is claimed is:

1. A DBMS query-processing system comprising a processor, a memory coupled to the processor, and a computer-readable hardware storage device coupled to the processor, the storage device containing program code configured to be run by the processor via the memory to implement a method for a derived data dictionary for optimizing transformations of encoded data, the method comprising: the system receiving a query expressly requesting a transformation to be performed upon encoded symbols stored in a database column, where performing the transformation upon any encoded symbol does not require decoding the any encoded symbol upon which the transformation is performed, where each encoded symbol value corresponds to a corresponding unencoded value, and where at least two of the encoded symbol values are indistinguishable; the system generating a derived data dictionary (DDD) table in which each entry associates one distinct encoded symbol value stored in the column with a corresponding transformation result, where a value of a first transformation result is equal to an output value produced by performing the expressly requested transformation upon a value of a first encoded symbol associated by the DDD table with the first transformation result; the system selecting a DDD-generation strategy from the group consisting of: initially populating the DDD table, before fetching any symbols from the column, by adding to each table entry a transformation result derived by performing the transformation upon a symbol value identified by the entry, where the initially populating further comprises: performing the transformation on each distinct symbol value listed in the DDD table; and storing a result of each transformation as a transformation result in a DDD table entry that identifies a corresponding value upon which the transformation was performed; and incrementally populating the DDD table by adding, to a table entry that identifies a value of a fetched symbol, a transformation result derived by performing the transformation upon the identified symbol value only if a value of the fetched symbol is not equal to a value of any previously fetched symbol, where the incrementally populating further comprises: fetching the first symbol from the database column; determining whether an existing DDD table entry associates a value of the first symbol with the first transformation result; and if determining that no DDD table entry associates the value of the first symbol with the first transformation result, performing the transformation on the value of the first symbol and storing the result of the performed transformation as the first transformation result in the DDD table entry associated with the first symbol; and the system returning, for each symbol stored in the column, a corresponding returned result, where a returned result corresponding to a particular symbol is equal to a transformation result associated, by a DDD table entry, with a value of the particular symbol.

2. The system of claim 1, where the DDD table is initialized as a duplicate of a cross-reference table created and maintained by the DBMS, where each entry of the cross-reference table associates one distinct symbol value stored in the column with that distinct symbol's corresponding unencoded value.

3. The system of claim 1, where the selecting a DDD-generation strategy comprises comparing a total number of symbols stored in the column to a total number of distinct symbol values stored in the column.

4. The system of claim 1, where the encoded symbols stored in the database column each represent a prefix or suffix substring of a full string that comprises one instance of the prefix or suffix substring and a corresponding base string represented by an encoded base-string symbol, the method further comprising: fetching a first encoded substring symbol from the database column and a corresponding first encoded base-string symbol from a second column of the database; performing the transformation on the first encoded base-string symbol; and reconstituting the first full string by concatenating a decoded transformed substring represented by the first encoded substring symbol with a decoded transformed base string represented by the result of performing the transformation on the first encoded base-string symbol.

5. A method comprising: a DBMS query-processing system receiving a query expressly requesting a transformation to be performed upon encoded symbols stored in a database column, where performing the transformation upon any encoded symbol does not require decoding the any encoded symbol upon which the transformation is performed, where each encoded symbol value corresponds to a corresponding unencoded value, where at least two of the encoded symbol values are indistinguishable; the system generating a derived data dictionary (DDD) table, where the DDD table is initialized as a duplicate of a cross-reference table created and maintained by the DBMS, where each entry of the cross-reference table associates one distinct symbol value stored in the column with that distinct symbol's corresponding unencoded value, where each entry of the DDD table associates one distinct encoded symbol value stored in the column with a corresponding transformation result, and where a value of a first transformation result is equal to an output value produced by performing the expressly requested transformation upon a value of a first encoded symbol associated by the DDD table with the first transformation result; the system selecting a DDD-generation strategy from the group consisting of: initially populating the DDD table, before fetching any symbols from the column, by adding to each table entry a transformation result derived by performing the transformation upon a symbol value identified by the entry, where the initially further comprises: performing the transformation on each distinct symbol value listed in the DDD table; and storing a result of each transformation as a transformation result in a DDD table entry that identifies a corresponding value upon which the transformation was performed; and incrementally populating the DDD table by adding, to a table entry that identifies a value of a fetched symbol, a transformation result derived by performing the transformation upon the identified symbol value only if a value of the fetched symbol is not equal to a value of any previously fetched symbol, where the incrementally populating further comprises: fetching the first symbol from the database column; determining whether an existing DDD table entry associates a value of the first symbol with the first transformation result; and if determining that no DDD table entry associates the value of the first symbol with the first transformation result, performing the transformation on the value of the first symbol and storing the result of the performed transformation as the first transformation result in the DDD table entry associated with the first symbol; and the system returning, for each symbol stored in the column, a corresponding returned result, where a returned result corresponding to a particular symbol is equal to a transformation result associated, by a DDD table entry, with a value of the particular symbol.

6. The method of claim 5, where the initially populating further comprises: performing the transformation on each distinct symbol value listed in the DDD table; and storing a result of each transformation as a transformation result in a DDD table entry that identifies a corresponding value upon which the transformation was performed.

7. The method of claim 5, where the selecting a DDD-generation strategy comprises comparing a total number of symbols stored in the column to a total number of distinct sy
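Claim 3 describes choosing the strategy by comparing the total number of symbols in the column with the number of distinct symbol values. The claim text shown here does not give a formula or threshold, so the following is only a hedged sketch: the function name `select_strategy`, the ratio test, and the default cutoff `max_distinct_ratio=0.1` are all assumptions made for illustration.

```python
def select_strategy(total_symbols: int,
                    distinct_values: int,
                    max_distinct_ratio: float = 0.1) -> str:
    """Pick a DDD-generation strategy from two column statistics.

    Assumption (not from the patent text): a column counts as "small
    cardinality" when distinct values make up at most `max_distinct_ratio`
    of all stored symbols.
    """
    if total_symbols <= 0:
        raise ValueError("total_symbols must be positive")
    if not 0 <= distinct_values <= total_symbols:
        raise ValueError("distinct_values must be between 0 and total_symbols")

    ratio = distinct_values / total_symbols
    # Few distinct values: transform them all before fetching any symbol.
    # Many distinct values: transform each one only when first fetched.
    return "initial" if ratio <= max_distinct_ratio else "incremental"

# A million rows drawn from 50 values versus a nearly unique column.
low_cardinality_choice = select_strategy(1_000_000, 50)
high_cardinality_choice = select_strategy(1_000, 900)
```

A ratio is only one way to perform the comparison; an absolute cap on the number of distinct values would also fit the wording of the claim.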

Assignees

IBM

Inventors

Classifications

  • Coding table adaptation · CPC title

  • using table look-up for the coding or decoding process, e.g. using read-only memory {(H03M7/4006 takes precedence)} · CPC title

  • Ensuring data consistency and integrity · CPC title

  • H03M7/4037 · Primary

    Prefix coding · CPC title

  • Column-oriented storage; Management thereof · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11042544B2 cover?
A database-management system evaluates a query that retrieves and transforms encoded symbols stored in a database. If the stored symbols assume a relatively small set of distinct values, the system initially performs the transformation on every value in the set. During execution of subsequent queries, rather than performing the transformation upon every stored symbol fetched from the database, …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification H03M7/4037. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jun 22, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).