Distributed relational dictionaries

US10810195B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10810195-B2
Application number  US-201815861212-A
Country             US
Kind code           B2
Filing date         Jan 3, 2018
Priority date       Jan 3, 2018
Publication date    Oct 20, 2020
Grant date          Oct 20, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques related to distributed relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves generating, by a query optimizer at a distributed database system (DDS), a query execution plan (QEP) for generating a code dictionary and a column of encoded database data. The QEP specifies a sequence of operations for generating the code dictionary. The code dictionary is a database table. The method further involves receiving, at the DDS, a column of unencoded database data from a data source that is external to the DDS. The DDS generates the code dictionary according to the QEP. Furthermore, based on joining the column of unencoded database data with the code dictionary, the DDS generates the column of encoded database data according to the QEP.
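The two steps named in the abstract (generate a code dictionary, then join the unencoded column with it to produce the encoded column) can be illustrated with a minimal single-process sketch. This is not the patented implementation: the function names, the choice of sorted order for assigning codes, and the use of zero-based integer codes are all assumptions made for illustration, and nothing here is distributed.

```python
def build_code_dictionary(column):
    """Build a dictionary "table" as a list of (value, code) rows.

    Assumption: codes are zero-based positions in the sorted list of
    distinct values. The abstract does not specify how codes are assigned.
    """
    distinct_values = sorted(set(column))
    return [(value, code) for code, value in enumerate(distinct_values)]


def encode_column(column, dictionary_rows):
    """Join each unencoded value with the dictionary on value; emit its code.

    The dict acts as the build side of a simple hash join.
    """
    lookup = dict(dictionary_rows)
    return [lookup[value] for value in column]


# Hypothetical unencoded column received from an external data source.
unencoded = ["red", "blue", "red", "green", "blue"]

code_dictionary = build_code_dictionary(unencoded)
encoded = encode_column(unencoded, code_dictionary)

print(code_dictionary)
print(encoded)
```

Keeping the dictionary as rows of (value, code) mirrors the abstract's statement that the code dictionary is a database table, so encoding reduces to an ordinary join on the value column.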

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: generating, by a query optimizer at a distributed database system, a first query execution plan for generating a first code dictionary and a first column of encoded database data, said first query execution plan specifying a first sequence of operations for generating said first code dictionary, said first code dictionary being a database table; receiving, at said distributed database system, a first column of unencoded database data from a data source external to said distributed database system; generating, at said distributed database system, said first code dictionary according to said first query execution plan; based on joining said first column of unencoded database data with said first code dictionary, generating, at said distributed database system, said first column of encoded database data according to said first query execution plan; and wherein the method is performed by one or more computing devices.

2. The method of claim 1, wherein said data source external to said distributed database system is a centralized database system.

3. The method of claim 1, further comprising sending, to said data source external to said distributed database system, an indication that said first column of encoded database data has been generated according to said first query execution plan.

4. The method of claim 1, further comprising, after generating said first column of encoded database data according to said first query execution plan, causing said first code dictionary to be defined by a database dictionary of said data source external to said distributed database system.

5. The method of claim 1, further comprising, prior to generating said first query execution plan, making a cost-based decision to generate said first column of encoded database data at said distributed database system instead of at said data source external to said distributed database system.

6. The method of claim 5, wherein said cost-based decision is based on at least one of a group comprising column characteristics, available memory on a particular node of said distributed database system, and cost of data transfer among nodes of said distributed database system.

7. The method of claim 1, wherein said first sequence of operations comprises performing a de-duplication operation before performing a ranking operation.

8. The method of claim 1, wherein said first sequence of operations comprises performing a ranking operation before performing a de-duplication operation.

9. The method of claim 1, wherein said first sequence of operations comprises a partitioning operation for distributing unencoded database data across multiple cores of a node of said distributed database system.

10. The method of claim 1, further comprising generating, by said query optimizer at said distributed database system, a second query execution plan for generating a second code dictionary and a second column of encoded database data, said second query execution plan specifying a second sequence of operations for generating said second code dictionary, said second sequence of operations being different from said first sequence of operations, said second code dictionary being a database table.

11. One or more non-transitory storage media storing a sequence of instructions which, when executed by one or more computing devices, cause: generating, by a query optimizer at a distributed database system, a first query execution plan for generating a first code dictionary and a first column of encoded database data, said first query execution plan specifying a first sequence of operations for generating said first code dictionary, said first code dictionary being a database table; receiving, at said distributed database system, a first column of unencoded database data from a data source external to said distributed database system; generating, at said distributed database system, said first code dictionary according to said first query execution plan; and based on joining said first column of unencoded database data with said first code dictionary, generating, at said distributed database system, said first column of encoded database data according to said first query execution plan.

12. The one or more non-transitory storage media of claim 11, wherein said data source external to said distributed database system is a centralized database system.

13. The one or more non-transitory storage media of claim 11, wherein said sequence of instructions further comprise instructions, which when executed by said one or more computing devices, cause sending, to said data source external to said distributed database system, an indication that said first column of encoded database data has been generated according to said first query execution plan.

14. The one or more non-transitory storage media of claim 11, wherein said sequence of instructions further comprise instructions, which when executed by said one or more computing devices, cause, after generating said first column of encoded database data according to said first query execution plan, causing said first code dictionary to be defined by a database dictionary of said data source external to said distributed database system.

15. The one or more non-transitory storage media of claim 11, wherein said sequence of instructions further comprise instructions, which when executed by said one or more computing devices, cause, prior to generating said first query execution plan, making a cost-based decision to generate said first column of encoded database data at said distributed database system instead of at said data source external to said distributed database system.

16. The one or more non-transitory storage media of claim 15, wherein said cost-based decision is based on at least one of a group comprising column characteristics, available memory on a particular node of said distributed database system, and cost of data transfer among nodes of said distributed database system.

17. The one or more non-transitory storage media of claim 11, wherein said first sequence of operations comprises performing a de-duplication operation before performing a ranking operation.

18. The one or more non-transitory storage media of claim 11, wherein said first sequence of operations comprises performing a ranking operation before performing a de-duplication operation.

19. The one or more non-transitory storage media of claim 11, wherein said first sequence of operations comprises a partitioning operation for distributing unencoded database data across multiple cores of a node of said distributed database system.

20. The one or more non-transitory storage media of claim 11, wherein said sequence of instructions further comprise instructions which, when executed by said one or more computing devices, cause generating, by said query optimizer at said distributed database system, a second query execution plan for generating a second code dictionary and a second column of encoded database data, said second query execution plan specifying a second sequence of operations for generating said second code dictionary, said second sequence of operations being different from said first sequence of operations, said second code dictionary being a database table.
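Claims 7 and 8 (and 17 and 18) describe two orderings of de-duplication and ranking, and claims 9 and 19 describe a partitioning operation. The following is a hedged sketch of how such alternative operation sequences might look in a single process. All function names are hypothetical, ranking is assumed to mean a dense rank over sorted values, and hash partitioning is an assumed partitioning scheme; the claims do not specify any of these details.

```python
def dedup_then_rank(column):
    """De-duplicate first, then rank the distinct values (as in claim 7)."""
    distinct = set(column)
    return {value: code for code, value in enumerate(sorted(distinct))}


def rank_then_dedup(column):
    """Dense-rank every row first, then de-duplicate (as in claim 8)."""
    ranked_rows = []
    code = -1
    previous = object()  # sentinel that compares unequal to any value
    for value in sorted(column):
        if value != previous:
            code += 1
            previous = value
        ranked_rows.append((value, code))
    # Duplicate rows carry identical codes, so collapsing them is lossless.
    return dict(ranked_rows)


def hash_partition(column, num_partitions):
    """Split rows into partitions so equal values land together.

    Assumption: one partition per core; the claims only say that data is
    distributed across multiple cores of a node.
    """
    partitions = [[] for _ in range(num_partitions)]
    for value in column:
        partitions[hash(value) % num_partitions].append(value)
    return partitions


def partitioned_dedup(column, num_partitions):
    """De-duplicate each partition independently, then union the results."""
    distinct = set()
    for part in hash_partition(column, num_partitions):
        distinct |= set(part)
    return distinct


# Hypothetical unencoded column.
column = ["pear", "apple", "pear", "fig", "apple", "apple"]

dictionary_a = dedup_then_rank(column)
dictionary_b = rank_then_dedup(column)
distinct_values = partitioned_dedup(column, 4)
```

In this sketch both orderings yield the same dictionary; which one is cheaper would depend on factors such as the share of duplicate values in the column, which is the kind of trade-off a query optimizer could weigh when choosing between plans.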

Assignees

Oracle Int Corp

Inventors

Classifications

  • for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title

  • Query optimisation · CPC title

  • Tablespace storage structures; Management thereof · CPC title

  • Column-oriented storage; Management thereof · CPC title

  • Dictionaries · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10810195B2 cover?
Techniques related to distributed relational dictionaries are disclosed. In some embodiments, one or more non-transitory storage media store a sequence of instructions which, when executed by one or more computing devices, cause performance of a method. The method involves generating, by a query optimizer at a distributed database system (DDS), a query execution plan (QEP) for generating a code…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification H04L67/1097. Mapped technology areas include Electricity.
When was this patent published?
Publication date Oct 20, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).