Mixed join of row and column database tables in native orientation

US9965500B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9965500-B2
Application numberUS-201113323530-A
CountryUS
Kind codeB2
Filing dateDec 12, 2011
Priority dateDec 12, 2011
Publication dateMay 8, 2018
Grant dateMay 8, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A mixed join between database column and row tables employs an algorithm that recognizes both row and column store, and is executable upon the data in its native form (row or column) without requiring conversion between orientations. The native mixed join algorithm accesses the column dictionary of the column table for efficient join processing. The native mixed join algorithm may also exploit an inverted index (if present) to search a record (e.g. docid) with a given value. In particular, the native mixed join algorithm looks up a column dictionary for a join condition, while iterating the row table and returning matched records in a pipelined manner.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method comprising: accessing in a non-transitory computer readable storage medium, a database created in an application level language and comprising a row store table and a column store table having a column dictionary; retrieving one or more records from the row store table; generating a value array from data in the records retrieved from the row store table; using the value array, accessing a plurality of value ids from the column dictionary of the column store table that correspond to values in the value array; using the value ids and the row store table, generating a vid-rid table comprising a column that contains the value ids from the column store table and a column that contains row ids of rows in the row store table that contain the values corresponding to the values ids; using the value ids, generating a doc id array of doc ids of records in the column store table that contain the value ids; and performing at least an inner join operation using the doc id array and row ids in the vid-rid table to produce a plurality of row id and doc id pairs. 2. The computer-implemented method of claim 1 further comprising referencing an inverted index of the column store table to obtain the join condition. 3. The computer-implemented method of claim 1 wherein the join condition comprises a foreign key join such that columns of the row store table do not have duplicate values. 4. The computer-implemented method of claim 1 wherein the join condition comprises N-to-M join, the method further comprising maintaining a hash table for iterated values from the row store table so as to avoid referencing the column dictionary for a same value. 5. The computer-implemented method of claim 1 wherein retrieving data from the column dictionary includes generating a join condition using data from the row store table and data from the column store table. 6. The computer-implemented method of claim 5 wherein generating a join condition includes creating a mapping between one or more row identifiers from the data from the row store table and data from the column store table. 7. The computer-implemented method of claim 1 wherein more than one record is retrieved from the row store table. 8. The computer-implemented method of claim 1 , wherein the method is performed in pipeline fashion. 9. A non-transitory computer readable storage medium embodying a computer program for performing a method, said method comprising: accessing in a non-transitory computer readable storage medium, a database created in an application level language and comprising a row store table and a column store table having a column dictionary; retrieving one or more records from the row store table; generating a value array from data in the records retrieved from the row store table; using the value array, accessing a plurality of value ids from the column dictionary of the column store table that correspond to values in the value array; using the value ids and the row store table, generating a vid-rid table comprising a column that contains the value ids from the column store table and a column that contains row ids of rows in the row store table that contain the values corresponding to the values ids; using the value ids, generating a doc id array of doc ids of records in the column store table that contain the value ids; and performing at least an inner join operation using the doc id array and row ids in the vid-rid table to produce a plurality of row id and doc id pairs. 10. The non-transitory computer readable storage medium of claim 9 further comprising code to reference an inverted index of the column store table to obtain the join condition. 11. The non-transitory computer readable storage medium of claim 9 wherein the join condition comprises a foreign key join such that columns of the row store table do not have duplicate values. 12. The non-transitory computer readable storage medium of claim 9 wherein the join condition comprises N-to-M join, the non-transitory computer readable storage medium further comprising code maintaining a hash table for iterated values from the row store table so as to avoid referencing the column dictionary for a same value. 13. The non-transitory computer readable storage medium of claim 9 , wherein the method is performed in pipeline fashion. 14. A computer system comprising: one or more processors; a software program, executable on said computer system, the software program configured to: access in a non-transitory computer readable storage medium, a database created in an application level language and comprising a row store table and a column store table having a column dictionary; retrieve one or more records from the row store table; generate a value array from data in the records retrieved from the row store table; use the value array to access a plurality of value ids from the column dictionary of the column store table that correspond to values in the value array; use the value ids and the row store table to generate a vid-rid table comprising a column that contains the value ids from the column store table and a column that contains row ids of rows in the row store table that contain the values corresponding to the values ids; use the value ids to generate a doc id array of doc ids of records in the column store table that contain the value ids; and perform at least an inner join operation using the doc id array and row ids in the vid-rid table to produce a plurality of row id and doc id pairs. 15. The computer system of claim 14 wherein the software code is configured to reference an inverted index of the column store table to obtain the join condition. 16. The computer system of claim 14 wherein the join condition comprises a foreign key join such that columns of the row store table do not have duplicate values. 17. The computer system of claim 14 wherein the join condition comprises an N-to-M join, the non-transitory computer readable storage medium further comprising code maintaining a hash table for iterated values from the row store table so as to avoid referencing the column dictionary for a same value. 18. The computer system of claim 14 , wherein the software program is configured to execute in pipeline fashion.

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9965500B2 cover?
A mixed join between database column and row tables employs an algorithm that recognizes both row and column store, and is executable upon the data in its native form (row or column) without requiring conversion between orientations. The native mixed join algorithm accesses the column dictionary of the column table for efficient join processing. The native mixed join algorithm may also exploit …
Who is the assignee on this patent?
Yoon Yongsik, Jeong Chanho, Cha Sang Kyun, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F16/221. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue May 08 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).