Method and system for outer join of database tables

US10885035B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10885035-B2
Application numberUS-202016877394-A
CountryUS
Kind codeB2
Filing dateMay 18, 2020
Priority dateApr 26, 2019
Publication dateJan 5, 2021
Grant dateJan 5, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The specification provides a system, method, and storage medium for executing an outer join command on a first table and a second table in a distributed database. The method may comprise: broadcasting the first table to a plurality of second nodes storing the second table; selecting one or more second nodes from the plurality of second nodes; obtaining, by each of the selected one or more second nodes, an outer join result after executing an outer join operation; obtaining, by each of the plurality of second nodes other than the selected one or more second nodes, an inner join result; removing a row with (1) a join key value and (2) a NULL value from the outer join result when the inner join result comprises another row with the join key; and returning, after the removal, a result to the outer join command by aggregating the outer join result and the inner join result.

First claim

Opening claim text (preview).

The invention claimed is: 1. A computer-implemented method for executing an OUTER JOIN command on a first table and a second table in a distributed database, wherein the method comprises: broadcasting the first table to a plurality of database nodes of a distributed database storing the second table, wherein the second table is a distributed database table, and each of the plurality of database nodes stores a portion of the second table; dividing the plurality of database nodes storing the second table into a first subset and a second subset, wherein the first subset and the second subset each comprises less than all of the plurality of database nodes; obtaining, by each database node of the first subset, an OUTER JOIN result by executing an OUTER JOIN operation based on the first table broadcasted to the each database node of the first subset and the portion of the second table stored in the each database node of the first subset; obtaining, by each database node of the second subset, an INNER JOIN result by executing an INNER JOIN operation based on the first table broadcasted to the each database node of the second subset and the portion of the second table stored in the each database node of the second subset identifying and removing a first row with a NULL value from the OUTER JOIN result when the INNER JOIN result comprises a second row sharing a same join key with the first row; and returning a result to the OUTER JOIN command by aggregating the OUTER JOIN result from each database node of the first subset and the INNER JOIN result from each database node of the second subset. 2. The method according to claim 1 , wherein the identifying and removing a first row with a NULL value from the OUTER JOIN result when the INNER JOIN result comprises a second row sharing a same join key with the first row comprises: performing a GROUP BY operation to the OUTER JOIN result and the INNER JOIN result based on one or more join keys of the OUTER JOIN command. 3. The method according to claim 1 , wherein the dividing the plurality of database nodes storing the second table into a first subset and a second subset comprises: selecting one or more of the plurality of database nodes storing a-smallest portions of the second table as the first subset. 4. The method according to claim 1 , wherein the dividing the plurality of database nodes storing the second table into a first subset and a second subset comprises: selecting one or more of the plurality of database nodes with a fastest connection as the first subset. 5. The method according to claim 1 , wherein the first table is a distributed database table stored in a plurality of second database nodes, and the broadcasting the first table to a plurality of database nodes storing the second table comprises: broadcasting data from each database node of the plurality of second database nodes to each database node of the plurality of database nodes. 6. The method according to claim 1 , wherein the executing the OUTER JOIN operation based on the first table broadcasted to the each database node of the first subset and the portion of the second table stored in the each database node of the first subset comprises: performing the OUTER JOIN operation by using the first table broadcasted to the each database node of the first subset as an outer table and the portion of the second table stored on the each database node of the first subset as an inner table. 7. The method according to claim 1 , wherein: the OUTER JOIN operation executed by each of database nodes of the first subset is based on the one or more join keys of the OUTER JOIN command; and the INNER JOIN operation executed by each of database nodes of the second subset is based on the one or more join keys of the OUTER JOIN command. 8. A system for executing an outer join command on a first table and a second table in a distributed database, comprising a processor and a non-transitory computer-readable storage medium storing instructions executable by the processor to cause the system to perform operations comprising: broadcasting the first table to a plurality of database nodes of a distributed database storing the second table, wherein the second table is a distributed database table, and each of the plurality of database nodes stores a portion of the second table; dividing the plurality of database nodes storing the second table into a first subset and a second subset, wherein the first subset and the second subset each comprises less than all of the plurality of database nodes; obtaining, by each database node of the first subset, an OUTER JOIN result by executing an OUTER JOIN operation based on the first table broadcasted to the each database node of the first subset and the portion of the second table stored in the each database node of the first subset; obtaining, by each database node of the second subset, an INNER JOIN result by executing an INNER JOIN operation based on the first table broadcasted to the each database node of the second subset and the portion of the second table stored in the each database node of the second subset; identifying and removing a first row with a NULL value from the OUTER JOIN result when the INNER JOIN result comprises a second row sharing a same join key with the first row; and returning a result to the OUTER JOIN command by aggregating the OUTER JOIN result from each database node of the first subset and the INNER JOIN result from each database node of the second subset. 9. The system of claim 8 , wherein the identifying and removing a first row with a NULL value from the OUTER JOIN result when the INNER JOIN result comprises a second row sharing a same join key with the first row comprises: performing a GROUP BY operation to the OUTER JOIN result and the INNER JOIN result based on one or more join keys of the OUTER JOIN command. 10. The system of claim 8 , wherein the dividing the plurality of database nodes storing the second table into a first subset and a second subset comprises: selecting one or more of the plurality of database nodes storing a-smallest portions of the second table as the first subset. 11. The system of claim 8 , wherein the dividing the plurality of database nodes storing the second table into a first subset and a second subset comprises: selecting one or more of the plurality of database nodes with a fastest connection as the first subset. 12. The system of claim 8 , wherein the first table is a distributed database table stored in a plurality of second database nodes, and the broadcasting the first table to a plurality of database nodes storing the second table comprises: broadcasting data from each database node of the plurality of second database nodes to each database node of the plurality of database nodes. 13. The system of claim 8 , wherein the executing the OUTER JOIN operation based on the first table broadcasted to the each database node of the first subset and the portion of the second table stored in the each database node of the first subset comprises: performing the OUTER JOIN operation by using the first table broadcasted to the each database node of the first subset as an outer table and the portion of the second table stored on the each database node of the first subset as an inner table. 14. A non-transitory computer-readable storage medium for executing an outer join command on a first table and a second table in a distributed database, configured with instructions executable by one or more processors to cause the one or more processors to perform operations comprising: broadcasting the first table to a plurality of database nodes of a

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10885035B2 cover?
The specification provides a system, method, and storage medium for executing an outer join command on a first table and a second table in a distributed database. The method may comprise: broadcasting the first table to a plurality of second nodes storing the second table; selecting one or more second nodes from the plurality of second nodes; obtaining, by each of the selected one or more secon…
Who is the assignee on this patent?
Advanced New Technologies Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F16/24544. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 05 2021 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).