Switching between a non-partitioned hash join and a partitioned hash join based on an amount of available memory

US10157193B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10157193-B2
Application numberUS-201615060086-A
CountryUS
Kind codeB2
Filing dateMar 3, 2016
Priority dateMar 3, 2016
Publication dateDec 18, 2018
Grant dateDec 18, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method implemented by at least one processing device, a processing device, and a computer program product are provided for adjusting hash partitions for a hash join operation. An amount of rows within each of an initial number of hash partitions is determined during assignment of respective rows to respective hash partitions. The initial number of hash partitions is adjusted to a final number of hash partitions based on the amount of rows within each of the initial number of hash partitions and an amount of available memory. The adjusting further includes determining the final number of hash partitions based on the amount of rows within each of the initial number of hash partitions and the amount of available memory, and assigning the rows to the final number of hash partitions. The hash join operation is then performed.

First claim

Opening claim text (preview).

We claim as our invention: 1. A processing device comprising: at least one processor; a memory; and a communication bus connecting the at least one processor with the memory, wherein the memory has stored therein instructions, which when executed by the at least one processor cause the processing device to perform a method comprising: creating simulated hash partitions, each hash partition of an initial number of hash partitions corresponding to a respective plurality of simulated hash partitions; determining an amount of rows of each of the simulated hash partitions; determining an amount of rows within each respective hash partition of the initial number of hash partitions, the determined amount of rows of the each respective hash partition being based on partitioning inner leg rows of a hash join operation into the initial number of hash partitions, the initial number of hash partitions being based on static information and a historical amount of available memory; adjusting the initial number of hash partitions to a final number of hash partitions based on the amount of rows within each of the initial number of hash partitions and an amount of available memory at runtime, the adjusting further comprising: determining the final number of hash partitions based on the amount of rows within each of the initial number of hash partitions and the amount of available memory at runtime, the determining the final number of hash partitions further comprising: when each of the initial number of hash partitions fits into the amount of available memory at runtime, performing:  combining ones of the initial number of hash partitions when the combined ones of the initial number of hash partitions are determined to fit into the amount of available memory at runtime, resulting in the final number of hash partitions being less than the initial number of hash partitions, thereby increasing processing performance during the hash join operation by decreasing a frequency at which respective hash partitions are loaded into the memory; and when at least one of the hash partitions does not fit into the amount of available memory at runtime, performing:  increasing an amount of hash partitions to the final number of hash partitions, the final number being larger than a current number of hash partitions so that each of the final number of hash partitions fits into the amount of available memory at runtime; and performing the hash join operation with the final number of hash partitions. 2. The processing device of claim 1 , wherein the determined amount of rows of the each respective hash partition is based on the determined amount of rows of the corresponding simulated hash partitions. 3. The processing device of claim 1 , wherein an amount of rows corresponding to each of the final number of hash partitions is based on the determined amount of rows of each of the corresponding simulated hash partitions. 4. The processing device of claim 2 , wherein the final number of hash partitions is one hash partition. 5. The processing device of claim 1 , wherein each of the final number of hash partitions includes a respective distinct hash table. 6. A computer program product comprising: a computer readable storage medium having computer readable program code embodied therewith for execution on a processing system, the computer readable program code being configured to be executed by the processing system to: create simulated hash partitions, each hash partition of an initial number of hash partitions corresponding to a respective plurality of simulated hash partitions; determining an amount of rows of each of the simulated hash partitions; determine an amount of rows within each respective hash partition of the initial number of hash partitions, the determined amount of rows of the each respective hash partition being based on partitioning inner leg rows of a hash join operation into the initial number of hash partitions, the initial number of hash partitions being based on static information and a historical amount of available memory; adjust the initial number of hash partitions to a final number of hash partitions based on the amount of rows within each of the initial number of hash partitions and an amount of available memory at runtime, the adjusting further comprising: determine the final number of hash partitions based on the amount of rows within each of the initial number of hash partitions and the amount of available memory at runtime, the determining the final number of hash partitions further comprising: when each of the initial number of hash partitions fits into the amount of available memory at runtime:  combine ones of the initial number of hash partitions when the combined ones of the initial number of hash partitions are determined to fit into the amount of available memory at runtime, resulting in the final number of hash partitions being less than the initial number of hash partitions, thereby increasing processing performance during the hash join operation by decreasing a frequency at which respective hash partitions are loaded into the memory; and when at least one of the hash partitions does not fit into the amount of available memory at runtime:  increase an amount of hash partitions to the final number of hash partitions, the final number being larger than a current number of hash partitions so that each of the final number of hash partitions fits into the amount of available memory at runtime; and perform the hash join operation with the final number of hash partitions. 7. The computer program product of claim 6 , wherein the determined amount of rows of the each respective hash partition is based on the determined amount of rows of the corresponding simulated hash partitions. 8. The computer program product of claim 6 , wherein an amount of rows corresponding to each of the final number of hash partitions is based on the determined amount of rows of each of the corresponding simulated hash partitions. 9. The computer program product of claim 6 , wherein each of the final number of hash partitions includes a respective distinct hash table.

Assignees

Inventors

Classifications

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Join operations · CPC title

  • Physics · mapped topic

  • Physics · mapped topic

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10157193B2 cover?
A method implemented by at least one processing device, a processing device, and a computer program product are provided for adjusting hash partitions for a hash join operation. An amount of rows within each of an initial number of hash partitions is determined during assignment of respective rows to respective hash partitions. The initial number of hash partitions is adjusted to a final number…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/2456. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Dec 18 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).