System and method for massively parallel processor database

US9959332B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9959332-B2
Application number  US-201514601679-A
Country             US
Kind code           B2
Filing date         Jan 21, 2015
Priority date       Jan 21, 2015
Publication date    May 1, 2018
Grant date          May 1, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a method includes determining a number of initial servers in a massively parallel processing (MPP) database cluster and determining an initial bucket configuration of the MPP database cluster, where the initial bucket configuration has a number of initial buckets. The method also includes adding a number of additional servers to the MPP database cluster to produce a number of updated servers, where the updated servers include the initial servers and the additional servers and creating an updated bucket configuration in accordance with the number of initial servers, the initial bucket configuration, and the number of additional servers, where the updated bucket configuration has a number of updated buckets. Additionally, the method includes redistributing data of the MPP cluster in accordance with the updated bucket configuration.
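
The flow in the abstract (start from an initial bucket configuration, add servers, derive an updated configuration, then move data) can be illustrated with a small Python sketch. This is not the patent's algorithm, which is not reproduced on this page. It assumes a round-robin initial layout and a greedy rule that moves one bucket at a time from the fullest server to the emptiest one. The names initial_mapping and rebalance and the server labels are hypothetical.

```python
def initial_mapping(num_buckets, servers):
    """Assumed starting layout: bucket ids dealt round-robin to the initial servers."""
    return {b: servers[b % len(servers)] for b in range(num_buckets)}


def rebalance(mapping, new_servers):
    """Greedy sketch: shift buckets to the added servers until loads differ by at most one.

    Returns (updated_mapping, moved), where moved lists (bucket, from_server, to_server).
    Only the moved buckets would need their data shipped to another server.
    """
    owned = {}
    for bucket in sorted(mapping):
        owned.setdefault(mapping[bucket], []).append(bucket)
    for server in new_servers:
        owned.setdefault(server, [])  # added servers start with no buckets

    moved = []
    while True:
        donor = max(owned, key=lambda s: len(owned[s]))
        receiver = min(owned, key=lambda s: len(owned[s]))
        if len(owned[donor]) - len(owned[receiver]) <= 1:
            break  # every server now holds either the minimum or minimum + 1 buckets
        bucket = owned[donor].pop()
        owned[receiver].append(bucket)
        moved.append((bucket, donor, receiver))

    updated = {b: s for s, buckets in owned.items() for b in buckets}
    return updated, moved


before = initial_mapping(16, ["s1", "s2", "s3"])
after, moved = rebalance(before, ["s4"])
print(len(moved), "of", len(before), "buckets change servers")
```

In this example only the buckets handed to s4 change owner; every other bucket stays on its original server, which is the point of keeping the initial buckets in the updated configuration.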

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: determining a quantity of initial servers in a massively parallel processing (MPP) database cluster; determining a configuration of initial buckets of the MPP database cluster, wherein the configuration of initial buckets comprises a quantity of initial buckets; adding at least one additional server to the MPP database cluster to produce updated servers, wherein the updated servers comprise the initial servers and the at least one additional server; creating a configuration of updated buckets comprising the initial buckets in accordance with the quantity of initial servers, the configuration of initial buckets, and a quantity of additional servers, wherein the configuration of updated buckets identifies a subset of buckets of the initial buckets, with the subset of buckets being transmitted to the at least one additional server from the initial servers; and redistributing, based on the configuration of updated buckets, data from the initial servers to the at least one additional server, with the data being associated with the subset of buckets.

2. The method of claim 1, wherein the configuration of updated buckets comprises a mapping of buckets to the updated servers, wherein each of the updated servers has either a minimum number of buckets or a maximum number of buckets, wherein the maximum number of buckets is one bucket more than the minimum number of buckets.

3. The method of claim 1, wherein creating the configuration of updated buckets comprises determining whether the quantity of updated buckets is greater than the quantity of initial buckets.

4. The method of claim 3, wherein determining whether the quantity of updated buckets is greater than the quantity of initial buckets comprises: determining a redistributed bucket configuration having the quantity of updated buckets and the quantity of initial buckets; determining a percentage load variation of the redistributed bucket configuration; determining whether the percentage load variation is acceptable; setting the quantity of updated buckets to be greater than the quantity of initial buckets when the percentage load variation is not acceptable; and setting the quantity of updated buckets to the quantity of initial buckets when the percentage load variation is acceptable.

5. The method of claim 1, wherein the quantity of updated buckets is a power of two of the quantity of initial buckets.

6. The method of claim 1, wherein the quantity of updated buckets is two times the quantity of initial buckets.

7. The method of claim 1, wherein the quantity of initial buckets is a power of two.

8. The method of claim 1, wherein redistributing data in a subset of buckets from the initial servers to the at least one additional server comprises: determining an initial bucket-server mapping; determining the subset of buckets to be moved from the initial servers to the at least one additional server; determining data associated with the subset of buckets to produce a subset of data; and redistributing the data associated with the subset of buckets from the initial servers to the at least one additional server.

9. The method of claim 1, further comprising placing data on the initial servers in accordance with the configuration of initial buckets before adding the at least one additional server to the MPP database cluster.

10. The method of claim 9, wherein placing the data on the initial servers comprises: determining a hash value for a row of a table; and determining a bucket associated with the row in accordance with the hash value.

11. A method comprising: determining an updated bucket-server mapping for a massively parallel processor (MPP) database cluster in accordance with a quantity of initial servers and a quantity of additional servers, wherein the updated bucket-server mapping identifies a subset of buckets of the initial buckets, with the subset of buckets being transmitted to the at least one additional server from the initial servers; determining whether a first table is to be redistributed in accordance with the updated bucket-server mapping and an initial bucket-server mapping; starting a first transaction when the first table is to be redistributed; performing the first transaction comprising redistributing data from an initial server of the initial servers to the additional servers, with the data being associated with the subset of buckets; and committing the first transaction after performing the first transaction.

12. The method of claim 11, wherein performing the transaction further comprises: creating a temporary table in accordance with the updated bucket-server mapping; redistributing the data in accordance with the updated bucket-server mapping; and merging the temporary table and the first table after redistributing the data.

13. The method of claim 12, wherein redistributing the data comprises: building delete statements in accordance with a difference between the updated bucket-server mapping and the initial bucket-server mapping; and issuing the delete statements for deleting records from the initial bucket-server mapping.

14. The method of claim 12, wherein redistributing the data comprises: building insert statements in accordance with a difference between the updated bucket-server mapping and the initial bucket-server mapping; and issuing the insert statements for insert records which is deleted from the initial bucket-server mapping to the updated bucket-server mapping.

15. The method of claim 11, further comprising: determining whether a second table is to be redistributed after committing the first transaction; starting a second transaction when the second table is to be redistributed; and removing the initial bucket-server mapping when the second table is not to be redistributed.

16. The method of claim 11, further comprising creating a list of tables to be redistributed, wherein determining whether the first table is to be redistributed comprises determining whether the first table is to be redistributed in accordance with the list of tables.

17. The method of claim 11, further comprising installing the additional servers before determining whether the first table is to be redistributed.

18. A computer comprising: a processor; and a non-transitory computer readable storage medium storing programming for execution by the processor, the programming including instructions to: determine a quantity of initial servers in a massively parallel processing (MPP) database cluster, determine a configuration of initial buckets of the MPP database cluster, wherein the configuration of initial buckets comprises a quantity of initial buckets, add at least one additional server to the MPP database cluster to produce updated servers, wherein the updated servers comprise the initial servers and the at least one additional server, create a configuration of updated buckets comprising the initial buckets in accordance with the quantity of initial servers, the configuration of initial buckets, and a quantity of additional servers, wherein the configuration of updated buckets identifies a subset of buckets of the initial buckets, with the subset of buckets being transmitted to the at least one additional server from the initial servers; and redistribute, based on the configuration of updated buckets, data from the initial servers to the at least one additional server, with the data being associated with the subset of buckets.

19. A computer comprising: a processor; and
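
The claims on hashing rows to buckets and on checking a percentage load variation before growing the bucket count can be sketched in Python as follows. The claim text on this page does not define the load variation formula, the acceptance threshold, or the hash function, so all three are assumptions here: variation is taken as the gap between the fullest and emptiest server as a percentage of the fullest, the threshold defaults to 10 percent, and CRC-32 stands in for the hash. Doubling repeatedly until the check passes is also an assumption. The function names are hypothetical.

```python
import zlib


def bucket_for_row(distribution_key, num_buckets):
    """Map a row to a bucket from a hash of its key (CRC-32 is an assumed stand-in)."""
    return zlib.crc32(str(distribution_key).encode("utf-8")) % num_buckets


def load_variation_pct(num_buckets, num_servers):
    """Assumed metric: (fullest - emptiest) / fullest * 100 for an even-as-possible layout."""
    base, extra = divmod(num_buckets, num_servers)
    fullest = base + (1 if extra else 0)
    return 100.0 * (fullest - base) / fullest


def choose_bucket_count(initial_buckets, updated_servers, max_variation_pct=10.0):
    """Keep the initial bucket count if the variation is acceptable, else double it."""
    buckets = initial_buckets
    while load_variation_pct(buckets, updated_servers) > max_variation_pct:
        buckets *= 2  # assumed growth rule: repeat the two-times step until acceptable
    return buckets


print(choose_bucket_count(16, 4))  # 16 buckets divide evenly over 4 servers
print(choose_bucket_count(16, 5))  # uneven over 5 servers, so the count grows
print(bucket_for_row("order-1001", 16))
```

With a modulo-based bucket function like the one assumed above, doubling the bucket count from N to 2N sends every row in bucket b to either bucket b or bucket b + N, so existing buckets split cleanly rather than reshuffling across the whole table.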

Assignees

Futurewei Technologies Inc

Inventors

Classifications

  • G06F16/27 · Primary

    Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Parallel file systems, i.e. file systems supporting multiple processors · CPC title

  • Physics · mapped topic

  • Database tuning (G06F16/2282 takes precedence; database performance monitoring G06F11/3409) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9959332B2 cover?
In one embodiment, a method includes determining a number of initial servers in a massively parallel processing (MPP) database cluster and determining an initial bucket configuration of the MPP database cluster, where the initial bucket configuration has a number of initial buckets. The method also includes adding a number of additional servers to the MPP database cluster to produce a number of…
Who is the assignee on this patent?
Futurewei Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/27. Mapped technology areas include Physics.
When was this patent published?
Publication date May 1, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).