System and method for data redistribution in a database

US11334422B2 · US · B2

Patent metadata
Field                Value
Publication number   US-11334422-B2
Application number   US-202016737534-A
Country              US
Kind code            B2
Filing date          Jan 8, 2020
Priority date        Aug 3, 2016
Publication date     May 17, 2022
Grant date           May 17, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for data redistribution of a job data in a first datanode (DN) to at least one additional DN in a Massively Parallel Processing (MPP) Database (DB) is provided. The method includes recording a snapshot of the job data, creating a first data portion in the first DN and a redistribution data portion in the first DN, collecting changes to a job data copy stored in a temporary table, and initiating transfer of the redistribution data portion to the at least one additional DN.
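The first two steps of the abstract (recording a snapshot and creating the two portions inside the first DN) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Snapshot` class, the use of row-index ranges as the snapshot contents, and the midpoint split policy are not taken from the patent text, which does not prescribe a data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot: half-open row-index ranges for each portion."""
    first_range: range           # rows that stay on the first DN
    redistribution_range: range  # rows intended for the additional DN


def record_snapshot(row_count):
    """Record where the job data is split (midpoint split is an assumed policy)."""
    mid = row_count // 2
    return Snapshot(range(0, mid), range(mid, row_count))


def create_portions(job_data, snap):
    """Create both portions in the first DN, driven only by the snapshot."""
    first = [job_data[i] for i in snap.first_range]
    redistribution = [job_data[i] for i in snap.redistribution_range]
    return first, redistribution


# Usage: six rows of job data held by the first DN.
job_data = ["r0", "r1", "r2", "r3", "r4", "r5"]
snap = record_snapshot(len(job_data))
first_portion, redistribution_portion = create_portions(job_data, snap)
```

Because the split is described entirely by the snapshot, later steps can decide which portion a given change belongs to without inspecting the portions themselves.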

First claim

Opening claim text (preview).

What is claimed is:

1. A method for data redistribution of job data in a source data node (DN) to at least one destination DN in a Massively Parallel Processing (MPP) Database (DB), comprising:
    recording a snapshot of the job data;
    splitting the job data into a plurality of data portions, the data portions comprising a first data portion and a second data portion, and the snapshot comprising information about the split of the job data;
    collecting changes to a job data copy stored in a temporary table;
    identifying one or more first changes to the first data portion and identifying one or more second changes to the second data portion from the collected changes based on the snapshot;
    initiating transfer of the second data portion to the at least one destination DN;
    merging the identified first changes into the first data portion; and
    merging the identified second changes into the second data portion.

2. The method of claim 1, wherein the temporary table is located in a coordinator node (CN) of the MPP DB.

3. The method of claim 1, wherein the method further comprises: indicating that the source DN can continue processing; and indicating that the at least one destination DN is ready to begin processing.

4. The method of claim 1, wherein merging the identified first changes into the first data portion and merging the identified second changes into the second data portion comprises: in response to success of transferring the second data portion to the at least one destination DN, merging the identified first changes to the first data portion from the temporary table into the first data portion in the source DN, and merging the identified second changes to the second data portion from the temporary table into the second data portion in the at least one destination DN.

5. The method of claim 1, wherein merging the identified first changes into the first data portion and the merging the identified second changes into the second data portion comprises: in response to failure of transferring the second data portion to the at least one destination DN, merging the identified first changes to the first data portion from the temporary table into the first data portion in the source DN, and merging the identified second changes to the second data portion from the temporary table into the second data portion in the source DN.

6. The method of claim 5, further comprising: indicating that the source DN is ready to re-initiate processing in the source DN.

7. The method of claim 5, further comprising: re-trying the data redistribution.

8. The method of claim 1, further comprising: in response to failure of transferring the redistribution second data portion to the at least one destination DN, re-trying the data redistribution.

9. The method of claim 1, wherein the snapshot comprises index ranges of the job data or of a portion or portions of the job data.

10. The method of claim 1, wherein the first data portion and the second data portion are of equal sizes.

11. A device for data redistribution of job data in a source data node (DN) to at least one destination DN in a Massively Parallel Processing (MPP) Database (DB), comprising: a non-transitory memory storage comprising instructions; and one or more processors in communication with the memory storage, wherein the one or more processors execute the instructions to:
    record a snapshot of the job data;
    split the job data into a plurality of data portions, the data portions comprising a first data portion and a second data portion, and the snapshot comprising information about the split of the job data;
    collect changes to a job data copy stored in a temporary table;
    identify one or more first changes to the first data portion and identify one or more second changes to the second data portion from the collected changes based on the snapshot;
    initiate transfer of the second data portion to the at least one destination DN;
    merge the identified first changes into the first data portion; and
    merge the identified second changes into the second data portion.

12. The device of claim 11, wherein the temporary table is located in a coordinator node (CN) of the MPP DB.

13. The device of claim 11, wherein the one or more processors is further configured to: indicate the source DN can continue processing; and indicate that the at least one destination DN is ready to begin processing.

14. The device of claim 11, wherein the one or more processors further execute the instructions to: in response to success of transferring the second data portion to the at least one destination DN, merging the identified first changes to the first data portion from the temporary table into the first data portion in the source DN, and merging the identified second changes to the second data portion from the temporary table into the second data portion in the at least one destination DN.

15. The device of claim 11, wherein the one or more processors further execute the instructions to: in response to failure of transferring the second data portion to the at least one destination DN, merging the identified first changes to the first data portion from the temporary table into the first data portion in the source DN, and merging the identified second changes to the second data portion from the temporary table into the second data portion in the source DN.

16. The device of claim 15, wherein the one or more processors further execute the instructions to: indicate that the source DN is ready to re-initiate processing in the source DN.

17. The device of claim 11, wherein the snapshot comprises index ranges of the job data or of a portion or portions of the job data.

18. The device of claim 11, wherein the first data portion and the second data portion are of equal sizes.

19. A non-transitory computer-readable storage medium, comprising instructions which, when executed by at least one processor, causes the processor to perform the following steps:
    record a snapshot of a job data in a source data node (DN);
    split the job data into a plurality of data portions, the data portions comprising a first data portion and a second data portion, and the snapshot comprising information about the split of the job data;
    collect changes to a job data copy stored in a temporary table;
    identify one or more first changes to the first data portion and identifying one or more second changes to the second data portion from the collected changes based on the snapshot;
    initiate transfer of the second data portion to at least one destination DN;
    merge the identified first changes into the first data portion; and
    merge the identified second changes into the second data portion.

20. The non-transitory computer-readable storage medium of claim 19, wherein the snapshot comprises index ranges of the job data or of a portion or portions of the job data.
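As a reading aid, the steps of claim 1 together with the success and failure branches of claims 4 and 5 can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the patented implementation: `DataNode`, `redistribute`, the `split_key` threshold, the in-memory list standing in for the temporary table, and the treatment of every change as a key/value upsert are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Hypothetical data node holding job data as key -> value rows."""
    name: str
    rows: dict = field(default_factory=dict)


def redistribute(source, destination, split_key, changes, transfer_ok=True):
    """Toy walk-through of the claimed steps; returns the node holding portion two."""
    # Record a snapshot that includes information about the split.
    snapshot = {"split_key": split_key}

    # Split the job data into a first and a second portion.
    first = {k: v for k, v in source.rows.items() if k < split_key}
    second = {k: v for k, v in source.rows.items() if k >= split_key}

    # Collect changes made meanwhile (a list stands in for the temporary table).
    temp_table = list(changes)

    # Use the snapshot to identify which portion each change belongs to.
    first_changes = [(k, v) for k, v in temp_table if k < snapshot["split_key"]]
    second_changes = [(k, v) for k, v in temp_table if k >= snapshot["split_key"]]

    # Initiate the transfer; transfer_ok simulates its outcome.
    if transfer_ok:
        destination.rows.update(second)
        source.rows = dict(first)
        second_home = destination   # success: portion two lives on the destination
    else:
        second_home = source        # failure: portion two stays on the source

    # Merge the identified changes into the portions they belong to.
    for k, v in first_changes:
        source.rows[k] = v
    for k, v in second_changes:
        second_home.rows[k] = v
    return second_home


# Usage: one successful run and one failed run over the same starting data.
pending = [(1, "a2"), (4, "d2"), (5, "e")]

src = DataNode("dn1", {1: "a", 2: "b", 3: "c", 4: "d"})
dst = DataNode("dn2")
redistribute(src, dst, split_key=3, changes=pending)

src_fail = DataNode("dn1", {1: "a", 2: "b", 3: "c", 4: "d"})
dst_fail = DataNode("dn2")
redistribute(src_fail, dst_fail, split_key=3, changes=pending, transfer_ok=False)
```

In the failed run no collected change is lost: both sets of changes are merged back into the source DN, which is what would allow the redistribution to be retried later (claims 7 and 8).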

Assignees

Futurewei Technologies Inc

Inventors

Classifications

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

  • Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40)

  • Using snapshots, i.e. a logical point-in-time copy of the data

  • Integrating or interfacing systems involving database management systems

  • Database-specific techniques

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11334422B2 cover?
A method for data redistribution of a job data in a first datanode (DN) to at least one additional DN in a Massively Parallel Processing (MPP) Database (DB) is provided. The method includes recording a snapshot of the job data, creating a first data portion in the first DN and a redistribution data portion in the first DN, collecting changes to a job data copy stored in a temporary table, and i…
Who is the assignee on this patent?
Futurewei Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F11/0793. Mapped technology areas include Physics.
When was this patent published?
Publication date May 17, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).