Redistributing table data in a database cluster

US11151111B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11151111-B2
Application number  US-201715827660-A
Country             US
Kind code           B2
Filing date         Nov 30, 2017
Priority date       Nov 30, 2017
Publication date    Oct 19, 2021
Grant date          Oct 19, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of relocating data in a distributed database comprises: creating, by one or more processors, a second table in the distributed database, the second table including all columns from a first table; copying, by the one or more processors, a first set of tuples from the first table to the second table; modifying, by the one or more processors, during the copying of the first set of tuples, data of the first table according to a modification; after the copying of the first set of tuples, modifying, by the one or more processors, data of the second table according to the modification; and switching, by the one or more processors, the second table for the first table in a catalog of the distributed database.
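
The sequence in the abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: `Table`, `relocate`, `append_row`, `delete_row`, the `batch_size` parameter, and the dictionary used as a catalog are all hypothetical names chosen for illustration, and the "distributed database" is reduced to Python lists in a single process.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    columns: tuple
    rows: list = field(default_factory=list)


def relocate(catalog, name, pending_changes, batch_size=2):
    """Copy catalog[name] into a new table while changes keep arriving."""
    first = catalog[name]
    second = Table(columns=first.columns)   # second table has all columns of the first
    snapshot = list(first.rows)             # the "first set of tuples"
    replay_log = []

    for start in range(0, len(snapshot), batch_size):
        second.rows.extend(snapshot[start:start + batch_size])
        # Simulate one modification arriving during the copy: it is applied
        # to the first table right away and remembered for later replay.
        if pending_changes:
            change = pending_changes.pop(0)
            change(first)
            replay_log.append(change)

    for change in replay_log:               # after the copy, apply the same
        change(second)                      # modifications to the second table

    catalog[name] = second                  # switch the tables in the catalog
    return second


def append_row(row):
    return lambda table: table.rows.append(row)


def delete_row(row):
    return lambda table: table.rows.remove(row)


catalog = {"orders": Table(("id", "qty"), [(1, 10), (2, 20), (3, 30)])}
old = catalog["orders"]
# An "update" of tuple 1, expressed as a delete followed by an append.
changes = [delete_row((1, 10)), append_row((1, 11))]
new = relocate(catalog, "orders", changes)
```

In this sketch, any changes still left in `pending_changes` after the last batch are simply not applied. A real system would keep draining them until the switch, which the claims address separately.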

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of relocating data among nodes in a distributed database, the method comprising: creating, by one or more processors, a second table on a second node in the distributed database, the second table including all columns from a first table on a first node in the distributed database; copying, by the one or more processors, a first set of tuples from the first table to the second table; receiving, by the one or more processors, a user change to data of the first table, the user change including a modification operation including updating an existing tuple in the first table by appending an updated existing tuple to the data of the first table and deleting the existing tuple in the first table; appending, by the one or more processors, the updated existing tuple to the data of the second table; deleting the existing tuple in the second table; and switching, by the one or more processors, the second table for the first table in a catalog of the distributed database, whereby a subsequent modification operation intended for the first table is executed against the second table.

2. The method of claim 1, further comprising: distributing data for the first table among nodes of the distributed database based on a first hash function; and wherein the copying of the first set of tuples from the first table to the second table comprises distributing data for the first table among nodes of the distributed database based on a second hash function different from the first hash function.

3. The method of claim 1, wherein the creating of the second table comprises creating the second table with an additional column not from the first table, the additional column containing a unique identifier for each tuple in the second table.

4. The method of claim 3, further comprising: prior to the switching of the second table for the first table in the catalog, dropping the additional column.

5. The method of claim 1, further comprising: creating a third table to track deletions from the first table; receiving a request to delete a first tuple from the first table; responsive to the request, adding a second tuple to the third table that includes an identifier of the second tuple; and deleting the first tuple from the data of the first table.

6. The method of claim 5, further comprising: based on the identifier included in the second tuple of the third table and a third tuple of the second table, deleting the third tuple from the second table.

7. The method of claim 1, further comprising: creating a third table to track deletions from the first table; receiving a request to update a first tuple of the first table, the request including modification data; responsive to the request: adding a second tuple to the third table that includes an identifier of the first tuple; and appending a third tuple to the first table that includes the modification data; based on the identifier included in the second tuple of the third table and a fourth tuple of the second table, deleting the fourth tuple from the second table; and appending a fifth tuple to the second table that includes the modification data.

8. The method of claim 1, further comprising: before the switching of the second table for the first table in the catalog of the distributed database, locking the first table.

9. The method of claim 8, further comprising: determining a number of modification operations remaining to be applied to the second table; and wherein the locking of the first table is based on the number of modification operations and a predetermined threshold.

10. The method of claim 1, further comprising: determining a number of modification operations remaining to be applied to the second table; and based on the number of modification operations remaining to be applied to the second table and a predetermined threshold, applying at least a subset of the modification operations to the second table without locking the first table.

11. A distributed database, comprising: a plurality of data storage nodes; a memory storage comprising instructions; and one or more processors in communication with the memory and with the plurality of data storage nodes, wherein the one or more processors execute the instructions to perform: creating a second table on a second node in the distributed database, the second table including all columns from a first table on a first node in the distributed database; copying a first set of tuples from the first table to the second table; receiving a user change to data of the first table, the user change including a modification operation including updating an existing tuple in the first table by appending an updated existing tuple to the data of the first table and deleting the existing tuple in the first table; appending, by the one or more processors, the updated existing tuple to the data of the second table; deleting the existing tuple in the second table; switching the second table for the first table in a catalog of the distributed database, whereby a subsequent modification operation intended for the first table is executed against the second table.

12. The distributed database of claim 11, wherein the one or more processors further perform: distributing data for the first table among nodes of the distributed database based on a first hash function; and wherein the copying of the first set of tuples from the first table to the second table comprises distributing data for the first table among nodes of the distributed database based on a second hash function different from the first hash function.

13. The distributed database of claim 11, wherein the creating of the second table comprises creating the second table with an additional column not from the first table, the additional column containing a unique identifier for each tuple in the second table.

14. The distributed database of claim 13, wherein the one or more processors further perform: prior to the switching of the second table for the first table in the catalog, dropping the additional column.

15. The distributed database of claim 11, wherein the one or more processors further perform: creating a third table to track deletions from the first table; receiving a request to delete a first tuple from the first table; responsive to the request, adding a second tuple to the third table that includes an identifier of the second tuple; and deleting the first tuple from the data of the first table.

16. The distributed database of claim 15, further comprising: based on the identifier included in the second tuple of the third table and a third tuple of the second table, deleting the third tuple from the second table.

17. The distributed database of claim 11, wherein the one or more processors further perform: creating a third table to track deletions from the first table; receiving a request to update a first tuple of the first table, the request including modification data; responsive to the request: adding a second tuple to the third table that includes an identifier of the first tuple; and appending a third tuple to the first table that includes the modification data; based on the identifier included in the second tuple of the third table and a fourth tuple of the second table, deleting the fourth tuple from the second table; and appending a fifth tuple to the second table that includes the modification data.

18. A non-transitory computer-readable medium storing computer instructions for relocati
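
The deletion tracking and threshold-based locking described in claims 5 to 10 can be sketched roughly as follows. This is an illustrative, single-process model under several assumptions: integer tuple identifiers, a `deque` standing in for the "third table", a dictionary keyed by tuple identifier standing in for the second table with its additional identifier column, and the hypothetical names `Source`, `catch_up`, `pending`, `finish`, and `between_rounds`. None of these names come from the patent.

```python
import threading
from collections import deque


class Source:
    """First table: append-only rows keyed by a hypothetical tuple id."""

    def __init__(self, values):
        self.rows = {}
        self.next_id = 1
        self.delete_log = deque()        # stands in for the "third table"
        self.lock = threading.Lock()
        for value in values:
            self.append(value)

    def append(self, value):
        self.rows[self.next_id] = value
        self.next_id += 1

    def delete(self, tuple_id):
        self.delete_log.append(tuple_id)  # record the identifier first
        del self.rows[tuple_id]

    def update(self, tuple_id, value):
        self.delete(tuple_id)             # an update is a delete plus an append
        self.append(value)


def catch_up(source, target, copied_up_to):
    """Apply logged deletes and newly appended rows to the second table."""
    while source.delete_log:
        target.pop(source.delete_log.popleft(), None)
    for tuple_id in range(copied_up_to, source.next_id):
        if tuple_id in source.rows:
            target[tuple_id] = source.rows[tuple_id]
    return source.next_id


def pending(source, copied_up_to):
    """Number of modification operations not yet applied to the second table."""
    return len(source.delete_log) + (source.next_id - copied_up_to)


def finish(source, target, copied_up_to, threshold, between_rounds=lambda: None):
    # While the backlog exceeds the threshold, catch up without locking.
    while pending(source, copied_up_to) > threshold:
        copied_up_to = catch_up(source, target, copied_up_to)
        between_rounds()                  # writers could add more work here
    # Backlog is small: lock the first table for the final round.
    with source.lock:
        catch_up(source, target, copied_up_to)
        # Dropping the identifier column leaves rows ready for the catalog switch.
        return list(target.values())


src = Source(["a", "b", "c"])
dst = dict(src.rows)        # initial copy; keys play the role of the extra id column
mark = src.next_id          # everything below this id has been copied
src.update(2, "B")          # logs id 2 as deleted, appends a new tuple
src.delete(3)               # logs id 3 as deleted
result = finish(src, dst, mark, threshold=1)
```

In a concurrent system, writers would also have to acquire `source.lock` for the final round to have any effect. The sketch runs in one thread and only shows where the lock would sit relative to the threshold check.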

Assignees

Futurewei Technologies Inc

Inventors

Classifications

  • Sequence data queries, e.g. querying versioned data · CPC title

  • Tablespace storage structures; Management thereof · CPC title

  • Data partitioning, e.g. horizontal or vertical partitioning · CPC title

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Clustering or classification · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11151111B2 cover?
A computer-implemented method of relocating data in a distributed database comprises: creating, by one or more processors, a second table in the distributed database, the second table including all columns from a first table; copying, by the one or more processors, a first set of tuples from the first table to the second table; modifying, by the one or more processors, during the copying of the…
Who is the assignee on this patent?
Futurewei Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2282. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 19, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).