Managing a distributed database across a plurality of clusters

US9607071B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9607071-B2
Application number  US-201414200611-A
Country             US
Kind code           B2
Filing date         Mar 7, 2014
Priority date       Mar 7, 2014
Publication date    Mar 28, 2017
Grant date          Mar 28, 2017

Abstract

Official abstract text for this publication.

A multi-cluster database management system is disclosed that distributes and manages data across a multi-cluster database through the use of cluster partitions. The multi-cluster database management system assigns cluster partitions to clusters of the multi-cluster database. The multi-cluster database management system can evenly or substantially evenly divide the cluster partitions and associated data among the clusters of the multi-cluster database. The multi-cluster database management system can scale in or out by adding or removing clusters from the multi-cluster database when needed or desired. Once a cluster is added or removed, the multi-cluster database management system re-balances the cluster partitions and the associated data across the clusters of the modified multi-cluster database.
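
The even division and re-balancing described in the abstract can be illustrated with a minimal Python sketch. Nothing in it is taken from the patent text: the function names `assign_partitions` and `rebalance`, the round-robin initial placement, and the policy of moving as few partitions as possible are assumptions made for illustration only.

```python
from collections import defaultdict


def assign_partitions(num_partitions, clusters):
    """Initial placement (assumed round-robin): partition p -> clusters[p % n]."""
    return {p: clusters[p % len(clusters)] for p in range(num_partitions)}


def rebalance(assignment, clusters):
    """Re-distribute partitions over `clusters`, keeping partitions in place
    where possible. Returns (new_assignment, moved_partitions)."""
    clusters = list(clusters)
    base, extra = divmod(len(assignment), len(clusters))
    # Target sizes differ by at most one ("evenly or substantially evenly").
    target = {c: base + (1 if i < extra else 0) for i, c in enumerate(clusters)}

    owned = defaultdict(list)  # partitions each surviving cluster keeps
    orphans = []               # partitions that must move
    for p in sorted(assignment):
        c = assignment[p]
        if c in target and len(owned[c]) < target[c]:
            owned[c].append(p)
        else:
            orphans.append(p)  # cluster was removed or is over its target

    new_assignment = {p: c for c, parts in owned.items() for p in parts}
    for c in clusters:
        while len(owned[c]) < target[c]:
            p = orphans.pop()
            owned[c].append(p)
            new_assignment[p] = c

    moved = sorted(p for p in assignment if assignment[p] != new_assignment[p])
    return new_assignment, moved


if __name__ == "__main__":
    before = assign_partitions(12, ["A", "B", "C"])
    after, moved = rebalance(before, ["A", "B", "C", "D"])  # scale out
    print("partitions moved to the new cluster:", moved)
```

In this sketch only the partitions in `moved` would require data to be copied between clusters; every other partition, and the data mapped to it, stays where it was.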

First claim

Opening claim text (preview).

I claim:

1. A method of managing a multi-cluster database comprising: distributing a plurality of cluster partitions among a plurality of database clusters, each database cluster including a plurality of database nodes; allocating a node key space among the plurality of database nodes of each database cluster of the plurality of database clusters; storing data in the database nodes of the corresponding database clusters by identifying an assigned database cluster of the plurality of database clusters based on the plurality of cluster partitions and identifying an assigned node of the assigned database cluster based on the node key space; modifying the plurality of database clusters by adding or removing one or more database clusters; re-distributing the plurality of cluster partitions among the modified plurality of database clusters while maintaining the allocation of the node key space among the plurality of database nodes of each database cluster of the plurality of database clusters; and moving, by at least one processor, at least a portion of the data between the modified plurality of database clusters based on the re-distribution of the plurality of cluster partitions.

2. The method as recited in claim 1, further comprising assigning key identifiers of a cluster key space to the cluster partitions of the plurality of cluster partitions.

3. The method as recited in claim 2, further comprising mapping data to cluster partitions of the plurality of cluster partitions and corresponding database clusters by performing steps comprising: identifying an identifier associated with a piece of data; determining a key identifier of the cluster key space for the piece of data based on the identifier; identifying a cluster partition of the plurality of cluster partitions to which the key identifier of the cluster key space is assigned; and identifying a database cluster of the plurality of database clusters to which the identified cluster partition is assigned.

4. The method as recited in claim 3, further comprising assigning key identifiers of the node key space to the plurality of database nodes of each database cluster.

5. The method as recited in claim 4, further comprising mapping the piece of data to a database node of the identified database cluster, wherein mapping the piece of data to the database node of the identified database cluster comprises: determining a key identifier of the node key space for the piece of data based on the identifier; and identifying the database node of the identified database cluster to which the key identifier of the node key space is assigned.

6. The method as recited in claim 5, wherein: determining the key identifier of the cluster key space for the piece of data based on the identifier comprises performing a hash on the identifier and calculating the key identifier of the cluster key space using a first set of bits of the hash; and determining the key identifier of the node key space for the piece of data based on the identifier comprises calculating the key identifier of the node key space using a second set of bits of the hash.

7. The method as recited in claim 5, wherein re-distributing the plurality of cluster partitions among the modified plurality of database clusters comprises re-assigning the identified cluster partition from the identified database cluster to another database cluster of the plurality of database clusters.

8. The method as recited in claim 7, wherein moving at least a portion of the data between the modified plurality of database clusters based on the re-distribution of the plurality of cluster partitions comprises moving the piece of data from the identified database node of the identified database cluster to another database node of the another database cluster.

9. The method as recited in claim 4, wherein assigning key identifiers of the node key space to the plurality of database nodes of each database cluster comprises indirectly assigning the key identifiers of the node key space to the plurality of database nodes.

10. A system comprising: at least one processor; and at least one non-transitory computer readable storage medium storing instructions thereon that, when executed by the at least one processor, cause the system to: distribute a plurality of cluster partitions among a plurality of database clusters, each database cluster including a plurality of database nodes; allocate a node key space among the plurality of database nodes of each database cluster of the plurality of database clusters; store data in the database nodes of the corresponding database clusters by identifying an assigned database cluster of the plurality of database clusters based on the plurality of cluster partitions and identifying an assigned node of the assigned database cluster based on the node key space; modify the plurality of database clusters by adding or removing one or more database clusters; re-distribute the plurality of cluster partitions among the modified plurality of database clusters while maintaining the allocation of the node key space among the plurality of database nodes of each database cluster of the plurality of database clusters; and balance the data across the modified plurality of database clusters based on the re-distribution of the plurality of cluster partitions.

11. The system as recited in claim 10, wherein the instructions, when executed by the at least one processor, cause the system to balance the data across the modified plurality of database clusters by moving at least a portion of the data between the database clusters of the modified plurality of database clusters based on the re-distribution of the plurality of cluster partitions.

12. The system as recited in claim 10, wherein the instructions, when executed by the at least one processor, cause the system to map the data to cluster partitions of the plurality of cluster partitions and corresponding database clusters by: identifying an identifier associated with a piece of data; determining a key identifier of a cluster key space for the piece of data based on the identifier; identifying a cluster partition of the plurality of cluster partitions to which the key identifier of the cluster key space is assigned; identifying a database cluster of the plurality of database clusters to which the identified cluster partition is assigned; determining a key identifier of the node key space for the piece of data based on the identifier; and identifying a database node of the identified database cluster to which the key identifier of the node key space is assigned.

13. The system as recited in claim 12, wherein the instructions, when executed by the at least one processor, further cause the system to: determine the key identifier of the cluster key space for the piece of data based on the identifier by performing a hash on the identifier and calculating the key identifier of the cluster key space using a first set of bits of the hash; and determine the key identifier of the node key space for the piece of data based on the identifier by calculating the key identifier of the node key space using a second set of bits of the hash.

14. A non-transitory computer readable medium storing instructions thereon that, when executed by at least one processor, cause a computer system to: distribute a plurality of cluster partitions among a plurality of database clusters, each database cluster including a plurality of database nodes; allocate a node key space among the plurality of database nodes of each database cluster of the plurality of database clusters; store data in the databa…
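
The two-level lookup recited in claims 3 to 6 and 12 to 13 (one hash, with one set of bits selecting the cluster partition and another set selecting the node) might look roughly like the Python sketch below. The claims do not name a hash function, bit widths, or how key identifiers are assigned to partitions and nodes, so SHA-256, the constants `CLUSTER_BITS` and `NODE_BITS`, the modulo mapping from cluster key to partition, the equal-range split of the node key space, and the names `keys_for` and `locate` are all hypothetical choices made for illustration.

```python
import hashlib

CLUSTER_BITS = 8   # assumed width of the cluster key space (256 keys)
NODE_BITS = 16     # assumed width of the node key space (65536 keys)


def keys_for(identifier):
    """Hash the identifier once and derive both keys from the same hash."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")        # use 64 bits of the hash
    cluster_key = h >> (64 - CLUSTER_BITS)       # first set of bits (top bits)
    node_key = h & ((1 << NODE_BITS) - 1)        # second set of bits (low bits)
    return cluster_key, node_key


def locate(identifier, num_partitions, partition_to_cluster, nodes_per_cluster):
    """Return (partition, cluster, node) for a piece of data."""
    cluster_key, node_key = keys_for(identifier)
    # Assumed assignment of cluster keys to partitions: simple modulo.
    partition = cluster_key % num_partitions
    cluster = partition_to_cluster[partition]
    nodes = nodes_per_cluster[cluster]
    # Assumed allocation of the node key space: equal contiguous ranges.
    node = nodes[node_key * len(nodes) // (1 << NODE_BITS)]
    return partition, cluster, node


if __name__ == "__main__":
    partition_to_cluster = {p: ("c1" if p % 2 == 0 else "c2") for p in range(16)}
    nodes_per_cluster = {
        "c1": ["c1-n0", "c1-n1", "c1-n2"],
        "c2": ["c2-n0", "c2-n1", "c2-n2"],
    }
    print(locate("user:42", 16, partition_to_cluster, nodes_per_cluster))
```

Because the node key is derived from bits that do not depend on which cluster owns a partition, re-assigning a partition to another cluster changes only the cluster part of the lookup. This matches the claims' requirement that the node key space allocation be maintained during re-distribution.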

Assignees

Adobe Systems Inc

Classifications

  • Search customisation based on user profiles and personalisation · CPC title

  • Hash tables · CPC title

  • using information identifiers, e.g. uniform resource locators [URL] · CPC title

  • Clustering; Classification · CPC title

  • G06F16/278 · Primary

    Data partitioning, e.g. horizontal or vertical partitioning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Adobe Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/278. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 28, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).