Data placement for a distributed database

US11061871B2 · US · B2

Patent metadata
Field                Value
Publication number   US-11061871-B2
Application number   US-201916355112-A
Country              US
Kind code            B2
Filing date          Mar 15, 2019
Priority date        Mar 15, 2019
Publication date     Jul 13, 2021
Grant date           Jul 13, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to an aspect, a method for data placement in a distributed database includes obtaining access pattern information relating to end clients that requested access to data stored in a first regional quorum of replicas located within a first region, where the first regional quorum includes a first lead replica. The method includes identifying a placement algorithm from a configuration file associated with the distributed database, and executing the placement algorithm to generate a suggested placement for the data based on the obtained access pattern information, where the suggested placement includes a second regional quorum of replicas located in a second region different than the first region, and the second regional quorum includes a second lead replica. The method includes transmitting a migration request to the distributed database to transfer the data from the first regional quorum to the second regional quorum.
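
The flow described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the names `Access`, `CONFIG`, `ALGORITHMS`, `most_frequent_region`, and `suggest_placement`, the layout of the configuration, and the "most accesses wins" rule are all hypothetical choices made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """One end-client access to a row (hypothetical record shape)."""
    row_key: str
    client_region: str


def most_frequent_region(accesses, current_region):
    """Hypothetical placement algorithm: pick the region with the most accesses."""
    if not accesses:
        return current_region
    counts = Counter(a.client_region for a in accesses)
    return counts.most_common(1)[0][0]


# Hypothetical stand-in for the configuration file: it names the placement
# algorithm and lists each regional quorum (lead replica first, by assumption).
CONFIG = {
    "placement_algorithm": "most_frequent_region",
    "quorums": {
        "us": ["us-dc1", "us-dc2", "us-dc3"],
        "eu": ["eu-dc1", "eu-dc2", "eu-dc3"],
    },
}

# Registry that resolves the algorithm name found in the configuration.
ALGORITHMS = {"most_frequent_region": most_frequent_region}


def suggest_placement(row_key, current_region, accesses, config=CONFIG):
    """Return a migration request dict, or None if the row should stay put."""
    algorithm = ALGORITHMS[config["placement_algorithm"]]
    row_accesses = [a for a in accesses if a.row_key == row_key]
    target = algorithm(row_accesses, current_region)
    if target == current_region or target not in config["quorums"]:
        return None
    replicas = config["quorums"][target]
    return {
        "row_key": row_key,
        "from_region": current_region,
        "to_region": target,
        "lead_replica": replicas[0],
        "replicas": replicas,
    }
```

In this sketch, a row stored in the `us` quorum that is read mostly by clients in `eu` yields a request to move it to the `eu` quorum; a row with no accesses, or one already in its busiest region, yields `None`.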

First claim

Opening claim text (preview).

What is claimed is:

1. A method for data placement in a distributed database, the method comprising: generating access pattern information based on accesses received at the distributed database from one or more end clients that requested access to a first row of data stored in a first regional quorum of replicas located within a first geographical region, the first regional quorum of replicas including a first lead replica; selecting a placement algorithm from a configuration file associated with the distributed database, the configuration file defining a data center mapping that maps data centers to geographical regions; updating the data center mapping based on network weights, the network weights being determined based on, at least in part, times associated with signals being transmitted from one or more data centers to other data centers identified in a respective geographical region; executing the placement algorithm to generate a suggested placement for the first row of data using the access pattern information, the suggested placement identifying second regional quorum of replicas located in a second geographical region different than the first geographical region, the second regional quorum of replicas including a second lead replica; and transmitting a migration request to the distributed database to transfer the first row of data from the first regional quorum of replicas to the second regional quorum of replicas.

2. The method of claim 1, wherein the configuration file defines a plurality of database digests, each of the database digests defining a separate grouping of data, the configuration file defining, for each database digest, one or more regional quorums of replicas that are configured to store data associated with a respective database digest.

3. The method of claim 1, wherein the configuration file identifies one or more first algorithm parameters associated with a first database digest, and one or more second algorithm parameters associated with a second database digest, wherein, in response to the first row of data being associated with the first database digest, the placement algorithm is executed using the one or more first algorithm parameters.

4. The method of claim 3, further comprising: determining, in an offline analysis, the one or more first algorithm parameters associated with the first database digest based on historical access to data associated with the first database digest; and storing the one or more first algorithm parameters for the first database digest in the configuration file.

5. The method of claim 1, wherein the placement algorithm is configured to determine a number of accesses that are received at the first regional quorum of replicas within a first period of time, and identify the second regional quorum of replicas as the suggested placement in response to the number of accesses being greater than a threshold value and determining that the data has not been migrated before a second period of time.

6. The method of claim 5, wherein the number of accesses, the first period of time, and the second period of time are stored as algorithm parameters for the placement algorithm in the configuration file.

7. The method of claim 5, wherein the number of accesses, the first period of time, and the second period of time are values that are optimized using machine-learning resources in an offline analysis.

8. The method of claim 1, wherein the placement algorithm is configured to identify the second regional quorum of replicas based on location information of the one or more end clients, the location information identifying one or more locations within the second geographical region, the location information including at least one of global positioning system (GPS) information or Internet protocol (IP) address information.

9. The method of claim 1, further comprising: determining that the first row of data is associated with a first database digest defined in the configuration file, the configuration file identifying a first placement algorithm associated with the first database digest and a second placement algorithm associated with a second database digest, wherein the first placement algorithm is selected from the configuration file in response to the first row of data being determined as associated with the first database digest.

10. An apparatus for data placement comprising: a distributed database having a first regional quorum of replicas with a first lead replica located within a first geographical region, and a second regional quorum of replicas with a second lead replica located within a second geographical region, the distributed database storing a first row of data at the first regional quorum of replicas; and at least one processor configured to implement: a data placement system configured to communicate with the distributed database, the data placement system configured to generate access pattern information based on accesses received at the distributed database from one or more end clients that requested access to the first row of data, the data placement system including: a configuration file defining one or more database digests, the configuration file identifying a placement algorithm for each database digest, each database digest defining a separate grouping of data, the configuration file defining a data center mapping that maps data centers to geographical regions; a place detector configured to cause the at least one processor to update the data center mapping based on network weights, the network weights being determined based on, at least in part, times associated with signals being transmitted from one or more data centers to other data centers identified in a respective geographical region; a place assigner configured to cause the at least one processor to select the placement algorithm for the first row of data from the configuration file and execute the placement algorithm to identify a suggested placement using the access pattern information, the suggested placement identifying the second regional quorum of replicas, and a place migrator configured to cause the at least one processor to transmit a migration request to the distributed database to transfer the first row of data from the first regional quorum of replicas to the second regional quorum of replicas.

11. The apparatus of claim 10, wherein the configuration file defines, for each database digest, one or more regional quorums of replicas that are configured to store data for a respective database digest, each regional quorum of replicas being identified by one or more data centers that include the replicas of a respective regional quorum of replicas.

12. The apparatus of claim 11, wherein the second geographical region is located on a different continent than the first geographical region.

13. The apparatus of claim 11, wherein the at least one processor is further configured to implement: an auxiliary file system configured to store historical accesses to data of a database digest; and a placement algorithm analyzer configured to execute, in an offline analysis, one or more simulations on the historical accesses to determine one or more algorithm parameters of the placement algorithm.

14. The apparatus of claim 11, wherein the network weights include network round-trip time (RTT).

15. The apparatus of claim 11, wherein the placement algorithm is configured to cause the at least one processor to determine a number of accesses that are received at the second quorum of replicas within a first period of time, and identify the second regional quorum of replicas as the sugges
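
The threshold rule in claim 5, together with the per-digest parameters in claims 3 and 6, could look roughly like the Python sketch below. It rests on several assumptions that the claim text does not settle: the "second period of time" is read here as a cooldown since the last migration, the timestamps are plain numbers in seconds, and `ThresholdParams`, `should_migrate`, `PARAMS_BY_DIGEST`, and the digest names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThresholdParams:
    """Hypothetical algorithm parameters, as might be stored per database digest."""
    min_accesses: int        # threshold on the access count
    window_seconds: float    # "first period of time": the counting window
    cooldown_seconds: float  # "second period of time": assumed minimum gap between migrations


def should_migrate(access_times, last_migration_time, now, params):
    """Return True when the number of accesses inside the window is greater
    than the threshold and the row was not migrated within the cooldown."""
    window_start = now - params.window_seconds
    recent = sum(1 for t in access_times if window_start <= t <= now)
    if recent <= params.min_accesses:
        return False
    if last_migration_time is None:
        # The row has never been migrated, so no cooldown applies.
        return True
    return now - last_migration_time >= params.cooldown_seconds


# Hypothetical per-digest parameters; the digest names and values are made up.
PARAMS_BY_DIGEST = {
    "user_profiles": ThresholdParams(min_accesses=3, window_seconds=60, cooldown_seconds=3600),
    "audit_logs": ThresholdParams(min_accesses=100, window_seconds=600, cooldown_seconds=86400),
}
```

The cooldown is what keeps a row from bouncing between regions when two client populations alternate; claim 7 suggests that such values would be tuned offline rather than hard-coded as they are here.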

Assignees

Inventors

Classifications

  • Database migration support · CPC title

  • G06F16/27 · Primary

    Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • in relation to response time · CPC title

  • Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title

  • Data partitioning, e.g. horizontal or vertical partitioning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11061871B2 cover?
According to an aspect, a method for data placement in a distributed database includes obtaining access pattern information relating to end clients that requested access to data stored in a first regional quorum of replicas located within a first region, where the first regional quorum includes a first lead replica. The method includes identifying a placement algorithm from a configuration file…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/27. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 13, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).