Replication lag-constrained deletion of data in a large-scale distributed data storage system

US2018336237A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2018336237-A1
Application number  US-201815971792-A
Country             US
Kind code           A1
Filing date         May 4, 2018
Priority date       May 22, 2017
Publication date    Nov 22, 2018
Grant date

Abstract

Official abstract text for this publication.

Computer-implemented techniques for replication-lag constrained deletion of data in a distributed data storage system. In some aspects, the techniques improve the operation of a computing system by preventing too high of a delete rate that causes severe replication lag while at the same time increasing and decreasing the delete rate over time to a maximum allowable delete rate constrained by measured replication lag in terms of both local replication lag and geographic replication lag. In one implementation, the delete rate is adjusted by increasing or decreasing a pause interval that determines how long a database data deletion process pauses between submitting database deletion commands to a database server.
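The pause-interval mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `LagThresholds` and `adjust_pause`, the additive step, the upper bound, and the choice to slow down when either lag metric exceeds its threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LagThresholds:
    """Hypothetical thresholds, in seconds, for the two lag metrics."""
    local: float = 1.0
    geo: float = 5.0


def adjust_pause(pause: float, local_lag: float, geo_lag: float,
                 thresholds: LagThresholds, step: float = 0.25,
                 max_pause: float = 10.0) -> float:
    """Return the next pause interval (seconds) between delete commands.

    A longer pause means a lower delete rate. The additive step and the
    clamping bounds are assumptions, not values taken from the patent.
    """
    if local_lag > thresholds.local or geo_lag > thresholds.geo:
        # At least one replica is falling behind: slow the deleter down.
        return min(pause + step, max_pause)
    # Both lag metrics are within bounds: speed up, never below zero.
    return max(pause - step, 0.0)
```

Called once per measurement cycle, a function like this lets the delete rate drift up while both replicas keep pace and back off as soon as either local or geographic lag crosses its threshold.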

First claim

Opening claim text (preview).

1. A system, comprising: one or more processors; a memory; and one or more computer programs stored in the memory for execution by the one or more processors, the one or more computer programs comprising instructions configured to cause the system to perform operations comprising: determining a first replication lag metric based on a first replication process involving a first database and a second database, wherein determining the first replication lag metric is based on a measured time to replicate database data from the second database to the first database; determining a second replication lag metric based on a second replication process involving a third database and the second database, wherein determining the second replication lag metric is based on a measured time to replicate database data from the second database to the third database; based at least in part on comparing the first replication lag metric to a first replication lag threshold and comparing the second replication lag metric to a second replication lag threshold, causing a database server to delete data from the second database at an adjusted delete rate.

2. The system of claim 1, wherein the instructions are further configured for: based at least in part on comparing the first replication lag metric to the first replication lag threshold and comparing the second replication lag metric to the second replication lag threshold, adjusting a pause interval resulting in an adjusted pause interval; serially submitting a plurality of commands to the database server using the adjusted pause interval.

3. The system of claim 2, wherein the instructions are further configured for: based at least in part on a determining that both: (a) the first replication lag metric is above a respective threshold and (b) the second replication lag metric is below a respective threshold, determining the adjusted pause interval by increasing the pause interval.

4. The system of claim 1, wherein the instructions are further configured for: based at least in part on determining that both: (a) the second replication lag metric is below a respective threshold and (b) the first replication lag metric is below a respective threshold, serially submitting at least some commands of a plurality of commands to the database server to delete data from the second database without sleeping for a pause interval after a submission of a command of the plurality of commands.

5. The system of claim 2, wherein a command of the plurality of commands is a Structured Query Language (SQL) delete command.

6. The system of claim 2, wherein the instructions are further configured for determining the adjusted pause interval by decreasing the pause interval.

7. The system of claim 2, wherein the instructions are further configured for determining the adjusted pause interval by increasing the pause interval based, at least in part, on the determining that both: (a) the second replication lag metric is above the second replication lag threshold and (b) the first replication lag metric is above the first replication lag threshold.

8. The system of claim 2, wherein the instructions are further configured for selecting the pause interval as a maximum of the second replication lag metric and the first replication lag metric.

9. The system of claim 2, wherein the serially submitting the plurality of commands to the database server to delete data from the second database is based, at least in part, on not sleeping for a pause interval after a submission of a command of the plurality of commands.

10. A method performed by a computing system comprising one or more processors and a memory, the method comprising: determining a replication lag metric based on a replication process involving a first database and a second database located at a geographic distance from the first database, wherein determining the replication lag metric is based on a measured time to replicate database data from the second database to the first database; based at least in part on comparing the replication lag metric to a replication lag threshold, causing a database server to delete data from the second database at an adjusted delete rate.

11. The method of claim 10, further comprising: adjusting a pause interval resulting in an adjusted pause interval; serially submitting a plurality of commands to a database server using the adjusted pause interval to cause the database server to delete data from the second database at the adjusted delete rate.

12. The method of claim 11, wherein the adjusting the pause interval comprises increasing the pause interval based, at least in part, on both: (a) determining the replication lag metric is above a respective threshold and (b) determining that a second replication lag metric is above a respective threshold.

13. The method of claim 11, further comprising: based at least in part on determining that both: (a) the replication lag metric is below a respective threshold and (b) a second replication lag threshold is below a respective threshold, determining the adjusted pause interval by decreasing the pause interval.

14. The method of claim 10, further comprising: after a database record associated with a timestamp is replicated from the second database to the first database, reading the database record including the timestamp from the first database; based at least in part on the timestamp of the database record read from the first database, determining the replication lag metric.

15. The method of claim 10, wherein the replication lag metric measures replication lag between two databases located in a same data center.

16. The method of claim 11, wherein a command of the plurality of commands is a Structured Query Language (SQL) delete command.

17. The method of claim 11, wherein a command of the plurality of commands is executed against a database in context of a different database transaction.

18. One or more non-transitory computer-readable media storing one or more programs, the one or more programs for execution by a computing system comprising one or more processors and a memory, the one or more programs comprising instructions to cause the computing system to perform operations comprising: determining a first replication lag metric based on a first replication process involving a first database and a second database, wherein determining the first replication lag metric is based on a measured time to replicate database data from the second database to the first database; determining a second replication lag metric based on a second replication process involving a third database and the second database, wherein determining the second replication lag metric is based on a measured time to replicate database data from the second database to the third database; based at least in part on comparing the first replication lag metric to a first replication lag threshold and comparing the second replication lag metric to a second replication lag threshold, causing a database server to delete data from the second database at an adjusted delete rate.

19. The one or more non-transitory computer-readable media of claim 18, wherein the instructions are to cause the computing system to perform operations comprising: based at least in part on comparing the first replication lag metric to a first replication lag threshold and comparing the second replication lag metric to a second replication lag threshold, adjusting a pause interval resulting in an adjust
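As a rough illustration of how claims 4, 8, and 14 might fit together, the Python sketch below derives a lag metric from a replicated record's timestamp and then submits delete commands serially, sleeping for the larger of the two lag metrics only when a threshold is exceeded. The function names, the callback parameters (`submit`, `read_lags`, `sleep`), and the decision to combine these particular claims in one loop are assumptions for illustration; the claims do not prescribe this arrangement.

```python
import time
from typing import Callable, Iterable, Tuple


def lag_from_timestamp(record_timestamp: float, now: float) -> float:
    """Estimate replication lag (seconds) from a replicated record.

    record_timestamp is the time stored in the record when it was written
    at the source database; now is the time it was read from the replica.
    """
    return max(now - record_timestamp, 0.0)


def run_deletes(commands: Iterable[str],
                submit: Callable[[str], None],
                read_lags: Callable[[], Tuple[float, float]],
                first_threshold: float,
                second_threshold: float,
                sleep: Callable[[float], None] = time.sleep) -> None:
    """Submit delete commands one at a time, pausing only when needed.

    submit and read_lags are hypothetical callbacks supplied by the caller:
    submit sends one command to the database server, and read_lags returns
    the current (first, second) replication lag metrics in seconds.
    """
    for command in commands:
        submit(command)
        first_lag, second_lag = read_lags()
        if first_lag < first_threshold and second_lag < second_threshold:
            # Both replicas are keeping up: continue without sleeping.
            continue
        # Otherwise pause for the larger of the two lag metrics.
        sleep(max(first_lag, second_lag))
```

Injecting `sleep` as a parameter keeps the loop testable without real delays; in practice `submit` would wrap whatever client call issues the SQL delete command.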

Assignees

Dropbox Inc

Inventors

Classifications

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Ensuring data consistency and integrity · CPC title

  • Physics · mapped topic

  • Concurrency control (transaction processing G06F9/466) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018336237A1 cover?
Computer-implemented techniques for replication-lag constrained deletion of data in a distributed data storage system. In some aspects, the techniques improve the operation of a computing system by preventing too high of a delete rate that causes severe replication lag while at the same time increasing and decreasing the delete rate over time to a maximum allowable delete rate constrained by me…
Who is the assignee on this patent?
Dropbox Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2365. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 22, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).