Determining Data Replication Cost for Cloud Based Application
US-2017147672-A1 · May 25, 2017 · US
US11681677B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11681677-B2 |
| Application number | US-202016803923-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 27, 2020 |
| Priority date | Feb 27, 2020 |
| Publication date | Jun 20, 2023 |
| Grant date | Jun 20, 2023 |
Official abstract text for this publication.
A geographically diverse data storage system that can protect data via replication of data among relevant zones according to a determined replication topology is disclosed. The replication topology can be determined based on replication times between the relevant zones. In an aspect, a tree topology can provide advantages over a star topology. In an embodiment, a tree topology can be generated, or an existing topology can be modified, via selection of a next replication task(s) based on the replication times. In an aspect, the replication times can be determined from measurable characteristics of the geographically diverse data storage system. In some embodiments, the replication times can be based on historical measurements, time limited historical measurements, inferences from machine learning, etc. A determined topology can be ranked relative to other viable topologies based on criteria such as speed, monetary cost, computing resource usage, etc. Accordingly, a selected topology, or selected modification to a topology, can provide for improved replication that can provide protection for data stored in the geographically diverse data storage system.
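The abstract's "selection of a next replication task(s) based on the replication times" can be read as a greedy, Prim-style growth of a tree set. The following is a minimal sketch of that reading, not the patented implementation. The function name `build_replication_tree`, the `source` parameter (the zone assumed to hold the chunk initially), the symmetric `replication_time` map, and the sample zone names and times are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# (zone already in the tree set, zone being added)
Edge = Tuple[str, str]


def build_replication_tree(
    zones: List[str],
    replication_time: Dict[FrozenSet[str], float],
    source: str,
) -> List[Edge]:
    """Greedily grow a tree set from `source`, always taking the fastest next replication.

    Assumption: replication times are symmetric and keyed by an unordered zone pair.
    """
    tree_set = [source]
    depth = {source: 0}
    operations: List[Edge] = []
    remaining = [z for z in zones if z != source]

    while remaining:
        # Candidate operations go from any zone in the tree set to any zone outside it.
        candidates = [
            (replication_time[frozenset((src, dst))], depth[src], src, dst)
            for src in tree_set
            for dst in remaining
            if frozenset((src, dst)) in replication_time
        ]
        if not candidates:
            raise ValueError("some zones are unreachable with the given replication times")
        # Lowest time wins; on a tie, the shallower parent wins, which keeps the tree shorter.
        _, parent_depth, src, dst = min(candidates)
        operations.append((src, dst))
        tree_set.append(dst)
        depth[dst] = parent_depth + 1
        remaining.remove(dst)
    return operations


# Hypothetical replication times in seconds for three zones.
times = {
    frozenset(("A", "B")): 10.0,
    frozenset(("A", "C")): 50.0,
    frozenset(("B", "C")): 15.0,
}
plan = build_replication_tree(["A", "B", "C"], times, source="A")
```

With these sample times the plan replicates A to B and then B to C, rather than the star layout in which A sends to both zones directly and waits on the slow A-to-C link.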
Opening claim text (preview).
What is claimed is:
1. A system, comprising: a processor; and a memory that stores executable instructions that, when executed by the processor, facilitate performance of operations, comprising: receiving an indication of replication times between pairs of zones comprised in a geographically diverse data storage system comprising a first zone, a second zone, and a third zone; determining a first replication operation between the first zone and the second zone based on a first value of the replication times and adding the first zone and the second zone to a tree set; determining a second replication operation between a zone of the tree set and the third zone based on a second value of the replication times and adding the third zone to the tree set; selecting a preferred replication topology based on a ranking of the first replication operation and the second replication operation among other replication operations determined for the pairs of zones from the replication times, wherein the ranking is based on a monetary cost of replication, a speed of replication, a reliability of replication, and satisfaction of a customer requirement for replication of a data chunk between the pairs of zones, wherein the data chunk comprises data stored in an append-only format according to an order in which the data was received by the geographically diverse data storage system, and wherein the data chunk is sealed prior to replication, causing the data chunk to be immutable; and replicating the data chunk among the pairs of zones according to the preferred replication topology, resulting in a replicated data chunk.
2. The system of claim 1, wherein the first zone is located remotely from the second zone, and wherein the first zone is located remotely from the third zone.
3. The system of claim 1, wherein the second zone is located remotely from the third zone.
4. The system of claim 1, wherein the ranking is based, at least in part, on determining the first value of the replication times is lower than another value of the replication times.
5. The system of claim 1, wherein the ranking is based, at least in part, on determining the first value of the replication times is the same as another value of the replication times, and is further based, at least in part, on determining that employing a zone corresponding to the first value results in shorter tree topology than employing another zone corresponding to the other value of the replication times.
6. The system of claim 1, wherein the ranking is in response to a determining that a characteristic of the geographically diverse data storage system has transitioned a threshold value.
7. The system of claim 6, wherein the threshold value is a replication time value of the replication times.
8. The system of claim 6, wherein the threshold value is an amount of change in a replication time value of the replication times.
9. The system of claim 1, wherein the operations further comprise iteratively determining another replication operation of the other replication operations, and wherein the other replication operation is between a zone of the tree set and another zone of the geographically diverse data storage system based on another value of the replication times and adding the other zone to the tree set.
10. The system of claim 1, wherein the ranking of the replication operations excludes unavailable topology schemes.
11. The system of claim 1, wherein the replicating the data chunk according to the preferred replication topology results in generating a protection set via replication of data chunks comprising the data chunk among zones comprised in the tree set.
12. The system of claim 9, wherein the iteratively determining another replication operation between a zone of the tree set and another zone results in a third replication operation that occurs in parallel with the second replication operation.
13. A method, comprising: performing, by a system comprising a processor, a first iteration of operations comprising: in response to receiving, by the system, an indication of replication times between a pair of zones comprised in a geographically diverse data storage system comprising a first zone, a second zone, and a third zone, determining a first replication operation between the first zone and the second zone based on a first value of the replication times and adding the first zone and the second zone to a tree set; determining, by the system, a second replication operation between a zone of the tree set and the third zone based on a second value of the replication times and adding the third zone to the tree set; selecting, by the system, a preferred replication topology based on ranking viable replication topologies, wherein the ranking the viable replication topologies is based, at least in part, on the replication times, a monetary cost of replication, a reliability of replication, and satisfaction of a customer requirement for replicating a data chunk between pairs of zones, and wherein the preferred replication topology comprises the first replication operation and the second replication operation; and initiating, by the system, a replication of the data chunk in accord with the preferred replication topology, wherein the data chunk comprises data stored in an append-only format according to an order in which the data was received by the geographically diverse data storage system, and wherein the data chunk becomes immutable by sealing the data chunk prior to performing the replication.
14. The method of claim 13, wherein the operations further comprise: in response to determining, by the system, that there is a relevant zone of the geographically diverse data storage system to be added to the tree set, iteratively determining at least another replication operation between a zone of the tree set and at least another zone of the geographically diverse data storage system based on at least another value of the replication times and adding at least the other zone to the tree set, resulting in in topology scheme of the viable topology schemes.
15. The method of claim 14, wherein the iteratively determining at least the other replication operation results in a third replication operation that occurs in parallel with the second replication operation.
16. The method of claim 13, wherein the determining the first replication operation results in the first replication operation being between remotely located zones.
17. A non-transitory machine-readable storage medium, comprising executable instructions that, when executed by a processor, facilitate performance of operations, comprising: determining that an indication of replication times between a pair of zones comprised in a geographically diverse data storage system comprising a first zone, a second zone, and a third zone, satisfies a rule related to a threshold value; determining a first replication operation between the first zone and the second zone based on a first value of the replication times and adding the first zone and the second zone to a tree set; determining a second replication operation between a zone of the tree set and the third zone based on a second value of the replication times and adding the third zone to the tree set; ranking viable replication topologies, wherein the ranking the viable replication topologies is based, at least in part, on the replication times, a monetary cost of replications, a reliability of replications, and satisfaction of a customer requirement for replicating a data chunk between pairs of zones, and wherein a selected replication
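The claims rank viable topologies on replication time, monetary cost, reliability, and satisfaction of a customer requirement, and claim 10 excludes unavailable schemes. The sketch below illustrates one possible way to combine those criteria; the claims do not specify a scoring formula. The `TopologyScore` fields, the weights, the normalization, the treatment of the customer requirement as a hard filter, and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TopologyScore:
    name: str
    completion_time_s: float          # speed of replication, lower is better
    monetary_cost: float              # lower is better
    reliability: float                # in [0, 1], higher is better
    meets_customer_requirement: bool
    available: bool = True


def select_preferred(
    candidates: List[TopologyScore],
    w_time: float = 1.0,
    w_cost: float = 1.0,
    w_reliability: float = 1.0,
) -> Optional[TopologyScore]:
    """Return the viable topology with the lowest weighted penalty, or None if none is viable."""
    # Assumption: unavailable schemes and unmet customer requirements are hard exclusions.
    viable = [c for c in candidates if c.available and c.meets_customer_requirement]
    if not viable:
        return None

    # Normalize time and cost so the weights compare like with like.
    max_time = max(c.completion_time_s for c in viable) or 1.0
    max_cost = max(c.monetary_cost for c in viable) or 1.0

    def penalty(c: TopologyScore) -> float:
        return (
            w_time * c.completion_time_s / max_time
            + w_cost * c.monetary_cost / max_cost
            + w_reliability * (1.0 - c.reliability)
        )

    return min(viable, key=penalty)


# Hypothetical candidates; the numbers are invented for illustration only.
preferred = select_preferred([
    TopologyScore("star", 50.0, 10.0, 0.99, True),
    TopologyScore("tree", 25.0, 12.0, 0.98, True),
    TopologyScore("offline-link", 5.0, 1.0, 1.0, True, available=False),
])
```

With equal weights the tree candidate is selected here: its shorter completion time outweighs its slightly higher cost, and the unavailable candidate is never scored.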
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
Distributed file systems · CPC title
Trees, e.g. B+trees · CPC title
Parallel file systems, i.e. file systems supporting multiple processors · CPC title
Applying rules; Deductive queries · CPC title