US9807164B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9807164-B2 |
| Application number | US-201414341547-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 25, 2014 |
| Priority date | Jul 25, 2014 |
| Publication date | Oct 31, 2017 |
| Grant date | Oct 31, 2017 |
Official abstract text for this publication.
The disclosure is directed to replicating datasets between data storage servers in a distributed computer network synchronously and asynchronously (“the technology”). A replication interface receives a request from a client to store a dataset in the distributed computer network. The replication interface identifies a first set of storage servers that are within a halo defined by the client. The replication interface replicates the dataset to the first set of the storage servers synchronously, and to a remaining set of the storage servers, e.g., storage servers that are outside of the halo, asynchronously. The replication interface can perform the synchronous and asynchronous replication simultaneously. The halo can be determined based on various parameters, including a halo latency, which indicates a permissible latency threshold between the client and a storage server to which the dataset is to be replicated synchronously.
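The halo test described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the `StorageServer` type, the `partition_by_halo` function, the server names, and the latency values are all hypothetical, and the sketch assumes each server's latency to the client has already been measured in milliseconds. Servers at or below the threshold are treated as synchronous targets; the rest are left for asynchronous replication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageServer:
    # Hypothetical record; latency_ms is the measured round-trip
    # latency between the client and this server.
    name: str
    latency_ms: float


def partition_by_halo(servers, halo_latency_ms):
    """Split servers into (inside_halo, outside_halo) by a latency threshold."""
    inside = [s for s in servers if s.latency_ms <= halo_latency_ms]
    outside = [s for s in servers if s.latency_ms > halo_latency_ms]
    return inside, outside


# Illustrative values only.
servers = [
    StorageServer("local-a", 2.0),
    StorageServer("local-b", 4.5),
    StorageServer("remote-c", 80.0),
]
sync_targets, async_targets = partition_by_halo(servers, halo_latency_ms=5.0)
```

With a 5 ms halo latency, the two local servers would receive the dataset synchronously and the remote server asynchronously. The sketch covers only the selection step, not the replication itself.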
Opening claim text (preview).
I claim:

1. A method performed by a computing system, comprising: receiving, at a client computing system in a distributed computer network having a plurality of data storage servers, a request to store a dataset in the distributed computer network, the data storage servers configured to be read-write storage servers; identifying, by the client computing system and based on a halo latency parameter specified in a storage policy, a first set of the data storage servers that are within a halo group defined by the halo latency parameter and a second set of the data storage servers that are outside of the halo group, the halo latency parameter indicating a permissible threshold of a latency between the client computing system and a data storage server of the data storage servers to which the dataset is synchronously replicated, the latency being a time period elapsed between a dispatch of a request from the client computing system to the data storage server and a receipt of the response from the data storage server by the client computing system; and replicating the dataset to: the first set of the data storage servers synchronously, and the second set of the data storage servers asynchronously, the dataset concurrently replicated to the first set and the second set.

2. The method of claim 1, wherein identifying the first set of the storage servers that are within the halo group includes identifying the first set of the storage servers whose corresponding latencies do not exceed the permissible threshold.

3. The method of claim 1, wherein identifying the first set of the storage servers within the halo group includes identifying a first group of the storage servers whose corresponding latencies do not exceed the permissible threshold as being in an “online” state and a second group of the storage servers whose corresponding latencies exceed the permissible threshold as being in an “stand-by” state.

4. The method of claim 3, wherein replicating the dataset to the first set of the storage servers includes replicating the dataset to the first group of the storage servers that are in the “online” state.

5. The method of claim 1, wherein replicating the dataset to the second set of the storage servers includes replicating, by a daemon program, the dataset from the first set of the storage servers to the second set of the storage servers.

6. The method of claim 5, wherein the daemon program is configured to identify a first group of the storage servers that are in a second halo group as the second set of the storage servers, the second halo group including the first group of the storage servers having an infinite latency between a specified storage server of the first set of the storage servers on which the daemon program is executing and the first group of the storage servers.

7. The method of claim 1, wherein identifying the first set of the storage servers within the halo group further includes: determining a number of replicas of the dataset to be stored in the distributed computer network, determining whether the number of replicas exceed a maximum number of replicas to be stored at the first set of the storage servers, responsive to a determination the number of replicas exceed the maximum number of replicas of the first set of the storage servers, identifying a subset of the second set of the storage servers to store a remaining number of replicas that exceed the maximum number of replicas.

8. The method of claim 7 further comprising: replicating the remaining number of replicas of the dataset to the subset of the second set of the storage servers.

9. The method of claim 1, wherein the client is one of a plurality of clients and at least some of the clients have halo parameters with different latency thresholds.

10. The method of claim 1, wherein the distributed computer network is a GlusterFS distributed storage file system.

11. A system, comprising: a processor; a first module configured to store a storage policy used in storing a plurality of datasets at a plurality of storage servers in a distributed computer network, wherein at least some of the storage servers are configured to store a replica of a dataset of the datasets stored in another storage server of the storage servers, the storage policy including a halo parameter, the halo parameter including a set of tags associated with the storage servers, each of the set of tags describing a first attribute associated with the storage servers; a second module that is configured to work in cooperation with the processor to receive a request from a client in the distributed computer network to store a first dataset of the datasets in the distributed computer network; a third module that is configured to work in cooperation with the processor to identify a first set of the storage servers having one or more tags that match with the set of tags in the halo parameter as “online” storage servers and a second set of the storage servers whose one or more tags do not match with the set of tags in the halo parameter as “stand-by” storage servers; and a fourth module that is configured to work in cooperation with the processor to synchronously replicate the first dataset to the “online” storage servers and asynchronously replicate the first dataset to the “stand-by” storage servers, the first dataset concurrently replicated to the “online” storage servers and the “stand-by” storage servers.

12. The system of claim 11, wherein one of the set of tags in the halo parameter is a latency tag, the latency tag describing a permissible latency threshold between the client and a storage server of the storage servers for replicating the first dataset to the storage server synchronously.

13. The system of claim 12, wherein the third module is configured to identify that the first set of the storage servers match with the halo parameter if latencies of each of the first set of the storage servers with the client do not exceed the permissible latency threshold.

14. The system of claim 11, wherein the storage servers are configured to store the datasets in a plurality of bricks, wherein a brick of the bricks is a smallest storage unit of a storage server of the servers, and wherein a group of the bricks from the storage servers form a volume.

15. The system of claim 14, wherein the client is configured to access a subset of the datasets stored at the group of the bricks by mounting the volume on the client, wherein the group of the bricks in the volume contain replicas of the subset of the datasets.

16. The system of claim 14, wherein the third module is further configured to identify the “online” storage servers and the “stand-by” storage servers by identifying a first subset of the group of the bricks in the volume corresponding to the “online” storage servers as “online” bricks and a second subset of the group of the bricks in the volume corresponding to the “stand-by” storage servers as “stand-by” bricks.

17. A non-transitory computer-readable storage medium storing computer-readable instructions, comprising: instructions for receiving, at a first client in a distributed computer network having a plurality of storage servers, a request to store a first dataset in the distributed computer network, the storage servers configured to be read-write storage servers; instructions for identifying, by the first client and based on a storage policy, a first set of the storage servers to which the first dataset is to be replicated synchronously, the identifying based on a halo latency parameter of the storage policy and a number of repli
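The replica-count step recited in claims 7 and 8 can be illustrated with a minimal Python sketch. It is an assumption-laden reading, not the claimed implementation: the function name `plan_placement`, the `(name, latency_ms)` tuples, and the sample values are hypothetical, and the rule for choosing which servers outside the halo take the overflow (lowest latency first) is an assumption, since the claims do not specify one.

```python
def plan_placement(num_replicas, halo_servers, outside_servers, max_halo_replicas):
    """Return (halo_targets, overflow_targets) for one dataset.

    Servers are (name, latency_ms) tuples. Replicas beyond the halo's
    maximum are assigned to servers outside the halo.
    """
    halo_targets = halo_servers[:min(num_replicas, max_halo_replicas)]
    remaining = num_replicas - len(halo_targets)
    # Assumption: overflow goes to the lowest-latency servers outside the halo.
    by_latency = sorted(outside_servers, key=lambda s: s[1])
    overflow_targets = by_latency[:remaining]
    return halo_targets, overflow_targets


# Illustrative values only.
halo = [("local-a", 2.0), ("local-b", 4.5)]
outside = [("remote-c", 80.0), ("remote-d", 35.0), ("remote-e", 120.0)]
halo_targets, overflow_targets = plan_placement(4, halo, outside, max_halo_replicas=2)
```

Here four replicas are requested but the halo may hold at most two, so the remaining two are planned for servers outside the halo.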
based on client or server locations · CPC title
Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes · CPC title