Systems and methods for facilitating management of data
US-9298417-B1 · Mar 29, 2016 · US
US9613046B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9613046-B1 |
| Application number | US-201514967543-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 14, 2015 |
| Priority date | Dec 14, 2015 |
| Publication date | Apr 4, 2017 |
| Grant date | Apr 4, 2017 |
Official abstract text for this publication.
Updating a second cluster server that backs up a first cluster server includes retrieving a first metadata file from a first cluster server. The first metadata file includes a first ordered list of block identifiers for data blocks stored on a first plurality of block servers. The updating also includes retrieving a second metadata file from a second cluster server. The second metadata file includes a second ordered list of block identifiers for data blocks stored on a second plurality of block servers. The updating also includes comparing the first metadata file to the second metadata file to determine a difference list. The difference list includes block identifiers from the first ordered list that differ from block identifiers of the second ordered list. The updating also includes sending, to the first cluster server, a request for data blocks associated with the block identifiers from the difference list.
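The comparison step in the abstract can be illustrated with a minimal sketch. It assumes the two ordered lists are compared position by position, which the abstract does not spell out, and it uses SHA-256 as the content hash purely for illustration (the claims only say each identifier is a hash based on block content). The names `block_id` and `difference_list` are hypothetical.

```python
import hashlib
from typing import List


def block_id(data: bytes) -> str:
    """Content-based identifier. SHA-256 is an assumption, not from the patent."""
    return hashlib.sha256(data).hexdigest()


def difference_list(first_ids: List[str], second_ids: List[str]) -> List[str]:
    """Return identifiers from the first (source) list that differ from the
    identifier at the same position in the second (backup) list."""
    diff = []
    for index, identifier in enumerate(first_ids):
        # A position past the end of the backup list counts as a difference.
        if index >= len(second_ids) or second_ids[index] != identifier:
            diff.append(identifier)
    return diff


# Example: the backup holds a stale block at position 1.
source = [block_id(b"alpha"), block_id(b"beta"), block_id(b"gamma")]
backup = [block_id(b"alpha"), block_id(b"old"), block_id(b"gamma")]
diff = difference_list(source, backup)
```

Here `diff` holds only the identifier for `b"beta"`, which is what the second cluster server would then request from the first.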
Opening claim text (preview).
What is claimed is:

1. A system comprising: a first cluster server, comprising a first memory and a first one or more processors, backed up by a second cluster server comprising a second memory and a second one or more processors configured to: retrieve a first metadata file from the first cluster server, wherein the first cluster server is different from the second cluster server, wherein the first metadata file comprises a first ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a first plurality of block servers, wherein each block identifier is a hash based on content of its corresponding data block, and wherein the first ordered list of block identifiers corresponds to a first plurality of volumes of client metadata stored on the first cluster server; retrieve a second metadata file from the second cluster server, wherein the second metadata file comprises a second ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a second plurality of block servers, wherein each block identifier is a hash based on content of its corresponding data block, and wherein the second ordered list of block identifiers corresponds to a second plurality of volumes of client metadata stored on the second cluster server; compare the first metadata file to the second metadata file to determine a difference list, wherein the difference list comprises block identifiers from the first ordered list that differ from block identifiers of the second ordered list; and send, to the first cluster server, a request for data blocks, wherein the request comprises the block identifiers from the difference list.

2. The system of claim 1, wherein the one or more processors are further configured to: determine, before the request is sent, whether any of the data blocks associated with the block identifiers from the difference list are in a cache of the second cluster server; retrieve, from the cache of the second cluster server, any of the data blocks associated with the block identifiers from the difference list; and write the data blocks retrieved from the cache of the second cluster server to the second plurality of block servers.

3. The system of claim 1, wherein each block identifier indicates a location on one of the first or second plurality of block servers where a data block associated with each block identifier should be stored, and wherein the one or more processors are further configured to: group into difference list batches the block identifiers from the difference list based the location indicated by each identifier, wherein each of the batches correspond to one block server in the first or second plurality of block servers; determine, before the request is sent, whether any of the data blocks associated with the block identifiers in each batch are stored on one of the second plurality of block servers; and remove, from the difference list batches, any block identifiers determined to be already stored on one of the second plurality of block servers.

4. The system of claim 3, wherein the request comprises the difference list batches.

5. The system of claim 1, wherein each block identifier indicates at least two locations on one of the first or second plurality of block servers where a data block associated with each block identifier should be stored, and wherein the one or more processors are further configured to: group into difference list batches the block identifiers from the difference list based the location indicated by each identifier, wherein each of the batches correspond to at least two block servers in the first or second plurality of block servers; determine, before the request is sent, whether any of the data blocks associated with the block identifiers in each batch are stored on at least two of the second plurality of block servers; and remove, from the difference list batches, any block identifiers determined to be already stored on at least two of the second plurality of block servers.

6. The system of claim 1, wherein the one or more processors are further configured to: add, to a queue, the block identifiers from the difference list; and request, from the first cluster server, the data blocks associated with the block identifiers in the queue.

7. The system of claim 6, wherein the one or more processors are further configured to: receive, at the second cluster server, the data blocks associated with the block identifiers in the queue; group into batches the data blocks based on which of the second plurality of block servers the data blocks should be written to as indicated by the block identifiers associated with the data blocks; and write the data blocks from the batches onto the second plurality of block servers according the block identifiers associated with the data blocks.

8. The system of claim 6, wherein the one or more processors are further configured to: determine a missing data block comprising at least one of the data blocks associated with the block identifiers in the queue that are not stored in the first cluster server; calculate a logical block address (LBA) of the missing data block based on the second ordered list of block identifiers; and send, to the first cluster server, a request for missing data block, wherein the request comprises the LBA.

9. The system of claim 6, wherein the one or more processors are further configured to: determine missing data blocks comprising at least one of the data blocks associated with the block identifiers in the queue that are not stored in the first cluster server; determine whether block identifiers associated with the missing data blocks are in a cache of the second cluster server; retrieve from the second cluster server, when the block identifiers associated with the missing data blocks are not in the cache of the second cluster server, a second version of the second metadata file; and determine whether the block identifiers associated with the missing data blocks have changed in the second version of the second metadata file.

10. The system of claim 9, wherein the one or more processors are further configured to replace in the difference list the block identifiers associated with the missing data blocks that have changed with new block identifiers from the second version of the second metadata file.

11. A method comprising: updating a second cluster server, having a second memory and a second one or more processors, that backs up a first cluster server having a first memory and a second one or more processors, wherein the updating comprises: retrieving, by the second one or more processors of the second cluster server, a first metadata file from a first cluster server, wherein the first cluster server is different from the second cluster server, wherein the first metadata file comprises a first ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a first plurality of block servers, wherein each block identifier is a hash based on content of its corresponding data block, and wherein the first ordered list of block identifiers corresponds to a first plurality of volumes of client metadata stored on the first cluster server; retrieving, by the second one or more processors, a second metadata file from a second cluster server, wherein the second metadata file comprises a second ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a second plurality of block servers, wherein each block identifier is a hash b
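The batching and pruning described in claims 3 and 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the placement rule in `server_for` (leading hex digits of the identifier modulo the server count) is hypothetical, as are the function names and the `stored_by_server` lookup that stands in for querying the second plurality of block servers.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def server_for(identifier: str, server_count: int) -> int:
    # Hypothetical placement rule: the first 8 hex digits select a block server.
    return int(identifier[:8], 16) % server_count


def batch_by_server(diff_ids: Iterable[str], server_count: int) -> Dict[int, List[str]]:
    """Group difference-list identifiers into one batch per block server."""
    batches: Dict[int, List[str]] = defaultdict(list)
    for identifier in diff_ids:
        batches[server_for(identifier, server_count)].append(identifier)
    return dict(batches)


def drop_already_stored(
    batches: Dict[int, List[str]],
    stored_by_server: Dict[int, Set[str]],
) -> Dict[int, List[str]]:
    """Remove identifiers whose blocks the backup block servers already hold."""
    pruned: Dict[int, List[str]] = {}
    for server, identifiers in batches.items():
        present = stored_by_server.get(server, set())
        remaining = [i for i in identifiers if i not in present]
        if remaining:  # Batches that become empty are dropped entirely.
            pruned[server] = remaining
    return pruned


# Example with two block servers; server 1 already holds one of the blocks.
ids = ["00000000aa", "00000001bb", "00000002cc", "00000003dd"]
batches = batch_by_server(ids, server_count=2)
request_batches = drop_already_stored(batches, {1: {"00000001bb"}})
```

Only the identifiers left in `request_batches` would need to be requested from the first cluster server, which is the point of checking the backup's block servers before the request is sent.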
Redundant storage or storage space (G06F11/2056 takes precedence) · CPC title
for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title
Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes · CPC title
Techniques for file synchronisation in file systems · CPC title
the resynchronized component or unit being a persistent storage device (re-synchronization of failed mirror storage G06F11/2082; rebuild or reconstruction of parity RAID storage G06F11/1008) · CPC title