Parallel optimized remote synchronization of active block storage

US9613046B1 · US · B1

Patent metadata
Publication number: US-9613046-B1
Application number: US-201514967543-A
Country: US
Kind code: B1
Filing date: Dec 14, 2015
Priority date: Dec 14, 2015
Publication date: Apr 4, 2017
Grant date: Apr 4, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Updating a second cluster server that backs up a first cluster server includes retrieving a first metadata file from a first cluster server. The first metadata file includes a first ordered list of block identifiers for data blocks stored on a first plurality of block servers. The updating also includes retrieving a second metadata file from a second cluster server. The second metadata file includes a second ordered list of block identifiers for data blocks stored on a second plurality of block servers. The updating also includes comparing the first metadata file to the second metadata file to determine a difference list. The difference list includes block identifiers from the first ordered list that differ from block identifiers of the second ordered list. The updating also includes sending, to the first cluster server, a request for data blocks associated with the block identifiers from the difference list.
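The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The 4096-byte block size, the use of SHA-256 as the content hash, the position-by-position comparison, and the function names `block_id`, `metadata_for`, and `difference_list` are all assumptions made for illustration; the abstract only states that the two ordered lists are compared and that the differing identifiers from the first list form the difference list.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


def block_id(data: bytes) -> str:
    """Content-derived block identifier (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def metadata_for(volume: bytes) -> list[str]:
    """Build an ordered list of block identifiers, one per fixed-size block."""
    return [
        block_id(volume[i:i + BLOCK_SIZE])
        for i in range(0, len(volume), BLOCK_SIZE)
    ]


def difference_list(first: list[str], second: list[str]) -> list[str]:
    """Return identifiers from `first` that differ from `second` at the same position.

    Positional comparison is an assumption. Each identifier is listed once,
    in first-seen order, so the same block is not requested twice.
    """
    diff: list[str] = []
    seen: set[str] = set()
    for pos, bid in enumerate(first):
        other = second[pos] if pos < len(second) else None
        if bid != other and bid not in seen:
            seen.add(bid)
            diff.append(bid)
    return diff


# Hypothetical volumes: the backup differs from the source in its middle block.
source = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
backup = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE
to_request = difference_list(metadata_for(source), metadata_for(backup))
```

Because the identifiers are derived from block content, only the lists of hashes need to be compared; the block data itself is transferred only for entries in `to_request`.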

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a first cluster server, comprising a first memory and a first one or more processors, backed up by a second cluster server comprising a second memory and a second one or more processors configured to: retrieve a first metadata file from the first cluster server, wherein the first cluster server is different from the second cluster server, wherein the first metadata file comprises a first ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a first plurality of block servers, wherein each block identifier is a hash based on content of its corresponding data block, and wherein the first ordered list of block identifiers corresponds to a first plurality of volumes of client metadata stored on the first cluster server; retrieve a second metadata file from the second cluster server, wherein the second metadata file comprises a second ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a second plurality of block servers, wherein each block identifier is a hash based on content of its corresponding data block, and wherein the second ordered list of block identifiers corresponds to a second plurality of volumes of client metadata stored on the second cluster server; compare the first metadata file to the second metadata file to determine a difference list, wherein the difference list comprises block identifiers from the first ordered list that differ from block identifiers of the second ordered list; and send, to the first cluster server, a request for data blocks, wherein the request comprises the block identifiers from the difference list.

2. The system of claim 1, wherein the one or more processors are further configured to: determine, before the request is sent, whether any of the data blocks associated with the block identifiers from the difference list are in a cache of the second cluster server; retrieve, from the cache of the second cluster server, any of the data blocks associated with the block identifiers from the difference list; and write the data blocks retrieved from the cache of the second cluster server to the second plurality of block servers.

3. The system of claim 1, wherein each block identifier indicates a location on one of the first or second plurality of block servers where a data block associated with each block identifier should be stored, and wherein the one or more processors are further configured to: group into difference list batches the block identifiers from the difference list based the location indicated by each identifier, wherein each of the batches correspond to one block server in the first or second plurality of block servers; determine, before the request is sent, whether any of the data blocks associated with the block identifiers in each batch are stored on one of the second plurality of block servers; and remove, from the difference list batches, any block identifiers determined to be already stored on one of the second plurality of block servers.

4. The system of claim 3, wherein the request comprises the difference list batches.

5. The system of claim 1, wherein each block identifier indicates at least two locations on one of the first or second plurality of block servers where a data block associated with each block identifier should be stored, and wherein the one or more processors are further configured to: group into difference list batches the block identifiers from the difference list based the location indicated by each identifier, wherein each of the batches correspond to at least two block servers in the first or second plurality of block servers; determine, before the request is sent, whether any of the data blocks associated with the block identifiers in each batch are stored on at least two of the second plurality of block servers; and remove, from the difference list batches, any block identifiers determined to be already stored on at least two of the second plurality of block servers.

6. The system of claim 1, wherein the one or more processors are further configured to: add, to a queue, the block identifiers from the difference list; and request, from the first cluster server, the data blocks associated with the block identifiers in the queue.

7. The system of claim 6, wherein the one or more processors are further configured to: receive, at the second cluster server, the data blocks associated with the block identifiers in the queue; group into batches the data blocks based on which of the second plurality of block servers the data blocks should be written to as indicated by the block identifiers associated with the data blocks; and write the data blocks from the batches onto the second plurality of block servers according the block identifiers associated with the data blocks.

8. The system of claim 6, wherein the one or more processors are further configured to: determine a missing data block comprising at least one of the data blocks associated with the block identifiers in the queue that are not stored in the first cluster server; calculate a logical block address (LBA) of the missing data block based on the second ordered list of block identifiers; and send, to the first cluster server, a request for missing data block, wherein the request comprises the LBA.

9. The system of claim 6, wherein the one or more processors are further configured to: determine missing data blocks comprising at least one of the data blocks associated with the block identifiers in the queue that are not stored in the first cluster server; determine whether block identifiers associated with the missing data blocks are in a cache of the second cluster server; retrieve from the second cluster server, when the block identifiers associated with the missing data blocks are not in the cache of the second cluster server, a second version of the second metadata file; and determine whether the block identifiers associated with the missing data blocks have changed in the second version of the second metadata file.

10. The system of claim 9, wherein the one or more processors are further configured to replace in the difference list the block identifiers associated with the missing data blocks that have changed with new block identifiers from the second version of the second metadata file.

11. A method comprising: updating a second cluster server, having a second memory and a second one or more processors, that backs up a first cluster server having a first memory and a second one or more processors, wherein the updating comprises: retrieving, by the second one or more processors of the second cluster server, a first metadata file from a first cluster server, wherein the first cluster server is different from the second cluster server, wherein the first metadata file comprises a first ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a first plurality of block servers, wherein each block identifier is a hash based on content of its corresponding data block, and wherein the first ordered list of block identifiers corresponds to a first plurality of volumes of client metadata stored on the first cluster server; retrieving, by the second one or more processors, a second metadata file from a second cluster server, wherein the second metadata file comprises a second ordered list of block identifiers for data blocks, wherein each block identifier is used to access a data block stored on one of a second plurality of block servers, wherein each block identifier is a hash b
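The batching and pruning steps recited in claims 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the claimed system: the placement rule in `server_for` (leading hex byte of the identifier modulo the server count), the in-memory dictionaries standing in for block servers and the cache, and the names `batch_by_server` and `prune_batches` are all hypothetical. The claims say only that each identifier indicates where its block should be stored, and that identifiers already present on the second cluster are removed before the request is sent.

```python
from collections import defaultdict


def server_for(block_id: str, num_servers: int) -> int:
    """Hypothetical placement rule: the leading hex byte of the ID picks a server."""
    return int(block_id[:2], 16) % num_servers


def batch_by_server(diff: list[str], num_servers: int) -> dict[int, list[str]]:
    """Group difference-list identifiers into one batch per target block server."""
    batches: dict[int, list[str]] = defaultdict(list)
    for bid in diff:
        batches[server_for(bid, num_servers)].append(bid)
    return dict(batches)


def prune_batches(
    batches: dict[int, list[str]],
    stored_on_server: dict[int, set[str]],
    cache: dict[str, bytes],
) -> tuple[dict[int, list[str]], list[str]]:
    """Remove identifiers that need no remote transfer.

    Returns (batches still to request, identifiers servable from the local cache).
    """
    to_request: dict[int, list[str]] = {}
    from_cache: list[str] = []
    for server, ids in batches.items():
        present = stored_on_server.get(server, set())
        remaining = []
        for bid in ids:
            if bid in present:
                continue  # already stored on the backup's block server
            if bid in cache:
                from_cache.append(bid)  # can be written locally from the cache
                continue
            remaining.append(bid)
        if remaining:
            to_request[server] = remaining
    return to_request, from_cache


# Hypothetical short identifiers, two block servers on the backup cluster.
diff = ["00aa", "01bb", "02cc", "03dd"]
batches = batch_by_server(diff, num_servers=2)
to_request, from_cache = prune_batches(
    batches,
    stored_on_server={0: {"00aa"}},
    cache={"03dd": b"cached block"},
)
```

Grouping by target server lets each block server answer a single membership query per batch, and only the identifiers left in `to_request` need to cross the network to the first cluster server.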

Assignees

Netapp Inc

Inventors

Classifications

  • Redundant storage or storage space (G06F11/2056 takes precedence) · CPC title

  • for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title

  • Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes · CPC title

  • G06F16/178 (Primary)

    Techniques for file synchronisation in file systems · CPC title

  • the resynchronized component or unit being a persistent storage device (re-synchronization of failed mirror storage G06F11/2082; rebuild or reconstruction of parity RAID storage G06F11/1008) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9613046B1 cover?
Updating a second cluster server that backs up a first cluster server includes retrieving a first metadata file from a first cluster server. The first metadata file includes a first ordered list of block identifiers for data blocks stored on a first plurality of block servers. The updating also includes retrieving a second metadata file from a second cluster server. The second metadata file inc…
Who is the assignee on this patent?
Netapp Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/178. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 4, 2017 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).