Synchronous replication of high throughput streaming data

US11579778B2 · US · B2

Patent metadata

Field: Value
Publication number: US-11579778-B2
Application number: US-202017098306-A
Country: US
Kind code: B2
Filing date: Nov 13, 2020
Priority date: Nov 13, 2020
Publication date: Feb 14, 2023
Grant date: Feb 14, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for synchronous replication of stream data includes receiving a stream of data blocks for storage at a first storage location associated with a first geographical region and at a second storage location associated with a second geographical region. The method also includes synchronously writing the stream of data blocks to the first storage location and to the second storage location. While synchronously writing the stream of data blocks, the method includes determining an unrecoverable failure at the second storage location. The method also includes determining a failure point in the writing of the stream of data blocks that demarcates data blocks that were successfully written and not successfully written to the second storage location. The method also includes synchronously writing, starting at the failure point, the stream of data blocks to the first storage location and to a third storage location associated with a third geographical region.
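The flow in the abstract can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `Region` class is an in-memory stand-in for a storage location, `fail_at` simulates an unrecoverable failure, and `replicate` treats the first failed write as unrecoverable without retrying (the claims describe a retry step before that determination).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Region:
    """Hypothetical in-memory stand-in for a regional storage location."""
    name: str
    fail_at: Optional[int] = None  # simulated failure: writes at or after this index fail
    blocks: list = field(default_factory=list)

    def write(self, index, block):
        if self.fail_at is not None and index >= self.fail_at:
            raise IOError(f"{self.name} is unavailable")
        self.blocks.append((index, block))


def replicate(stream, first, second, third):
    """Write every block to `first` and to a partner region.

    The partner is `second` until a write to it fails; from that block
    (the failure point) onward the partner is `third`.
    Returns the failure point, or None if `second` never failed.
    """
    partner = second
    failure_point = None
    for index, block in enumerate(stream):
        first.write(index, block)
        try:
            partner.write(index, block)
        except IOError:
            if partner is third:
                raise  # no further fallback region in this sketch
            failure_point = index  # first block not written to `second`
            partner = third
            partner.write(index, block)
    return failure_point
```

With a five-block stream and a second region that fails at index 3, this sketch leaves blocks 0 to 2 in the second region, blocks 3 and 4 in the third, and all five in the first. Copying blocks 0 to 2 to the third region afterwards would correspond to the asynchronous backfill described in claim 2.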

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising:
  receiving, at data processing hardware, a stream of data blocks for storage at a first storage location of a distributed storage system and at a second storage location of the distributed storage system, the first storage location associated with a first geographical region and the second storage location associated with a second geographical region different than the first geographical region;
  synchronously writing, by the data processing hardware, the stream of data blocks to the first storage location and to the second storage location;
  while synchronously writing the stream of data blocks to the first storage location and to the second storage location, determining, by the data processing hardware, an unrecoverable failure at the second storage location that prohibits further writing of the stream of data blocks to the second storage location;
  determining, by the data processing hardware, a failure point in the writing of the stream of data blocks, the failure point demarcating data blocks that were successfully written to the second storage location and data blocks that were not successfully written to the second storage location; and
  while writing of the stream of data blocks to the second storage location is prohibited by the unrecoverable failure, synchronously writing, by the data processing hardware, starting at the failure point, the stream of data blocks to the first storage location and to a third storage location of the distributed storage system, the third storage location associated with a third geographical region different than the first geographical region and the second geographical region,
  wherein determining the failure point in the writing of the stream of data blocks comprises:
    determining whether a first replication log is available indicating the data blocks that have been successfully committed to the first storage location;
    determining whether a second replication log is available indicating the data blocks that have been successfully committed to the second storage location; and
    when the first replication log and the second replication log are available, reconciling, based on a length of the first replication log and a length of the second replication log, the first replication log and the second replication log, and
  wherein reconciling the first replication log and the second replication log comprises:
    determining an index of the second replication log associated with the unrecoverable failure;
    storing the index of the second replication log on memory hardware in communication with the data processing hardware;
    finalizing the second replication log to prohibit further writes to the second storage location; and
    generating a sentinel file to indicate a need for reconciliation.

2. The method of claim 1, further comprising, asynchronously writing, by the data processing hardware, from a beginning point of the stream of data blocks to the failure point, the stream of data blocks to the third storage location.

3. The method of claim 1, wherein determining the unrecoverable failure at the second storage location that prohibits further writing of the stream of data blocks to the second storage location comprises:
  determining a failure of the writing of the stream of data blocks to the second storage location;
  in response to determining the failure of writing the stream of data blocks to the second storage location, retrying writing the stream of data blocks to the second storage location; and
  when retrying writing the stream of data blocks to the second storage location has failed, determining that the failure is an unrecoverable failure.

4. The method of claim 1, further comprising, when the first replication log is available and the second replication log is not available, reconciling, by the data processing hardware, based on the length of the first replication log, the first replication log and the second replication log.

5. The method of claim 1, further comprising, when the first replication log is not available and the second replication log is available, reconciling, by the data processing hardware, based on the length of the second replication log, the first replication log and the second replication log.

6. A method comprising:
  receiving, at data processing hardware, a stream of data blocks for storage at a first storage location of a distributed storage system and at a second storage location of the distributed storage system, the first storage location associated with a first geographical region and the second storage location associated with a second geographical region different than the first geographical region;
  synchronously writing, by the data processing hardware, the stream of data blocks to the first storage location and to the second storage location;
  while synchronously writing the stream of data blocks to the first storage location and to the second storage location, determining, by the data processing hardware, an unrecoverable failure at the second storage location that prohibits further writing of the stream of data blocks to the second storage location;
  determining, by the data processing hardware, a failure point in the writing of the stream of data blocks, the failure point demarcating data blocks that were successfully written to the second storage location and data blocks that were not successfully written to the second storage location;
  while writing of the stream of data blocks to the second storage location is prohibited by the unrecoverable failure, synchronously writing, by the data processing hardware, starting at the failure point, the stream of data blocks to the first storage location and to a third storage location of the distributed storage system, the third storage location associated with a third geographical region different than the first geographical region and the second geographical region;
  generating, by the data processing hardware, a first replication log comprising timestamps indicating when each data block is written to the first storage location;
  generating, by the data processing hardware, a second replication log comprising timestamps indicating when each data block is written to the second storage location;
  receiving, at the data processing hardware, a query request requesting return of a plurality of data blocks stored at the first storage location;
  reconciling, by the data processing hardware, based on a length of the first replication log and a length of the second replication log, the first replication log and the second replication log;
  returning, by the data processing hardware, based on the reconciliation of the first replication log and the second replication log, the requested plurality of data blocks;
  determining that the length of the second replication log is not available; and
  determining, within a threshold period of time, that a subsequent write is added to the first replication log.

7. A system comprising:
  data processing hardware; and
  memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising:
    receiving a stream of data blocks for storage at a first storage location of a distributed storage system and at a second storage location of the distributed storage system, the first storage location associated with a first geographical region and the second storage location associated with a second geographical region different than the first geographical region;
    synchronously writing the stream of data blocks to the first storage location and to the second storage location;
    while synchronously writing the stream of data blocks to the first st…
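The claims say the failure point is found by reconciling two replication logs "based on a length" of each, and that the second log is finalized so it accepts no further writes. The sketch below is one possible reading, not the patented procedure. It assumes each log holds one entry per committed block, so the shorter available log marks the first block not committed to both locations. `ReplicationLog` and `find_failure_point` are hypothetical names, and storing the index and generating the sentinel file are omitted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReplicationLog:
    """Hypothetical log with one entry per block committed to a location."""
    entries: list
    finalized: bool = False

    def append(self, entry):
        if self.finalized:
            raise RuntimeError("log is finalized; no further writes allowed")
        self.entries.append(entry)


def find_failure_point(first: Optional[ReplicationLog],
                       second: Optional[ReplicationLog]) -> int:
    """Return the index of the first block not committed to both locations.

    A log passed as None is treated as unavailable. ASSUMPTION: with one
    entry per committed block, the shortest available log gives the index.
    """
    available = [log for log in (first, second) if log is not None]
    if not available:
        raise ValueError("no replication log is available")
    failure_point = min(len(log.entries) for log in available)
    if second is not None:
        second.finalized = True  # block further writes to the failed location
    return failure_point
```

For a first log with four entries and a second log with three, this returns 3 and finalizes the second log. When only one log is available, its length alone determines the result, which mirrors the single-log cases in claims 4 and 5.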

Assignees

Google LLC

Inventors

Classifications

  • by changing the path, e.g. traffic rerouting, path reconfiguration · CPC title

  • where the computing system component is a storage system, e.g. DASD based or network based (digital input from or digital output to record carriers G06F3/06; digital recording or reproducing G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097) · CPC title

  • Replication mechanisms · CPC title

  • Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title

  • to make the backup process non-disruptive · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11579778B2 cover?
A method for synchronous replication of stream data includes receiving a stream of data blocks for storage at a first storage location associated with a first geographical region and at a second storage location associated with a second geographical region. The method also includes synchronously writing the stream of data blocks to the first storage location and to the second storage location. …
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/1466. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 14, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).