Replicated data integrity

US11036677B1 · US · B1

Patent metadata
Field               Value
Publication number  US-11036677-B1
Application number  US-201816174498-A
Country             US
Kind code           B1
Filing date         Oct 30, 2018
Priority date       Dec 14, 2017
Publication date    Jun 15, 2021
Grant date          Jun 15, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Performing replicated data integrity, including: generating, at a first computer system, a local hash of a local dataset; replicating the local dataset; receiving, at the first computer system from a second computer system, a remote hash of a remote dataset generated from the local dataset replicated from the first computer system; and determining, based at least on a comparison of the local hash of the local dataset with the remote hash of the remote dataset, validity of the remote dataset generated from the local dataset replicated from the first computer system.
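The flow in the abstract (hash locally, replicate, receive the remote hash, compare) can be illustrated with a minimal sketch. This is not the patented implementation: the `SecondSystem` class, the `replicate_and_validate` function, the `corrupt` switch, and the choice of SHA-256 are all assumptions made for illustration, and the "network" is an in-process method call.

```python
import hashlib


def dataset_hash(data: bytes) -> str:
    """Return a hex digest of a dataset. SHA-256 is an assumed choice."""
    return hashlib.sha256(data).hexdigest()


class SecondSystem:
    """Hypothetical replication target that stores whatever it receives."""

    def __init__(self) -> None:
        self.remote_dataset = b""

    def receive(self, data: bytes, corrupt: bool = False) -> None:
        # Optionally flip the first byte to simulate damage during replication.
        if corrupt and data:
            data = bytes([data[0] ^ 0xFF]) + data[1:]
        self.remote_dataset = data

    def remote_hash(self) -> str:
        # The second system hashes its own copy, not the bytes the sender holds.
        return dataset_hash(self.remote_dataset)


def replicate_and_validate(local_dataset: bytes, target: SecondSystem,
                           corrupt: bool = False) -> bool:
    """Run the four steps on the first system and report validity."""
    local = dataset_hash(local_dataset)      # 1. generate the local hash
    target.receive(local_dataset, corrupt)   # 2. replicate the local dataset
    remote = target.remote_hash()            # 3. receive the remote hash
    return local == remote                   # 4. compare to determine validity
```

Calling `replicate_and_validate(b"block-0 block-1", SecondSystem())` returns `True`, while the same call with `corrupt=True` returns `False`, because the second system's hash is computed over its damaged copy.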

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: replicating the local dataset from a first computer system to a second computer system; receiving, at the first computer system from the second computer system, a remote hash of a remote dataset generated from the local dataset replicated from the first computer system; and determining, by the first computer system, based at least on a comparison of a local hash of the local dataset with the remote hash of the remote dataset, validity of the remote dataset generated from the local dataset replicated from the first computer system, wherein the local dataset is a subset of a volume of data selected based at least upon a hashing policy, and wherein the hashing policy specifies random portions of the volume of data for hashing.

2. The method of claim 1, wherein the first computer system is a first storage system, wherein the second computer system is a second storage system, and wherein the dataset is synchronously replicated between the first storage system and the second storage system.

3. The method of claim 2, wherein determining the validity of the remote dataset generated from the local dataset replicated from the first computer system to the second computer system is performed as part of synchronizing the dataset among the first storage system and the second storage system.

4. The method of claim 2, wherein the dataset is asynchronously replicated between the first computer system and the second computer system.

5. The method of claim 1, wherein the local dataset is a first snapshot of a volume of data at a first point in time, and wherein the method further comprises: generating a second snapshot of the volume of data at a second point in time, wherein the second snapshot comprises differences between the volume of data at the first point in time and the volume of data at the second point in time; generating an incremental local hash of the second snapshot of the volume of data at the second point in time; replicating the second snapshot; receiving, at the first computer system, an incremental remote hash of a remote second snapshot generated from the local second snapshot replicated from the first computer system; and determining, based at least on a comparison of the incremental local hash with the incremental remote hash of the remote second snapshot generated from the local second snapshot replicated from the first computer system, validity of the remote second snapshot.

6. The method of claim 1, wherein, periodically or aperiodically after determining validity of the remote dataset, the first computer system continues to determine validity of the remote dataset by generating a current local hash of the dataset and comparing the current local hash to a requested current remote hash of the dataset from the second computer system.

7. An apparatus comprising a computer processor, a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions that, when executed by the computer processor, cause the apparatus to carry out: replicating the local dataset from a first computer system to a second computer system; receiving, at the first computer system from the second computer system, a remote hash of a remote dataset generated from the local dataset replicated from the first computer system; and determining, by the first computer system, based at least on a comparison of a local hash of the local dataset with the remote hash of the remote dataset, validity of the remote dataset generated from the local dataset replicated from the first computer system, wherein the local dataset is a subset of a volume of data selected based at least upon a hashing policy, and wherein the hashing policy specifies random portions of the volume of data for hashing.

8. The apparatus of claim 7, wherein the first computer system is a first storage system, wherein the second computer system is a second storage system, and wherein the dataset is synchronously replicated between the first storage system and the second storage system.

9. The apparatus of claim 8, wherein determining the validity of the remote dataset generated from the local dataset replicated from the first computer system to the second computer system is performed as part of synchronizing the dataset among the first storage system and the second storage system.

10. The apparatus of claim 8, wherein the dataset is asynchronously replicated between the first computer system and the second computer system.

11. The apparatus of claim 10, wherein the local dataset is a first snapshot of a volume of data at a first point in time, and wherein the computer program instructions further cause the apparatus to carry out the steps of: generating a second snapshot of the volume of data at a second point in time, wherein the second snapshot comprises differences between the volume of data at the first point in time and the volume of data at the second point in time; generating an incremental local hash of the second snapshot of the volume of data at the second point in time; replicating, from the first computer system to a second computer system, the second snapshot; receiving, at the first computer system from the second computer system, an incremental remote hash of a remote second snapshot generated from the local second snapshot replicated from the first computer system to the second computer system; and determining, based at least on a comparison of the incremental local hash with the incremental remote hash of the remote second snapshot generated from the local second snapshot replicated from the first computer system to the second computer system, validity of the remote second snapshot.

12. The apparatus of claim 7, wherein, periodically or aperiodically after determining validity of the remote dataset, the first computer system continues to determine validity of the remote dataset by generating a current local hash of the dataset and comparing the current local hash to a requested current remote hash of the dataset from the second computer system.

13. A computer program product disposed upon a computer readable medium, the computer program product comprising computer program instructions that, when executed, cause a computer to carry out: replicating the local dataset from a first computer system to a second computer system; receiving, at the first computer system from the second computer system, a remote hash of a remote dataset generated from the local dataset replicated from the first computer system; and determining, by the first computer system, based at least on a comparison of a local hash of the local dataset with the remote hash of the remote dataset, validity of the remote dataset generated from the local dataset replicated from the first computer system, wherein the local dataset is a subset of a volume of data selected based at least upon a hashing policy, and wherein the hashing policy specifies random portions of the volume of data for hashing.

14. The computer program product of claim 13, wherein the first computer system is a first storage system, wherein the second computer system is a second storage system, and wherein the dataset is synchronously replicated between the first storage system and the second storage system.

15. The computer program product of claim 14, wherein determining the validity of the remote dataset generated from the local dataset replicated from the first computer system to the second computer system is performed as part of synchronizing the dataset among the first storage system and the sec
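
The independent claims limit the hashed dataset to a subset of the volume chosen by a hashing policy that specifies random portions. A rough sketch of that idea follows. The claims do not say how the two systems agree on which portions to hash, so the shared `seed` parameter is an assumption, as are the function names `select_portions` and `sampled_hash`, the fixed `portion_size`, and the use of SHA-256.

```python
import hashlib
import random
from typing import List


def select_portions(volume_size: int, portion_size: int,
                    count: int, seed: int) -> List[int]:
    """Pick byte offsets of `count` random fixed-size portions of a volume.

    Assumption: both systems run this with the same seed, so they select
    the same portions without exchanging the list of offsets.
    """
    total = volume_size // portion_size          # number of whole portions
    rng = random.Random(seed)                    # deterministic for a given seed
    indexes = sorted(rng.sample(range(total), min(count, total)))
    return [i * portion_size for i in indexes]


def sampled_hash(volume: bytes, portion_size: int, count: int, seed: int) -> str:
    """Hash only the portions selected by the (hypothetical) policy."""
    digest = hashlib.sha256()
    for offset in select_portions(len(volume), portion_size, count, seed):
        # Mix in the offset so identical data at a different position differs.
        digest.update(offset.to_bytes(8, "big"))
        digest.update(volume[offset:offset + portion_size])
    return digest.hexdigest()
```

If the first system computes `sampled_hash(volume, 64, 8, seed=42)` and the second system computes the same call over its replica, the two digests match only when the eight sampled portions are identical. The trade-off of sampling is that corruption outside the selected portions goes unnoticed for that round, which is one reason a policy might pick different random portions on each check.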

Assignees

Pure Storage Inc

Inventors

Classifications

  • Provision of network file services by network file servers, e.g. by using NFS, CIFS (network file access protocols H04L67/1097)

  • Protecting data integrity, e.g. using checksums, certificates or signatures

  • by registering files or documents with a third party

  • G06F3/065 (Primary): Replication mechanisms

  • Non-volatile semiconductor memory arrays

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11036677B1 cover?
Performing replicated data integrity, including: generating, at a first computer system, a local hash of a local dataset; replicating the local dataset; receiving, at the first computer system from a second computer system, a remote hash of a remote dataset generated from the local dataset replicated from the first computer system; and determining, based at least on a comparison of the local ha…
Who is the assignee on this patent?
Pure Storage Inc
What technology area does this patent fall under?
Primary CPC classification G06F3/065. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 15, 2021 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).