Verification of backup data across a data pipeline

US12430317B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12430317-B2
Application number  US-202418426634-A
Country             US
Kind code           B2
Filing date         Jan 30, 2024
Priority date       Jan 30, 2024
Publication date    Sep 30, 2025
Grant date          Sep 30, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, devices, and techniques are disclosed for verification of backup data across a data pipeline. Records from a first storage may be received at a first end of a data pipeline. The records may be hashed to generate first hashes. A first hash tree may be generated from the first hashes. The records may be received at a second end of the data pipeline. Bits of bitmaps that correspond to the records may be set. The records may be hashed to generate second hashes. The records may be stored in a second storage. A second hash tree may be generated from the second hashes. Using the bitmaps, whether all of the records or any duplicate records were received may be determined. The first hash tree and the second hash tree may be compared to determine if any of the records stored in the second storage are corrupt.
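The hash-tree comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not name a hash function or a tree layout, so SHA-256, the binary tree shape, the rule of pairing an unpaired node with itself, and the names `hash_record`, `build_hash_tree`, and `differing_leaves` are all assumptions made for the example.

```python
import hashlib


def hash_record(record: bytes) -> bytes:
    """Hash one record (SHA-256 is an assumption; the abstract names no algorithm)."""
    return hashlib.sha256(record).digest()


def build_hash_tree(leaf_hashes):
    """Build a binary hash tree and return its levels: leaves first, root last."""
    if not leaf_hashes:
        # Assumed convention for an empty input: a single root over empty bytes.
        return [[hashlib.sha256(b"").digest()]]
    levels = [list(leaf_hashes)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        parents = []
        for i in range(0, len(prev), 2):
            left = prev[i]
            # An unpaired last node is combined with itself (assumed convention).
            right = prev[i + 1] if i + 1 < len(prev) else left
            parents.append(hashlib.sha256(left + right).digest())
        levels.append(parents)
    return levels


def differing_leaves(tree_a, tree_b):
    """Return indices of leaves that differ, descending only into mismatched subtrees."""
    if len(tree_a[0]) != len(tree_b[0]):
        raise ValueError("trees cover a different number of records")
    if tree_a[-1] == tree_b[-1]:
        return []  # equal roots: no leaf needs to be inspected
    suspects = [0]  # node indices at the current level, starting at the root
    for depth in range(len(tree_a) - 1, 0, -1):
        below_a, below_b = tree_a[depth - 1], tree_b[depth - 1]
        next_suspects = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if child < len(below_a) and below_a[child] != below_b[child]:
                    next_suspects.append(child)
        suspects = next_suspects
    return suspects
```

In this sketch the first end of the pipeline would build one tree from the record hashes it computes, and the second end would build another from the hashes of the records it stores. Matching roots indicate that no stored record differs; a mismatch is narrowed to specific record positions by walking down only the subtrees whose hashes disagree.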

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method comprising: receiving at a first end of a data pipeline, from a first storage, records; hashing the records to generate first hashes; generating a first hash tree from the first hashes; receiving, at a second end of the data pipeline, the records; generating bitmaps for unique batches of the records; setting bits of the bitmaps that correspond to the records received at the second end of the data pipeline as the records are received at the second end of the data pipeline; hashing the records as the records are received at the second end of the data pipeline to generate second hashes; storing the records in a second storage as the records are received at the second end of the data pipeline; generating a second hash tree from the second hashes; determining, using the bitmaps, if all of the records were received at the second end of the data pipeline and if any duplicates of any of the records were received at the second end of the data pipeline; and comparing the first hash tree and the second hash tree to determine if any of the records stored in the second storage are corrupt.

2. The computer-implemented method of claim 1, further comprising preventing access to the records in the second storage until the comparison of the first hash tree and the second hash tree determines that none of the records in the second storage are corrupt.

3. The computer-implemented method of claim 1, further comprising sending a record to the second end of the data pipeline when it is determined using the bitmaps that the record was not received at the second of the data pipeline.

4. The computer-implemented method of claim 1, further comprising assigning the records batch identifiers and ordinal identifiers after the records are received at the first end of the data pipeline and before the records are received at the second end of the data pipeline.

5. The computer-implemented method of claim 4, further comprising removing the batch identifiers and the ordinal identifiers from the records before storing the records in the second storage.

6. The computer-implemented method of claim 1, wherein generating the second hash tree further comprises using the batch identifiers and the ordinal identifiers from the records.

7. The computer-implemented method of claim 1, wherein the records further comprise object identifiers, and further comprising generating additional records for the unique batches of the records, the additional records comprising batch identifiers, starting object identifiers, and ending object identifiers.

8. A computer-implemented system comprising: a first storage and a second storage; and one or more processors that receive at a first end of a data pipeline, from the first storage, records, hash the records to generate first hashes, generate a first hash tree from the first hashes, receive, at a second end of the data pipeline, the records; generate bitmaps for unique batches of the records; set bits of the bitmaps that correspond to the records received at the second end of the data pipeline as the records are received at the second end of the data pipeline, hash the records as the records are received at the second end of the data pipeline to generate second hashes; store the records in the second storage as the records are received at the second end of the data pipeline; generate a second hash tree from the second hashes, determine, using the bitmaps, if all of the records were received at the second end of the data pipeline and if any duplicates of any of the records were received at the second end of the data pipeline, and compare the first hash tree and the second hash tree to determine if any of the records stored in the second storage are corrupt.

9. The computer-implemented system of claim 8, wherein the one or more processors further prevent access to the records in the second storage until the comparison of the first hash tree and the second hash tree determines that none of the records in the second storage are corrupt.

10. The computer-implemented system of claim 8, wherein the one or more processors further send a record to the second end of the data pipeline when it is determined using the bitmaps that the record was not received at the second of the data pipeline.

11. The computer-implemented system of claim 8, wherein the one or more processors assign the records batch identifiers and ordinal identifiers after the records received at the first end of the data pipeline and before the records are received at the second end of the data pipeline.

12. The computer-implemented system of claim 8, wherein the one or more processors further remove the batch identifiers and the ordinal identifiers from the records before storing the records in the second storage.

13. The computer-implemented system of claim 8, wherein the one or more processors generate the second hash tree by using the batch identifiers and the ordinal identifiers from the records.

14. The computer-implemented system of claim 8, wherein the records further comprise object identifiers, and wherein the one or more processors further generate additional records for the unique batches of the records, the additional records comprising batch identifiers, starting object identifiers, and ending object identifiers.

15. A system comprising: one or more computers and one or more non-transitory storage devices storing instructions which are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving at a first end of a data pipeline, from a first storage, records; hashing the records to generate first hashes; generating a first hash tree from the first hashes; receiving, at a second end of the data pipeline, the records; generating bitmaps for unique batches of the records; setting bits of the bitmaps that correspond to the records received at the second end of the data pipeline as the records are received at the second end of the data pipeline; hashing the records as the records are received at the second end of the data pipeline to generate second hashes; storing the records in a second storage as the records are received at the second end of the data pipeline; generating a second hash tree from the second hashes; determining, using the bitmaps, if all of the records were received at the second end of the data pipeline and if any duplicates of any of the records were received at the second end of the data pipeline; and comparing the first hash tree and the second hash tree to determine if any of the records stored in the second storage are corrupt.

16. The system of claim 15, wherein the one or more computers and one or more non-transitory storage devices further store instructions which are operable, when executed by the one or more computers, to cause the one or more computers to further perform operations comprising: preventing access to the records in the second storage until the comparison of the first hash tree and the second hash tree determines that none of the records in the second storage are corrupt.

17. The system of claim 15, wherein the one or more computers and one or more non-transitory storage devices further store instructions which are operable, when executed by the one or more computers, to cause the one or more computers to further perform operations comprising: sending a record to the second end of the data pipeline when it is determined using the bitmaps that the record was not received at the second of the da
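
The bitmap bookkeeping recited in the claims (one bitmap per batch, a bit set as each record arrives, then a check for missing and duplicate records) could look something like the sketch below. It is an illustration under stated assumptions rather than the claimed implementation: it assumes each record arrives tagged with a batch identifier and a zero-based ordinal identifier, that the size of each batch is known in advance, and that a Python integer serves as the bitmap. The names `BatchBitmap` and `ReceiptTracker` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BatchBitmap:
    """One bitmap per batch; bit i is set once the record with ordinal i arrives."""
    size: int                     # number of records expected in the batch (assumed known)
    bits: int = 0                 # a Python int used as an arbitrary-length bitmap
    duplicates: list = field(default_factory=list)

    def mark(self, ordinal: int) -> None:
        if not 0 <= ordinal < self.size:
            raise IndexError(f"ordinal {ordinal} outside batch of size {self.size}")
        mask = 1 << ordinal
        if self.bits & mask:
            # The bit was already set, so this arrival is a duplicate.
            self.duplicates.append(ordinal)
        self.bits |= mask

    def missing(self) -> list:
        """Ordinals whose bit is still clear, i.e. records never received."""
        return [i for i in range(self.size) if not (self.bits >> i) & 1]


class ReceiptTracker:
    """Tracks arrivals at the receiving end, keyed by a hypothetical batch identifier."""

    def __init__(self, batch_sizes: dict):
        self.bitmaps = {batch: BatchBitmap(n) for batch, n in batch_sizes.items()}

    def on_record(self, batch_id: str, ordinal: int) -> None:
        self.bitmaps[batch_id].mark(ordinal)

    def report(self) -> dict:
        return {
            batch: {"missing": bm.missing(), "duplicates": list(bm.duplicates)}
            for batch, bm in self.bitmaps.items()
        }

    def complete(self) -> bool:
        """True only when every record arrived exactly once."""
        return all(not bm.missing() and not bm.duplicates for bm in self.bitmaps.values())
```

A receiver built this way would call `on_record` for each arrival and consult `report()` once the transfer ends. Ordinals listed as missing identify the records that, per claims 3 and 10, would be sent to the second end again, and `complete()` is one possible input to a decision about when to allow access to the stored records.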

Assignees

Salesforce Inc

Classifications

  • Management of the backup or restore process

  • Using snapshots, i.e. a logical point-in-time copy of the data

  • Management of the data involved in backup or backup restore

  • Ensuring data consistency and integrity

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12430317B2 cover?
Systems, devices, and techniques are disclosed for verification of backup data across a data pipeline. Records from a first storage may be received at a first end of a data pipeline. The records may be hashed to generate first hashes. A first hash tree may be generated from the first hashes. The records may be received at a second end of the data pipeline. Bits of bitmaps that correspond to the…
Who is the assignee on this patent?
Salesforce Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/2365. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 30, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).