Method for lineage sampling to efficiently detect corruptions

US12079198B2 · US · B2

Patent metadata
Publication number: US-12079198-B2
Application number: US-202217936983-A
Country: US
Kind code: B2
Filing date: Sep 30, 2022
Priority date: Sep 30, 2022
Publication date: Sep 3, 2024
Grant date: Sep 3, 2024

Abstract

Official abstract text for this publication.

Corruption detection in backups is disclosed. Backups that are received into a backup environment are stored in corresponding lineages. A detection engine is configured to perform corruption detection operations on the most recent backups in each of the lineages based on a sample frequency. Corruption detection operations may also be performed randomly and based on unexpected or unusual changes in backup metadata.
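
The selection step summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that a sample frequency means "scan one of every N backups in the lineage", and the names `Lineage`, `sample_every`, `backups_since_scan`, and `select_lineages` are hypothetical, introduced only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Lineage:
    name: str
    sample_every: int            # assumption: scan 1 of every N backups
    backups_since_scan: int = 0  # backups received since the last scan


def select_lineages(lineages, random_picks=0, rng=None):
    """Return names of lineages whose most recent backup should be scanned.

    A lineage is due when it has received at least `sample_every` backups
    since its last scan. Up to `random_picks` lineages that are not yet due
    are added at random, mirroring the random selection in the abstract.
    """
    rng = rng or random.Random()
    due = [l for l in lineages if l.backups_since_scan >= l.sample_every]
    not_due = [l for l in lineages if l.backups_since_scan < l.sample_every]
    extra = rng.sample(not_due, min(random_picks, len(not_due)))
    return [l.name for l in due + extra]


# Hypothetical lineages with different sample frequencies.
lineages = [
    Lineage("db", sample_every=1, backups_since_scan=1),
    Lineage("mail", sample_every=3, backups_since_scan=1),
    Lineage("files", sample_every=5, backups_since_scan=5),
]
selected = select_lineages(lineages)
selected_with_random = select_lineages(lineages, random_picks=1)
```

Here "db" and "files" are due, while "mail" is scanned only when a random pick is requested. Giving each lineage its own counter lets lineages with different sample frequencies share one scheduler.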

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory storage medium having stored therein instructions that are executable by one or more hardware processors to perform operations for performing a corruption detection operation that includes read operations in a data protection system, the operations comprising: receiving the backups into the data protection system configured to store the backups, wherein the backups are associated with corresponding lineages, wherein each of the lineages is associated with a sample frequency, wherein the sample frequency for some lineages is different from the sample frequency of other lineages; selecting lineages subject to the corruption detection operation based on the sample frequencies of the lineages; and performing the corruption detection operation on a most recent backup in each of the selected lineages according to the associated sample frequency; and determining whether a logical size of the most recent backups is less than or equal to a capacity threshold of the backup environment, wherein the selected lineages are pruned when the logical size is greater than the capacity threshold.

2. The non-transitory storage medium of claim 1, wherein the backups comprise synthetic full backups and/or always full backups and wherein the corruption detection operation is not performed on some of the backups in some of the lineages.

3. The non-transitory storage medium of claim 1, wherein the sample frequency for some lineages is different from the sample frequency of other lineages.

4. The non-transitory storage medium of claim 1, wherein the sample frequency is included in metadata associated with the backups.

5. The non-transitory storage medium of claim 1, further comprising selecting at least one of the lineages randomly rather than the associated sample frequency.

6. The non-transitory storage medium of claim 1, further comprising pruning based on a criticality of the selected lineages.

7. The non-transitory storage medium of claim 1, further comprising triggering dynamic scans based on metadata changes.

8. The non-transitory storage medium of claim 7, wherein metadata changes that trigger dynamic scans include a file size change greater than a threshold amount or an unexpected input/output (IO) pattern.

9. The non-transitory storage medium of claim 1, wherein the backup environment comprises a physical or virtual appliance that is accessed via an air gap.

10. A method for performing a corruption detection operation that includes read operations in a data protection system comprising: receiving the backups into the data protection system configured to store the backups, wherein the backups are associated with corresponding lineages, wherein each of the lineages is associated with a sample frequency, wherein the sample frequency for some lineages is different from the sample frequency of other lineages; selecting lineages subject to the corruption detection operation based on the sample frequencies of the lineages; performing the corruption detection operation on a most recent backup in each of the selected lineages according to the associated sample frequency; determining whether a logical size of the most recent backups is less than or equal to a capacity threshold of the backup environment, wherein the selected lineages are pruned when the logical size is greater than the capacity threshold based on a criticality; and triggering the corruption detection operation dynamically based on metadata changes.

11. The method of claim 10, further comprising pruning based on a criticality of the selected lineages, wherein pruning includes one or more of: skipping or delaying lineages with a lower sample frequency; and skipping or delaying lineages in a lower tier storage.

12. The method of claim 10, wherein the sample frequency does not apply to incremental backups, wherein the corruption detection operation is performed on a set of incremental backups.

13. A method for performing a corruption detection operation that includes read operations in a data protection system, comprising: receiving the backups into the data protection system configured to store the backups, wherein the backups are associated with corresponding lineages, wherein each of the lineages is associated with a sample frequency, wherein the sample frequency for some lineages is different from the sample frequency of other lineages; selecting lineages subject to the corruption detection operation based on the sample frequencies of the lineages; and performing the corruption detection operation on a most recent backup in each of the selected lineages according to the associated sample frequency.

14. The method of claim 13, wherein the backups comprise synthetic full backups and/or always full backups and wherein the corruption detection operation is not performed on some of the backups in some of the lineages.

15. The method of claim 13, wherein the sample frequency is included in metadata associated with the backups.

16. The method of claim 13, further comprising selecting at least one of the lineages randomly rather than the associated sample frequency.

17. The method of claim 13, further comprising triggering dynamic scans based on metadata changes.

18. The method of claim 17, wherein metadata changes that trigger dynamic scans include a file size change greater than a threshold amount or an unexpected input/output (IO) pattern.

19. The method of claim 13, wherein the backup environment comprises a physical or virtual appliance that is accessed via an air gap.
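
The capacity check, criticality-based pruning, and metadata-triggered scans recited in the claims can be sketched as follows. This is an illustrative reading under stated assumptions, not the patent's implementation: criticality is approximated by storage tier and then by sample frequency (per claim 11), sizes are plain integers, and the names `Candidate`, `prune_to_capacity`, `needs_dynamic_scan`, and the default `max_change_ratio` of 0.5 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    lineage: str
    logical_size: int  # logical size of the most recent backup
    sample_every: int  # assumption: scan 1 of every N backups (larger N = lower frequency)
    tier: int          # assumption: 0 = top storage tier, larger = lower tier


def prune_to_capacity(candidates, capacity_threshold):
    """Return (kept, deferred) so that the kept logical size fits the threshold.

    Nothing is pruned when the total already fits. Otherwise candidates are
    ranked by tier, then by sample frequency, and lower-ranked ones that do
    not fit are deferred (skipped or delayed).
    """
    total = sum(c.logical_size for c in candidates)
    if total <= capacity_threshold:
        return list(candidates), []
    ranked = sorted(candidates, key=lambda c: (c.tier, c.sample_every))
    kept, deferred, used = [], [], 0
    for c in ranked:
        if used + c.logical_size <= capacity_threshold:
            kept.append(c)
            used += c.logical_size
        else:
            deferred.append(c)
    return kept, deferred


def needs_dynamic_scan(prev_size, new_size, max_change_ratio=0.5,
                       io_pattern_unexpected=False):
    """Trigger a scan on a large relative size change or an unexpected IO pattern."""
    if io_pattern_unexpected:
        return True
    if prev_size == 0:
        return new_size > 0
    return abs(new_size - prev_size) / prev_size > max_change_ratio


# Hypothetical candidates: total size 1200 exceeds the threshold of 1000.
candidates = [
    Candidate("db", logical_size=400, sample_every=1, tier=0),
    Candidate("mail", logical_size=300, sample_every=3, tier=1),
    Candidate("files", logical_size=500, sample_every=5, tier=0),
]
kept, deferred = prune_to_capacity(candidates, capacity_threshold=1000)
```

In this example the two top-tier lineages are kept and the lower-tier "mail" lineage is deferred. A greedy fit is only one possible pruning policy; the claims do not prescribe a specific one.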

Assignees

Inventors

Classifications

  • by selection of backup contents

  • Using snapshots, i.e. a logical point-in-time copy of the data

  • Ensuring data consistency and integrity

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12079198B2 cover?
Corruption detection in backups is disclosed. Backups that are received into a backup environment are stored in corresponding lineages. A detection engine is configured to perform corruption detection operations on the most recent backups in each of the lineages based on a sample frequency. Corruption detection operations may also be performed randomly and based on unexpected or unusual changes…
Who is the assignee on this patent?
Dell Products L.P.
What technology area does this patent fall under?
Primary CPC classification G06F11/1451. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 3, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).