Server-side inline generation of virtual synthetic backups using group fingerprints

US12222821B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12222821-B2
Application number  US-202318365456-A
Country             US
Kind code           B2
Filing date         Aug 4, 2023
Priority date       Jul 26, 2022
Publication date    Feb 11, 2025
Grant date          Feb 11, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Making inline deduplicated backups of protected data using group fingerprints resident in a storage server by generating group fingerprints on a storage server for a backup client that is not capable of using group fingerprints, from individual fingerprints generated for each segment of protected data divided into variable size segments and then grouped together. Each fingerprint comprises a signature for a respective data segment. The method further maintains the group fingerprints for files resident on the storage server, compares, in the storage server, respective group fingerprints for these files with a new backup file to be backed up from the backup client to determine duplicated data between these files, and converts the new backup file to a virtual synthetic backup during a backup time of the new file.
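
The grouping and comparison steps described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. The abstract does not name a hash function or a grouping rule, so SHA-256, the fixed `GROUP_SIZE`, and the function names `segment_fingerprint`, `group_fingerprints`, and `duplicated_groups` are all assumptions made for illustration. The variable-size segmentation step is skipped: segments are taken as already-divided input.

```python
import hashlib
from typing import List, Set

# Hypothetical grouping rule: a fixed number of consecutive segment
# fingerprints per group. The patent only refers to a grouping step.
GROUP_SIZE = 4


def segment_fingerprint(segment: bytes) -> bytes:
    """Signature for one data segment (SHA-256 is an assumed choice)."""
    return hashlib.sha256(segment).digest()


def group_fingerprints(segments: List[bytes], group_size: int = GROUP_SIZE) -> List[bytes]:
    """Hash each run of consecutive segment fingerprints into one group fingerprint."""
    fps = [segment_fingerprint(s) for s in segments]
    groups = []
    for i in range(0, len(fps), group_size):
        groups.append(hashlib.sha256(b"".join(fps[i:i + group_size])).digest())
    return groups


def duplicated_groups(existing: List[bytes], new: List[bytes]) -> List[int]:
    """Indices of group fingerprints in `new` that the server already holds."""
    known: Set[bytes] = set(existing)
    return [i for i, g in enumerate(new) if g in known]


# Segments are supplied pre-divided; a real system would derive them
# from the protected data with variable-size segmentation.
base = [b"seg-%d" % i for i in range(8)]
new = base[:4] + [b"changed"] + base[5:]
print(duplicated_groups(group_fingerprints(base), group_fingerprints(new)))  # [0]
```

In this toy run the first group of four segments is unchanged and is reported as a duplicate, while the second group contains one modified segment and therefore no longer matches.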

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of making inline deduplicated backups of protected data using group fingerprints resident in a storage server, the method comprising: generating group fingerprints on a storage server and for a backup client that is not capable of generating group fingerprints, from individual fingerprints generated for each segment of protected data divided into variable size segments and then grouped together, wherein each fingerprint comprises a signature for a respective data segment; maintaining the group fingerprints for files resident on the storage server, wherein the storage server is part of a deduplicated backup system comprising a Data Domain file system (DDFS) and a Data Domain Bandwidth Optimized Open Storage Technology (DDBoost) library that links with the application to reduce bandwidth required for data ingests, and which translates application read and write request to DDBoost application program interfaces (APIs); comparing, in the storage server, respective group fingerprints for backup files already resident on the storage server with a new backup file to be backed up from the backup client to determine duplicated data between a new file and those already residing on the system; comparing, in the storage server, respective group fingerprints for existing backup files resident on the server with new backup file to be backed up from the backup client to determine duplicated data between these files; and creating the new backup file using virtual synthetics at the time of backup.

2. The method of claim 1 further comprising: automatically generating a recipe for the new file based on the comparison indicating what data in the exiting files duplicates with the new one, wherein the recipe comprises a specific sequence of steps used to generate the new backup file; and replaying the recipe during the replication process of the deduplicated backup system.

3. The method of claim 2 further comprising: generating, for new segments to be backed up, new group fingerprints; determining if any new group fingerprints match the stored group fingerprints; and making, if there is a match resulting in matching fingerprints, a new backup dataset out of segments corresponding to the matching fingerprints, otherwise, making a backup using a per-segment deduplication process for segments corresponding to fingerprints that do not match.

4. The method of claim 3 further comprising storing the new group fingerprints on the storage server for use in a subsequent comparison operation for a next incremental of the second backup.

5. The method of claim 3 wherein the virtual synthetic backup is made by combining data from a current backup using previous backup data already stored on the server, and using Change Block Tracking (CBT) to determine data that has changed between the previous and current backup.

6. The method of claim 5 wherein the recipe is used to generate data of the backup file, and is replayed by replication logic of the deduplicated backup system to create a duplicate backup file on a secondary deduplicated backup system.

7. The method of claim 6 wherein the recipe comprises virtual copy commands along with offset and length information of newly written data.

8. The method of claim 7 wherein the recipe comprises segments corresponding to group fingerprints present in the previous backup data that are added to group fingerprints of the current backup data.

9. The method of claim 8 wherein the recipe has a format comprising: Current File=offset:Basefile offset+length, offset:Basefile offset+length, offset: Basefile+length, and wherein the duplicate offset regions map to duplicate GFPs identified.

10. The method of claim 1 wherein the signature for each respective data segment is generated using a cryptographic hash function, and wherein the fingerprints are stored in a L0 to L6 layered segment tree, and further wherein the group fingerprints are grouped using a defined grouping algorithm, the method further comprising: obtaining a hint from a backup client working together with the server to use the hint to identify a file and it's set of group fingerprints to use for the comparing; receiving the hint in the server; and fetching group fingerprints from the server based on the hint.

11. The method of claim 10 wherein the hint constitutes an insight into workflow of the client and the server, and comprises at least one of: backup location information, a filename and path of a previous backup, or other identifying information about one or more previous backups.

12. A system making backups of protected data from a backup client for storage through a storage server method making inline deduplicated backups of protected data using group fingerprints resident in a storage server, comprising: a storage server component generating group fingerprints for a backup client that is not capable of using group fingerprints, from individual fingerprints generated for each segment of protected data divided into variable size segments and then grouped together, wherein each fingerprint comprises a signature for a respective data segment, and maintaining the group fingerprints for files resident on the storage server; a comparator component of the storage server comparing respective group fingerprints for files existing on the storage server and a new file to be backed up from the backup client to determine duplicated data between the new file and existing files; a converter component converting the new backup file to a virtual synthetic backup during a backup time of the new; and a further storage server component automatically generating a recipe for the new file based on the comparison indicating what data in the exiting files duplicates with the new one, wherein the recipe comprises a specific sequence of steps used to generate the new backup file; and replaying the recipe during a replication process of a deduplicated backup system comprising a Data Domain file system (DDFS) and a Data Domain Bandwidth Optimized Open Storage Technology (DDBoost) library that links with the application to reduce bandwidth required for data ingests, and which translates application read and write request to DDBoost application program interfaces (APIs).

13. The system of claim 12 wherein the backup data comprises data formed by a full backup followed by one or more incremental, and wherein the storage server component further generates, for new segments to be backed up, new group fingerprints, determines if any new group fingerprints match the stored group fingerprints, and makes, if there is a match resulting in matching fingerprints, a new backup dataset out of segments corresponding to the matching fingerprints, otherwise, makes a backup using a per-segment deduplication process for segments corresponding to fingerprints that do not match, and further stores the new group fingerprints on the storage server for use in a subsequent comparison operation for a next incremental backup.

14. The system of claim 13 wherein the virtual synthetic backup is made by combining data from a current backup using previous backup data already stored on the server, and using Change Block Tracking (CBT) to determine data that has changed between the previous and current backup.

15. The system of claim 14 wherein the recipe is used to generate data of the backup file, and is replayed by replication logic of the deduplicated backup system to create a duplicate backup file on a secondary deduplicated backup system, and wherein the recipe comprises virtual copy commands along with offset and length i
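
To make the recipe idea in the claims more concrete, here is a minimal sketch of how a recipe string in the claimed "Current File=offset:Basefile offset+length" shape might be assembled from matching group fingerprints. It is an illustration under stated assumptions, not the claimed implementation: the `Group` record, the `build_recipe` function, and the reading of each entry as "current-file offset : base-file offset + length" are hypothetical. Groups without a match are returned separately, standing in for the per-segment deduplication path mentioned in the claims.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Group:
    """Hypothetical record for one group of segments in a file."""
    gfp: str     # group fingerprint (short string here for readability)
    offset: int  # byte offset of the group within its file
    length: int  # byte length of the group


def build_recipe(base: List[Group], current: List[Group]) -> Tuple[str, List[Group]]:
    """Return (recipe string, groups that still need per-segment deduplication)."""
    base_by_gfp: Dict[str, Group] = {g.gfp: g for g in base}
    copies: List[str] = []
    unmatched: List[Group] = []
    for g in current:
        hit = base_by_gfp.get(g.gfp)
        if hit is not None and hit.length == g.length:
            # Duplicate region: express it as a virtual copy from the base file.
            copies.append(f"{g.offset}:Basefile {hit.offset}+{g.length}")
        else:
            # New data: left for the ordinary per-segment path.
            unmatched.append(g)
    return "Current File=" + ", ".join(copies), unmatched


base = [Group("a", 0, 100), Group("b", 100, 50), Group("c", 150, 80)]
current = [Group("a", 0, 100), Group("x", 100, 30), Group("c", 130, 80)]
recipe, new_data = build_recipe(base, current)
print(recipe)    # Current File=0:Basefile 0+100, 130:Basefile 150+80
print(new_data)  # [Group(gfp='x', offset=100, length=30)]
```

Because the recipe only names offsets and lengths in a base file, a replication target that already holds the base file could in principle replay it without receiving the duplicated data again, which is the role the claims assign to the replication logic.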

Assignees

Dell Products Lp

Inventors

Classifications

  • using de-duplication of the data · CPC title

  • using file system or storage system metadata · CPC title

  • for networked environments · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12222821B2 cover?
Making inline deduplicated backups of protected data using group fingerprints resident in a storage server by generating group fingerprints on a storage server for a backup client that is not capable of using group fingerprints, from individual fingerprints generated for each segment of protected data divided into variable size segments and then grouped together. Each fingerprint comprises a si…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F11/1453. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 11, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).