Offloading client-side deduplication operations using a data processing unit

US2024256491A1 · US · A1

Patent metadata
Publication number: US-2024256491-A1
Application number: US-202318160148-A
Country: US
Kind code: A1
Filing date: Jan 26, 2023
Priority date: Jan 26, 2023
Publication date: Aug 1, 2024
Grant date: (none listed)

Abstract

Official abstract text for this publication.

Embodiments for performing the inline deduplication by filtering streaming data as it is received by a backup client through a backup server executing a backup process. A data processing unit (DPU) is deployed to offload certain processing operations performed by a central processing unit (CPU) of the backup client. An inline deduplication operation comprises file operations, data segmentation, segment fingerprinting, compression, and encryption prior to storage in a backup target. The DPU is deployed and configured to perform the compression and encryption steps, the entire inline deduplication stack, or the entire inline deduplication stack plus the file system operations.
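The stages named in the abstract (segmentation, fingerprinting, filtering, compression, encryption) can be sketched as follows. This is a minimal illustration, not the patented implementation or the DD Boost library: the function names, the toy rolling value used to pick segment boundaries, the choice of SHA-256, and the in-memory `known` set standing in for the backup server's fingerprint index are all assumptions. The `toy_encrypt` function is a placeholder keystream XOR that is not secure; a real system would use a vetted cipher such as AES.

```python
import hashlib
import zlib


def segment(data: bytes, mask: int = 0x3F, min_size: int = 32, max_size: int = 512):
    """Content-defined segmentation (toy): cut where a rolling value matches a bit pattern."""
    segments, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        # Rolling value depends only on the most recent bytes (older bits shift out).
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        length = i - start + 1
        if (length >= min_size and (rolling & mask) == mask) or length >= max_size:
            segments.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        segments.append(data[start:])  # trailing partial segment
    return segments


def fingerprint(seg: bytes) -> str:
    """Reference for a segment; SHA-256 is an assumed member of the SHA family."""
    return hashlib.sha256(seg).hexdigest()


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Placeholder cipher (NOT secure): XOR with a SHA-256 counter keystream.
    Applying it twice with the same key restores the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def backup(stream: bytes, known: set, key: bytes):
    """Return (recipe, payloads): the ordered fingerprints of the stream, and the
    compressed-then-encrypted bytes for segments the target has not seen yet."""
    recipe, payloads = [], {}
    for seg in segment(stream):
        fp = fingerprint(seg)
        recipe.append(fp)
        if fp not in known:  # filter: only unseen segments are processed further
            known.add(fp)
            payloads[fp] = toy_encrypt(key, zlib.compress(seg))
    return recipe, payloads
```

Running `backup` twice on the same data with the same `known` set yields an empty payload dictionary the second time, which is the point of filtering on the client: duplicate segments are never compressed, encrypted, or sent.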

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of optimizing client-side inline deduplication of backup data, comprising: performing the inline deduplication by filtering streaming data as it is received by a backup client through a backup server executing a backup process; performing, in a central processing unit (CPU) of the backup client, a segmentation process to determine where to break the streaming data into a plurality of segments; calculating, in the CPU, a reference for each segment of the plurality of segments; deploying a data processing unit (DPU) functionally coupled to the CPU to perform at least some of the processing performed by the CPU; compressing, in the DPU, each segment; and encrypting, in the DPU, each compressed segment.

2. The method of claim 1 further comprising storing the encrypted and compressed segments in a storage device coupled to the CPU and DPU.

3. The method of claim 1 wherein the reference comprises a fingerprint of data calculated using a secure hash algorithm (SHA).

4. The method of claim 1 wherein the backup process is executed by a data storage server running a Data Domain File System (DDFS).

5. The method of claim 1 wherein the DPU comprises a hardware compression and encryption accelerator component, and the CPU comprises data buffers, reference buffers, segment buffers executing a distributed segment processing send file loop.

6. The method of claim 5 wherein the backup client utilizes a Data Domain (DD) Boost application program interface (API) to access a DD Boost library to perform the segmentation and the reference calculating steps.

7. A computer-implemented method of optimizing client-side inline deduplication of backup data, comprising: performing the inline deduplication by filtering streaming data as it is received by a backup client through a backup server executing a backup process; deploying a data processing unit (DPU) functionally coupled to the CPU to perform at least some of the processing performed by the CPU; performing, in DPU, a segmentation process to determine where to break the streaming data into a plurality of segments; calculating, in the DPU, a reference for each segment of the plurality of segments; compressing, in the DPU, each segment; and encrypting, in the DPU, each compressed segment.

8. The method of claim 7 further comprising storing the encrypted and compressed segments in a storage device coupled to the CPU and DPU.

9. The method of claim 7 wherein the reference comprises a fingerprint of data calculated using a secure hash algorithm (SHA).

10. The method of claim 9 wherein the DPU comprises data buffers, reference buffers, segment buffers executing a distributed segment processing send file loop and further comprises a hardware compression and encryption accelerator component, and a checksum SHA hash accelerator component.

11. The method of claim 7 wherein the backup process is executed by a data storage server running a Data Domain File System (DDFS).

12. The method of claim 11 wherein the CPU comprises a DD protocol layer managing client resources and translating backend processing into application consumable application program interfaces (APIs).

13. The method of claim 12 wherein the DPU utilizes a Data Domain (DD) Boost application program interface (API) to access a DD Boost library to perform the segmentation and the reference calculating steps.

14. A computer-implemented method of optimizing client-side inline deduplication of backup data, comprising: deploying a Data Domain (DD) Boost file system (FS) interface (API) to access a DD Boost library on a host computer hosting one or more applications generating the backup data; deploying a data processing unit (DPU) functionally coupled to the CPU to perform at least some of the processing performed by the CPU; performing, in the DPU, one or more filesystem operations accessing the backup data through the DD Boost API; performing, in DPU, a segmentation process to determine where to break the streaming data into a plurality of segments; calculating, in the DPU, a reference for each segment of the plurality of segments; compressing, in the DPU, each segment; and encrypting, in the DPU, each compressed segment.

15. The method of claim 14 further comprising storing the encrypted and compressed segments in a storage device coupled to the host computer and DPU.

16. The method of claim 14 wherein the reference comprises a fingerprint of data calculated using a secure hash algorithm (SHA).

17. The method of claim 14 wherein the DPU comprises data buffers, reference buffers, segment buffers executing a distributed segment processing send file loop and further comprises a hardware compression and encryption accelerator component, and a checksum SHA hash accelerator component.

18. The method of claim 17 wherein the host further comprises a DD protocol layer managing client resources and translating backend processing into application consumable application program interfaces (APIs).

19. The method of claim 18 wherein the backup process is executed by a data storage server running a Data Domain File System (DDFS).

20. The method of claim 19 wherein the DPU further a Data Domain (DD) Boost application program interface (API) to access the DD Boost library and DD Boost FS to process the backup data of the application programs.
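The three independent claims (1, 7 and 14) differ mainly in which stages run on the DPU. The sketch below records that split as a small lookup table. The names `OffloadMode`, `STAGES` and `placement` are hypothetical, and any stage a claim does not assign to the DPU is assumed here to stay on the CPU; the claims themselves do not say so for every stage.

```python
from enum import Enum

# Stage names follow the abstract; the ordering is illustrative.
STAGES = ("filesystem", "segmentation", "fingerprinting", "compression", "encryption")


class OffloadMode(Enum):
    COMPRESS_ENCRYPT = "claim 1"      # CPU segments and fingerprints
    FULL_DEDUP_STACK = "claim 7"      # DPU runs the whole deduplication stack
    DEDUP_STACK_PLUS_FS = "claim 14"  # DPU also runs the filesystem operations


# Stages each mode places on the DPU.
_DPU_STAGES = {
    OffloadMode.COMPRESS_ENCRYPT: {"compression", "encryption"},
    OffloadMode.FULL_DEDUP_STACK: {"segmentation", "fingerprinting", "compression", "encryption"},
    OffloadMode.DEDUP_STACK_PLUS_FS: set(STAGES),
}


def placement(mode: OffloadMode) -> dict:
    """Map each stage to 'DPU' or 'CPU' for the given offload mode.
    Assumption: stages not offloaded remain on the CPU."""
    return {stage: ("DPU" if stage in _DPU_STAGES[mode] else "CPU") for stage in STAGES}
```

For example, `placement(OffloadMode.COMPRESS_ENCRYPT)` keeps segmentation and fingerprinting on the CPU, matching the wording of claim 1, while `placement(OffloadMode.DEDUP_STACK_PLUS_FS)` assigns every stage to the DPU.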

Assignees

Dell Products Lp

Inventors

Classifications

  • Hardware arrangements for backup

  • for networked environments

  • using de-duplication of the data

  • based on file chunks

  • G06F16/137 (Primary)

    Hash-based (content-based indexing of textual data G06F16/31)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024256491A1 cover?
Embodiments for performing the inline deduplication by filtering streaming data as it is received by a backup client through a backup server executing a backup process. A data processing unit (DPU) is deployed to offload certain processing operations performed by a central processing unit (CPU) of the backup client. An inline deduplication operation comprises file operations, data segmentation,…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F16/1752. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 1, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).