Pre-filter check for compressibility using stored compression factors to improve reads in a deduplication file system

US12306721B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12306721-B2
Application number  US-202318359427-A
Country             US
Kind code           B2
Filing date         Jul 26, 2023
Priority date       Jul 26, 2023
Publication date    May 20, 2025
Grant date          May 20, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Improving the performance of read operations in a restore path of a backup system by adaptively applying compression. The method defines an extent covering data segments for which compression ratio statistics are calculated, and calculates a respective compression ratio for each data segment in the defined extent. It then associates each unique compression ratio with a corresponding index value and stores each compression ratio and associated corresponding index value in an array. The array is appended as an extended file attribute to the data segments, and the indexed compression ratio is used by a backup server to determine whether or not to apply compression to the data segments in a restore path sending the data segments from the backup server.
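
The steps in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: zlib stands in for whatever compressor the backup system uses, the function names, the rounding precision and the 1.5 threshold are hypothetical, and the array is kept in memory rather than written as an extended file attribute.

```python
import os
import zlib


def compression_ratio(segment: bytes) -> float:
    """Uncompressed size divided by compressed size (zlib is a stand-in compressor)."""
    return len(segment) / len(zlib.compress(segment))


def build_ratio_array(segments, precision=1):
    """Map each unique (rounded) ratio in an extent to an index value.

    Returns (ratio_table, segment_indexes), where ratio_table[index] is a
    unique ratio and segment_indexes[i] is the index assigned to segment i.
    Rounding is an assumption made here so that near-identical ratios share
    one index.
    """
    ratio_table = []
    index_of = {}
    segment_indexes = []
    for seg in segments:
        ratio = round(compression_ratio(seg), precision)
        if ratio not in index_of:
            index_of[ratio] = len(ratio_table)
            ratio_table.append(ratio)
        segment_indexes.append(index_of[ratio])
    return ratio_table, segment_indexes


def should_compress(ratio: float, threshold: float = 1.5) -> bool:
    """Compress on the restore path only if the stored ratio exceeds the threshold."""
    return ratio > threshold


if __name__ == "__main__":
    # Two highly compressible segments and one random (incompressible) segment.
    extent = [b"a" * 4096, b"a" * 4096, os.urandom(4096)]
    table, indexes = build_ratio_array(extent)
    for i, idx in enumerate(indexes):
        print(i, table[idx], should_compress(table[idx]))
```

Because the ratios are computed once and stored, the server in this sketch can decide per segment without compressing the data a second time at read time.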

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of optimizing compression for reads in a restore path of a client-side inline deduplication file system, comprising: defining an extent covering data segments for which compression ratio statistics are calculated; calculating a respective compression ratio for each data segment in the defined extent, wherein the compression ratio of a data segment is calculated by dividing an uncompressed size of the data segment by a compressed size of the data segment; associating each unique compression ratio with a corresponding index value; storing each compression ratio and associated corresponding index value in an array; appending the array as extended file attribute to the data segments; using the indexed compression ratio by a backup server to determine whether or not to apply compression to the data segments in a restore path sending the data segments from the backup server, and wherein an index value is calculated by dividing an offset by a size of the extent minus 1; and applying, in a deduplication process of the backup server, the compression to the data segments to reduce an amount of data processed during the reads by the client-side deduplication file system.

2. The method of claim 1 further comprising comparing the compression ratio of the extent with a defined threshold, and applying the compression if the compression ratio exceeds the defined threshold.

3. The method of claim 1 wherein the read operation comprises a restore operation performed by the deduplication backup system executed by a data storage server running a Data Domain File System (DDFS).

4. The method of claim 3 further comprising deploying a Data Domain (DD) Boost file system (FS) interface (API) to access a DDBOOST library on the client hosting one or more applications generating the backup data and to perform segmentation and the reference calculating steps of a deduplication process of the DDFS, wherein the DDBOOST library is extended to the server to allow the server to access one or more functions of the DDFS, and further wherein the DDBOOST FS API presents a standard file system mount point to an application residing on the client, and wherein the application issues a read request to access a buffer in backup storage on the server.

5. The method of claim 1 wherein the compression is performed as part of an adaptive compression process comprising: first tracking file system server CPU usage for a defined period of time; first predicting future server CPU usage based on the tracked usage; first comparing the prediction to a first defined threshold; second tracking, if the predicted future server CPU usage is below the first threshold, client CPU usage for the defined period of time; second predicting future client CPU usage based on the second tracked usage; second comparing the prediction to a second threshold; and applying the adaptive compression to the data, if the predicted future server and client CPU usage both exceed their respective first and second thresholds, otherwise sending the data from the server to the client as non-compressed data.

6. The method of claim 5 wherein the predicted server CPU usage indicates whether or not the server CPU has sufficient resources to perform the compression, and the predicted client CPU usage indicates whether or not the client CPU has sufficient resources to decompress the compressed data sent from the server, both without causing system instability.

7. The method of claim 6 further comprising encoding the predicted client CPU usage as metadata appended to the read request for the second comparing, and wherein the CPU usage comprises a percentage amount of time that the CPU is performing non-idle work, and wherein the defined period of time is divided into a plurality of epochs, each comprising on the order of one second.

8. A computer-implemented method of improving read performance through adaptive compression in a deduplication file system of a data protection network having a backup server, comprising: sending a read request from an application in a client to a server over the network to access backup data stored in a storage target; determining a compressibility of data read by the server to decide whether or not to apply compression to the data in a restore path between the server and client by: calculating a compression ratio of data segments accessed by the read request, comparing the compression ratios to a defined threshold, wherein the compression ratio of a data segment is calculated by dividing an uncompressed size of a data segment by a compressed size of the data segment; performing the compression if the compression ratios exceed the threshold to improve performance of the client-side deduplication file system by reducing an amount of data processed by the data protection network; associating each unique compression ratio with a corresponding index value; and storing each compression ratio and associated corresponding index value in an array; appending the array to the data segments as the extended metadata attribute; defining an extent of data segments covered by the extended metadata attribute, wherein an index value is calculated by dividing an offset by a size of the extent minus 1; and applying, in a deduplication process of the backup server, the compression to the data segment to reduce an amount of data processed during read operations by the client-side deduplication file system.

9. The method of claim 8 further comprising encoding the compression ratio of the data segments is encoded as an extended metadata attribute associated with the data segments.

10. The method of claim 8 wherein the read operation comprises a restore operation performed by the deduplication backup system executed by a data storage server running a Data Domain File System (DDFS).

11. The method of claim 10 further comprising deploying a Data Domain (DD) Boost file system (FS) interface (API) to access a DDBOOST library on the client hosting one or more applications generating the backup data and to perform segmentation and the reference calculating steps of a deduplication process of the DDFS, wherein the DDBOOST library is extended to the server to allow the server to access one or more functions of the DDFS, and further wherein the DDBOOST FS API presents a standard file system mount point to an application residing on the client, and wherein the application issues a read request to access a buffer in backup storage on the server.

12. An apparatus for improving read performance through adaptive compression in a deduplication file system of a data protection network having a client coupled to a server, comprising: a backup server process defining an extent covering data segments for which compression ratio statistics are calculated; a calculator calculating a respective compression ratio for each data segment in the defined extent; a component associating each unique compression ratio with a corresponding index value, storing each compression ratio and associated corresponding index value in an array, and appending the array as extended file attribute to the data segments; a compressor using the indexed compression ratio by a backup server to determine whether or not to apply compression to the data segments in a restore path sending the data segments from the backup server, wherein the compressor compares the compression ratio of the extent with a defined threshold, and applying the compression if the compression ratio exceeds the defined threshold, and wherein the compression ratio of a data segment is calculated by dividing an uncompressed size of the data segment by a compre
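
The adaptive compression process described in claims 5 to 7 can be sketched roughly as below. This is an illustrative reading, not the claimed implementation. The class and function names, the moving-average predictor, the window size and the 70% limits are all hypothetical. The sketch also follows the reading suggested by claim 6, where compression is applied only when both CPUs are predicted to have sufficient spare resources, and it models that as predicted usage staying under a limit. Claim 5 itself words the final comparison as usage exceeding the thresholds, so the comparison direction used here is an assumption.

```python
from collections import deque


class CpuPredictor:
    """Tracks per-epoch CPU usage (percent of non-idle time) and predicts the next epoch.

    A plain moving average is used as a placeholder; the claims do not name
    a specific prediction method.
    """

    def __init__(self, window: int = 10):
        # Each sample represents one epoch (on the order of one second in claim 7).
        self.samples = deque(maxlen=window)

    def record(self, usage_percent: float) -> None:
        self.samples.append(usage_percent)

    def predict(self) -> float:
        if not self.samples:
            # No history yet: assume the CPU is busy, which disables compression.
            return 100.0
        return sum(self.samples) / len(self.samples)


def should_compress_on_restore(server: CpuPredictor,
                               client: CpuPredictor,
                               server_limit: float = 70.0,
                               client_limit: float = 70.0) -> bool:
    """Check the server first, and consult the client only if the server has headroom."""
    if server.predict() >= server_limit:
        return False  # server cannot spare CPU to compress: send uncompressed
    return client.predict() < client_limit  # client must be able to decompress


if __name__ == "__main__":
    server, client = CpuPredictor(window=3), CpuPredictor(window=3)
    for usage in (20, 30, 40):
        server.record(usage)
    for usage in (10, 10, 10):
        client.record(usage)
    print(should_compress_on_restore(server, client))
```

In the claims the client's predicted usage travels to the server as metadata appended to the read request; here both predictors simply live in one process to keep the example short.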

Assignees

Dell Products Lp

Inventors

Classifications

  • using compression, e.g. sparse files · CPC title

  • Backup restoration techniques · CPC title

  • Using snapshots, i.e. a logical point-in-time copy of the data · CPC title

  • using de-duplication of the data · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12306721B2 cover?
Improving the performance of read operations in a restore path of a backup system by adaptively applying compression. The method defines an extent covering data segments for which compression ratio statistics are calculated, and calculates a respective compression ratio for each data segment in the defined extent. It then associates each unique compression ratio with a corresponding index value…
Who is the assignee on this patent?
Dell Products Lp
What technology area does this patent fall under?
Primary CPC classification G06F16/1744. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 20, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).