Optimized pipeline to boost de-dup system performance

US11809282B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11809282-B2
Application number: US-202017037336-A
Country: US
Kind code: B2
Filing date: Sep 29, 2020
Priority date: Sep 29, 2020
Publication date: Nov 7, 2023
Grant date: Nov 7, 2023

Abstract

Official abstract text for this publication.

A deduplication pipeline method that enables shorter overall latency, servicing of multiple calls in parallel, and a higher data compression ratio. The method includes receiving user data for storage, performing a deduplication operation on the data to obtain non-duplicative data, buffering the non-duplicative data in persistent memory, and accepting the next user data for deduplication processing. In parallel with receiving the next user data, a co-processor is operated to asynchronously compress the data stored in the persistent memory and store the compressed data in RAID.
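The flow in the abstract can be sketched as a small two-stage pipeline. This is a minimal illustration under loud assumptions, not the patented implementation: the class and method names (`DedupPipeline`, `ingest`, `drain`) are hypothetical, an in-memory queue stands in for persistent memory, a background thread running `zlib` stands in for the co-processor, and a plain list stands in for the RAID back end.

```python
import hashlib
import queue
import threading
import zlib


class DedupPipeline:
    """Toy model of the abstract's pipeline; every name here is hypothetical."""

    def __init__(self):
        self.seen = set()          # fingerprints of data already accepted
        self.pmem = queue.Queue()  # stand-in for the persistent-memory buffer
        self.raid = []             # stand-in for the RAID back end
        worker = threading.Thread(target=self._compress_loop, daemon=True)
        worker.start()

    def ingest(self, chunk: bytes) -> str:
        """Foreground path: filter, buffer uncompressed, then acknowledge."""
        fingerprint = hashlib.sha256(chunk).hexdigest()
        if fingerprint in self.seen:
            return "duplicate"     # nothing new to store
        self.seen.add(fingerprint)
        self.pmem.put((fingerprint, chunk))  # buffered in uncompressed form
        return "buffered"          # caller is free to send the next chunk

    def _compress_loop(self):
        """Background path: compresses buffered data asynchronously."""
        while True:
            fingerprint, chunk = self.pmem.get()
            self.raid.append((fingerprint, zlib.compress(chunk)))
            self.pmem.task_done()

    def drain(self):
        """Block until everything buffered so far has been compressed."""
        self.pmem.join()


pipeline = DedupPipeline()
chunks = (b"alpha" * 100, b"beta" * 100, b"alpha" * 100)
results = [pipeline.ingest(c) for c in chunks]
pipeline.drain()
print(results)             # one status per ingested chunk
print(len(pipeline.raid))  # number of compressed segments stored
```

The point of the sketch is that `ingest` returns as soon as the data is buffered, so compression never sits on the path that acknowledges the caller.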

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing deduplication backup operations, comprising: receiving, by a processor, user data for backup; processing, by the processor, protocol of the user data; performing, by the processor, deduplication filtering to identify duplication in the data; when the filtering identifies non-duplicated data to be stored, buffering the non-duplicated data in uncompressed format in a persistent memory (PMEM); upon completion of buffering the non-duplicated data in the uncompressed format in the PMEM, sending, by the processor, a confirmation to a protocol layer to notify the protocol layer to process next user data; and receiving, by the processor, the next user data for backup; and, while processing, by the processor, protocol and filtering of the next user data, compressing, by a co-processor, the non-duplicated data to be stored.

2. The method of claim 1, wherein buffering the non-duplicated data comprises storing the non-duplicated data in the uncompressed format.

3. The method of claim 2, wherein buffering the non-duplicated data comprises storing the non-duplicated data in the persistent memory (PMEM).

4. The method of claim 3, wherein storing the non-duplicated data in the persistent memory comprises writing the data using direct access.

5. The method of claim 1, wherein compressing the non-duplicated data comprises using the co-processor to perform the compression.

6. The method of claim 5, further comprising storing compressed data output by the co-processor in a disk array.

7. The method of claim 6, further comprising prior to storing the compressed data, packing and sealing the compressed data.

8. A system for deduplicating user data for storage, comprising: a processor executing a protocol layer and a filter layer; a persistent memory (PMEM) residing on memory bus and buffering uncompressed data from the filter layer in uncompressed format in the PMEM; upon completion of buffering the non-duplicated data in the uncompressed format in the PMEM, the processor sending a confirmation to the protocol layer to notify the protocol layer to process next user data; receiving the next user data for backup; and, while the processor processing protocol and filtering of the next user data, a co-processor compressing the non-duplicated data to be stored; a storage device storing compressed data from the co-processor.

9. The system of claim 8, wherein the co-processor compresses the uncompressed data asynchronously to the processor executing the protocol layer and the filter layer.

10. The system of claim 9, wherein the co-processor comprises QuickAssist Technology.

11. The system of claim 10, wherein the persistent memory buffers uncompressed data using direct access.

12. The system of claim 11, wherein the uncompressed data from the filter layer comprises non-duplicated data.

13. The system of claim 12, wherein the storage device comprises redundant array of independent disks (RAID).

14. The system of claim 13, further comprising a packing layer organizing the compressed data for storage in the RAID.

15. A computer program product comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein to be executed by one or more processors and one or more co-processors coupled to a persistent memory residing on memory bus, the program code including instructions to: receive, by a processor, user data for backup; operate the processor to process protocol of the user data; operate the processor to perform deduplication filtering to identify duplication in the user data; operate the persistent memory to buffer non-duplicated data obtained from the deduplication filtering in uncompressed format in a persistent memory (PMEM); upon completion of buffering the non-duplicated data in the uncompressed format in the PMEM, operate the processor to send a confirmation to a protocol layer to notify the protocol layer to process next user data; and receive, by the processor, the next user data for backup; and, while processing, by the processor, protocol and filtering of the next user data, compress, by a co-processor, the non-duplicated data to be stored.

16. The computer program product of claim 15, wherein the program code includes further instructions to buffer the non-duplicated data using direct access (DAX) mode.

17. The computer program product of claim 16, wherein the program code includes further instructions to store the compressed data in a storage device.

18. The computer program product of claim 17, wherein the storage device comprises redundant array of independent disks (RAID) and the program code includes further instructions to pack the compressed data for storage in the RAID.

19. A computer-implemented method for deduplicating user data for storage, comprising performing operations, the operations comprising: receiving, by a processor, a first transmission of user data for backup; extracting, by the processor, from the first transmission of user data non-duplicative data; buffering the non-duplicative data in uncompressed format in a persistent memory (PMEM); upon completing buffering the non-duplicative data in the uncompressed format in the PMEM, sending, by the processor, a confirmation to a protocol layer to notify the protocol layer to process next user data, and receiving, by the processor, the next user data for backup; and, while processing, by the processor, protocol and filtering of the next user data, compressing, by a co-processor, the non-duplicated data to be stored.

20. The computer-implemented method of claim 19, wherein the buffering is executed by direct write mode into persistent memory and the compressing is executed by the co-processor.
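
Several claims recite buffering through direct access (DAX). In general, direct access means the application writes into a memory-mapped region instead of issuing buffered file writes. The sketch below illustrates only that access pattern and rests on stated assumptions: an ordinary temporary file stands in for a file on a DAX-mounted persistent-memory device, and `BUFFER_SIZE` and `buffer_direct` are hypothetical names that do not come from the patent.

```python
import mmap
import os
import tempfile

BUFFER_SIZE = 4096  # hypothetical size of the buffer region


def buffer_direct(path: str, offset: int, data: bytes) -> None:
    """Copy data into a memory-mapped region, then flush it to the backing file."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), BUFFER_SIZE) as region:
            region[offset:offset + len(data)] = data  # store through the mapping
            region.flush()                            # push the change to the file


payload = b"non-duplicated segment"
with tempfile.TemporaryDirectory() as tmp:
    # Assumption: a regular file stands in for a file on a DAX-mounted device.
    path = os.path.join(tmp, "pmem_buffer.bin")
    with open(path, "wb") as f:
        f.truncate(BUFFER_SIZE)  # pre-size the backing file
    buffer_direct(path, 0, payload)
    with open(path, "rb") as f:
        stored = f.read(len(payload))
print(stored)
```

On an ordinary file the mapping still goes through the page cache, so this shows the programming model only, not the latency behavior of real persistent memory.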

Assignees

Inventors

Classifications

  • using de-duplication of the data · CPC title

  • Saving storage space on storage systems · CPC title

  • G06F3/0611 · Primary

    in relation to response time · CPC title

  • in relation to data integrity, e.g. data losses, bit errors · CPC title

  • De-duplication techniques · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11809282B2 cover?
A deduplication pipeline method to enable shorter overall latency, servicing of multiple calls in parallel, and implementing higher data compression ratio. The method includes receiving user data for storage, performing deduplication operation on the data to obtain non-duplicative data, buffering the non-duplicative data in persistent memory, and accepting next user data for deduplication proce…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/1453. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 7, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).