Inline wire speed deduplication system

US10417233B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10417233-B2
Application number: US-201615194331-A
Country: US
Kind code: B2
Filing date: Jun 27, 2016
Priority date: Jun 9, 2010
Publication date: Sep 17, 2019
Grant date: Sep 17, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems for performing inline wire speed data deduplication are described herein. Some embodiments include a device for inline data deduplication that includes one or more input ports for receiving an input data stream containing duplicates, one or more output ports for providing a data deduplicated output data stream, and an inline data deduplication engine coupled to said one or more input ports and said one or more output ports to process input data containing duplicates into output data which is data deduplicated, said inline data deduplication engine having an inline data deduplication bandwidth of at least 4 Gigabytes per second.

First claim

Opening claim text (preview).

What is claimed is:

1. A device for inline data deduplication, comprising: one or more input ports for receiving an input data stream containing duplicates; one or more output ports for providing a data deduplicated output data stream; and an inline data deduplication engine coupled to said one or more input ports and said one or more output ports to process input data containing duplicates into output data which is data deduplicated, said inline data deduplication engine having an inline data deduplication bandwidth of at least 4 Gigabytes per second.

2. The device of claim 1, wherein said inline data deduplication engine: extracts input data from the input data stream; subdivides the extracted input data into input data chunks, the input data comprising duplicate input data chunks; identifies input data chunks that do not match any previously processed data chunks already provided to the destination device; and provides non-matching input data chunks to an output port of the one or more output ports as deduplicated data for inclusion in the output data stream.

3. The device of claim 1, wherein said inline data deduplication engine comprises: frame memory comprising at least some of the received input data stream and at least some output data provided for inclusion in the output data stream; chunking logic for subdividing input data extracted from the input data stream into input data chunks; and chunk identifier logic for generating a chunk identifier for each of the input data chunks based at least in part upon data within the input data chunk; wherein each chunk identifier is uniquely associated with a particular sequence of chunk data.

4. The device of claim 3, wherein the input data chunks are variable in size, and wherein the size of each input data chunk depends upon the content of the extracted input data.

5. The device of claim 3, wherein said chunking logic processes the input data stream into input data chunks at a data rate of at least 400 Megabytes per second.

6. A data deduplication method performed by an inline deduplication engine, the method comprising: receiving an input data stream containing duplicates; providing a data deduplicated output data stream; and processing input data containing duplicates into output data which is data deduplicated, said processing being performed at a rate of at least 4 Gigabytes per second.

7. The method of claim 6, further comprising: extracting input data from the input data stream; subdividing the extracted input data into input data chunks, the input data comprising duplicate input data chunks; identifying input data chunks that do not match any previously processed data chunks already provided to the destination device; and providing non-matching input data chunks to an output port of the one or more output ports as deduplicated data for inclusion in the output data stream.

8. The method of claim 6, further comprising: subdividing input data extracted from an input data stream into input data chunks; and generating a chunk identifier for each of the input data chunks based at least in part upon data within the input data chunk, each chunk identifier uniquely associated with a particular sequence of chunk data.

9. The method of claim 8, wherein the input data chunks are variable in size, and wherein the size of each input data chunk depends upon the content of the extracted input data.

10. The method of claim 8, wherein said subdividing the extracted input data is performed at a rate of at least 4 Megabytes per second.

11. A device for inline data deduplication, comprising: one or more input ports for receiving an input data stream containing duplicates; one or more output ports for providing a data deduplicated output data stream; and an inline data deduplication engine coupled to said one or more input ports and said one or more output ports to process input data containing duplicates into output data which is data deduplicated, said inline data deduplication engine having an inline data deduplication bandwidth of at least 400 Megabytes per second per input port.

12. The device of claim 11, wherein said inline data deduplication engine: extracts input data from the input data stream; subdivides the extracted input data into input data chunks, the input data comprising duplicate input data chunks; identifies input data chunks that do not match any previously processed data chunks already provided to the destination device; and provides non-matching input data chunks to an output port of the one or more output ports as deduplicated data for inclusion in the output data stream.

13. The device of claim 11, wherein said inline data deduplication engine comprises: frame memory comprising at least some of the received input data stream and at least some output data provided for inclusion in the output data stream; chunking logic for subdividing input data extracted from the input data stream into input data chunks; and chunk identifier logic for generating a chunk identifier for each of the input data chunks based at least in part upon data within the input data chunk; wherein each chunk identifier is uniquely associated with a particular sequence of chunk data.

14. The device of claim 13, wherein the input data chunks are variable in size, and wherein the size of each input data chunk depends upon the content of the extracted input data.

15. The device of claim 13, wherein said chunking logic processes the input data stream into input data chunks at a data rate of at least 400 Megabytes per second.

16. A data deduplication method performed by an inline data deduplication engine, the method comprising: receiving an input data stream containing duplicates; providing a data deduplicated output data stream; and processing input data containing duplicates into output data which is data deduplicated, said processing being performed at a rate of at least 400 Megabytes per second per input port of the inline data deduplication engine.

17. The method of claim 16, further comprising: extracting input data from the input data stream; subdividing the extracted input data into input data chunks, the input data comprising duplicate input data chunks; identifying input data chunks that do not match any previously processed data chunks already provided to the destination device; and providing non-matching input data chunks to an output port of the one or more output ports as deduplicated data for inclusion in the output data stream.

18. The method of claim 16, further comprising: subdividing input data extracted from an input data stream into input data chunks; and generating a chunk identifier for each of the input data chunks based at least in part upon data within the input data chunk, each chunk identifier uniquely associated with a particular sequence of chunk data.

19. The method of claim 18, wherein the input data chunks are variable in size, and wherein the size of each input data chunk depends upon the content of the extracted input data.

20. The method of claim 18, wherein said subdividing the extracted input data is performed at a rate of at least 4 Megabytes per second.
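
Illustrative sketch (not part of the patent text). The steps recited in claims 2 to 4 (subdivide the input into content-dependent, variable-size chunks, generate an identifier per chunk, and forward only chunks not seen before) can be approximated in software as shown below. Everything specific here is an assumption made for illustration: the gear-style rolling hash, the SHA-256 identifier, the size limits, and the names `split_into_chunks` and `deduplicate` do not come from the patent, which claims a hardware engine and does not disclose these choices in the text shown on this page. A software sketch like this also says nothing about the claimed throughput figures.

```python
import hashlib
from typing import List, Set, Tuple

# Hypothetical parameters chosen only for this sketch.
MIN_CHUNK = 32           # do not cut a chunk shorter than this
MAX_CHUNK = 256          # force a cut at this size
BOUNDARY_MASK = 0xFC000000  # cut when the top 6 hash bits are all zero

# Deterministic per-byte table for the rolling hash (assumed technique).
GEAR = [
    int.from_bytes(hashlib.sha256(bytes([i])).digest()[:4], "big")
    for i in range(256)
]


def split_into_chunks(data: bytes) -> List[bytes]:
    """Split data into variable-size chunks whose boundaries depend on content."""
    chunks: List[bytes] = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Rolling hash: older bytes shift out of the 32-bit window over time.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        at_boundary = size >= MIN_CHUNK and (h & BOUNDARY_MASK) == 0
        if at_boundary or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def deduplicate(data: bytes, seen: Set[bytes]) -> Tuple[List[bytes], int]:
    """Return the chunks not seen before and the number of duplicates dropped.

    `seen` holds identifiers of chunks already forwarded; it is updated in place.
    SHA-256 stands in for the chunk identifier (an assumption of this sketch).
    """
    new_chunks: List[bytes] = []
    duplicates = 0
    for chunk in split_into_chunks(data):
        chunk_id = hashlib.sha256(chunk).digest()
        if chunk_id in seen:
            duplicates += 1
        else:
            seen.add(chunk_id)
            new_chunks.append(chunk)
    return new_chunks, duplicates
```

Calling `deduplicate(payload, seen)` twice with the same `seen` set and the same payload forwards nothing on the second call, because every chunk identifier is already recorded. A complete system would also need some way to tell the receiver which earlier chunk each dropped duplicate refers to; that part is outside the claim text quoted above and is omitted here.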

Assignees

Inventors

Classifications

  • De-duplication techniques · CPC title

  • Data stream processing; Continuous queries · CPC title

  • Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title

  • Aggregation; Duplicate elimination · CPC title

  • using compression, e.g. sparse files · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10417233B2 cover?
Systems for performing inline wire speed data deduplication are described herein. Some embodiments include a device for inline data deduplication that includes one or more input ports for receiving an input data stream containing duplicates, one or more output ports for providing a data deduplicated output data stream, and an inline data deduplication engine coupled to said one or more input po…
Who is the assignee on this patent?
Avago Tech Int Sales Pte Ltd
What technology area does this patent fall under?
Primary CPC classification G06F16/24556. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 17, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).