US9401967B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9401967-B2 |
| Application number | US-79703210-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 9, 2010 |
| Priority date | Jun 9, 2010 |
| Publication date | Jul 26, 2016 |
| Grant date | Jul 26, 2016 |
Official abstract text for this publication.
Systems for performing inline wire speed data deduplication are described herein. Some embodiments include a device for inline data deduplication that includes one or more input ports for receiving an input data stream containing duplicates, one or more output ports for providing a data deduplicated output data stream, and an inline data deduplication engine coupled to one or more input ports and one or more output ports to process input data containing duplicates into output data which is data deduplicated, where the inline data deduplication engine has an inline data deduplication bandwidth of at least 4 Gigabytes per second.
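As a rough software illustration of what turning an input stream containing duplicates into a deduplicated output stream can mean, the following minimal sketch splits a byte string into chunks and replaces repeated chunks with references. It is not the patented hardware and says nothing about the 4 Gigabytes per second bandwidth. The fixed chunk size, the use of SHA-256 as the chunk identifier, the `("data", ...)` / `("ref", ...)` output format, and the function names `deduplicate` and `reconstruct` are all assumptions made for this example.

```python
import hashlib

CHUNK_SIZE = 8  # hypothetical chunk size, kept tiny so duplicates are easy to see


def deduplicate(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return a list of ("data", chunk) or ("ref", index) records.

    A "ref" record points at the index of an earlier unique chunk
    instead of repeating its bytes.
    """
    seen = {}          # chunk identifier -> index of the unique chunk
    unique_count = 0   # number of unique chunks emitted so far
    output = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        chunk_id = hashlib.sha256(chunk).digest()  # assumed identifier function
        if chunk_id in seen:
            output.append(("ref", seen[chunk_id]))
        else:
            seen[chunk_id] = unique_count
            unique_count += 1
            output.append(("data", chunk))
    return output


def reconstruct(records):
    """Rebuild the original byte string from deduplicated records."""
    store = []   # unique chunks in the order they were first seen
    parts = []
    for kind, value in records:
        if kind == "data":
            store.append(value)
            parts.append(value)
        else:
            parts.append(store[value])
    return b"".join(parts)
```

For an input made of the same 8-byte block repeated three times followed by one different block, `deduplicate` emits two `"data"` records and two `"ref"` records, and `reconstruct` restores the original bytes.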
Opening claim text (preview).
What is claimed is:

1. A device for inline data deduplication, comprising: one or more input ports for receiving an input data stream containing duplicates; one or more output ports for providing a data deduplicated output data stream; and an inline data deduplication engine coupled to said one or more input ports and said one or more output ports to process input data containing duplicates into output data which is data deduplicated, said inline data deduplication engine having an inline data deduplication bandwidth of at least 4 Gigabytes per second, wherein said inline data deduplication engine comprises: frame memory comprising at least some of the received input data stream and at least some output data provided for inclusion in the output data stream; chunking logic for subdividing input data extracted from the input data stream into input data chunks; chunk identifier logic for generating a chunk identifier for each of the input data chunks based at least in part upon data within the input data chunk, wherein each chunk identifier is uniquely associated with a particular sequence of chunk data; and one or more data compression engines each comprising: a plurality of hash memories each associated with a different lane of a plurality of lanes, and each lane comprising data bytes from at least one of the input data chunks; an array comprising array elements each comprising a plurality of validity bits, wherein each validity bit within an array element corresponds to a different lane of the plurality of lanes; control logic, coupled to the plurality of hash memories and the array, that initiates a read of a hash memory entry if a corresponding validity bit indicates that said entry is valid; and an encoder, coupled to the plurality of hash memories and the control logic, that compresses at least the data bytes for the lane associated with the hash memory comprising the valid entry if said valid entry comprises data that matches the lane data bytes; wherein the one or more data compression engines each operates at least at a rate that is the lower of the bandwidth of an input port of the one or more input ports from which uncompressed data is received and the bandwidth of an output port of the one or more output ports to which compressed data is directed.

2. A device for inline data deduplication, comprising: one or more input ports for receiving an input data stream containing duplicates; one or more output ports for providing a data deduplicated output data stream; and an inline data deduplication engine coupled to said one or more input ports and said one or more output ports to process input data containing duplicates into output data which is data deduplicated, said inline data deduplication engine having an inline data deduplication bandwidth of at least 4 Gigabytes per second, wherein said inline data deduplication engine comprises: frame memory comprising at least some of the received input data stream and at least some output data provided for inclusion in the output data stream; chunking logic for subdividing input data extracted from the input data stream into input data chunks; chunk identifier logic for generating a chunk identifier for each of the input data chunks based at least in part upon data within the input data chunk, wherein each chunk identifier is uniquely associated with a particular sequence of chunk data; Bloom filter logic for identifying as non-matching data chunks at least some input data chunks that do not match any previously processed data chunks already provided as part of the output data stream; Bloom filter array memory for storing Bloom filter status bits; and processing logic for identifying non-matching data chunks not already identified by the Bloom filter, and for controlling the inclusion within the output data stream of the non-matching data chunks identified by the Bloom filter and the processing logic; wherein the identification of non-matching data chunks by the Bloom filter and the processing logic is based at least in part on the chunk identifier.

3. The device of claim 2, wherein said inline data deduplication engine further comprises a Bloom filter cache memory comprising at least some of the Bloom filter status bits most recently accessed by the Bloom filter logic; and wherein if a first input/output (I/O) operation to access a first Bloom filter status bit stored within the Bloom filter cache memory is followed by a second I/O operation to access the same first Bloom filter status bit or to access a second Bloom filter status bit stored within the Bloom filter cache memory, the second I/O operation will not be held off pending completion of the first I/O operation.

4. A device for inline data deduplication, comprising: one or more input ports for receiving an input data stream containing duplicates; one or more output ports for providing a data deduplicated output data stream; and an inline data deduplication engine coupled to said one or more input ports and said one or more output ports to process input data containing duplicates into output data which is data deduplicated, said inline data deduplication engine having an inline data deduplication bandwidth of at least 4 Gigabytes per second, wherein said inline data deduplication engine comprises: frame memory comprising at least some of the received input data stream and at least some output data provided for inclusion in the output data stream; chunking logic for subdividing input data extracted from the input data stream into input data chunks; chunk identifier logic for generating a chunk identifier for each of the input data chunks based at least in part upon data within the input data chunk, wherein each chunk identifier is uniquely associated with a particular sequence of chunk data; a content addressable storage (CAS) hash index table, at least part of the chunk identifier being used as an index to locate a pointer within the CAS hash index table; and wherein the pointer, if valid, points to groups of one or more CAS entries corresponding to the index, each of the one or more CAS entries comprising a second pointer to a metadata record describing a non-matching data chunk that does not match any previously processed data chunks already provided as part of the output data stream, and further comprising any remaining chunk identifier bits not used as the index.

5. The device of claim 4, wherein a matching input data chunk is identified if a CAS entry is found that corresponds to an index derived from the chunk identifier of the matching input data chunk, and that includes remaining chunk identifier bits that match the corresponding remaining chunk identifier bits of the matching input data chunk.

6. The device of claim 4, wherein said inline data deduplication engine further comprises CAS cache memory; and wherein at least some of the one or more CAS entries most recently accessed by said inline data deduplication engine are stored within the CAS cache memory.

7. The device of claim 6, wherein a collection of adjacent groups of CAS entries are read into the CAS cache memory; and wherein at least some of the CAS entries read into the CAS cache memory describe related non-matching data chunks.

8. The device of claim 4, wherein said inline data deduplication engine further comprises metadata cache memory; wherein at least some metadata records most recently accessed by said inline data deduplication engine are stored in the metadata cache as part of one or more metadata pages; and wherein at least some metadata records within one of the one or more metadata pages describe related non-matching data chunks.

9. A data deduplication method performed by an inline dedupli
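The Bloom filter pre-check recited in claim 2 and the CAS hash index lookup recited in claims 4 and 5 can be sketched in software as follows. This is a hedged illustration under stated assumptions, not the claimed hardware: SHA-256 as the chunk identifier, the 16-bit index width, the Bloom filter size and number of probes, the flattening of the claimed pointer indirection into a Python dictionary, and every name (`BloomFilter`, `CasIndex`, `is_duplicate`, the `{"length": ...}` metadata record) are hypothetical choices made for the example.

```python
import hashlib

INDEX_BITS = 16        # hypothetical: low bits of the chunk ID used as the table index
BLOOM_BITS = 1 << 16   # hypothetical Bloom filter size, in bits
BLOOM_HASHES = 3       # hypothetical number of probe positions per chunk ID


def chunk_id(chunk: bytes) -> int:
    """Chunk identifier derived from the chunk data (SHA-256 is an assumption)."""
    return int.from_bytes(hashlib.sha256(chunk).digest(), "big")


class BloomFilter:
    def __init__(self, size_bits: int = BLOOM_BITS, num_hashes: int = BLOOM_HASHES):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)  # the status bits

    def _positions(self, cid: int):
        # Use disjoint 32-bit slices of the 256-bit ID as probe positions.
        for i in range(self.num_hashes):
            yield ((cid >> (32 * i)) & 0xFFFFFFFF) % self.size_bits

    def add(self, cid: int) -> None:
        for p in self._positions(cid):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, cid: int) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(cid))


class CasIndex:
    """Index bits select a group of entries; each entry stores the remaining ID bits."""

    def __init__(self, index_bits: int = INDEX_BITS):
        self.index_bits = index_bits
        self.table = {}  # index -> list of (remaining_bits, metadata_record)

    def _split(self, cid: int):
        return cid & ((1 << self.index_bits) - 1), cid >> self.index_bits

    def lookup(self, cid: int):
        index, remaining = self._split(cid)
        for entry_remaining, metadata in self.table.get(index, []):
            if entry_remaining == remaining:  # remaining bits match: same chunk
                return metadata
        return None

    def insert(self, cid: int, metadata: dict) -> None:
        index, remaining = self._split(cid)
        self.table.setdefault(index, []).append((remaining, metadata))


def is_duplicate(chunk: bytes, bloom: BloomFilter, cas: CasIndex) -> bool:
    """Return True if the chunk matches a previously recorded chunk."""
    cid = chunk_id(chunk)
    # A Bloom filter miss proves the chunk is new; a hit must be confirmed
    # in the index because Bloom filters can report false positives.
    if bloom.might_contain(cid) and cas.lookup(cid) is not None:
        return True
    bloom.add(cid)
    cas.insert(cid, {"length": len(chunk)})  # hypothetical metadata record
    return False
```

In this sketch the Bloom filter cheaply rules out most new chunks, and the index lookup handles the remaining cases, including Bloom filter false positives. Storing only the remaining identifier bits in each entry avoids repeating the bits already implied by the entry's position in the table.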
Aggregation; Duplicate elimination · CPC title
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
using compression, e.g. sparse files · CPC title
Data stream processing; Continuous queries · CPC title
De-duplication techniques · CPC title