Scalable High-Bandwidth Architecture for Lossless Compression
US-2016285473-A1 · Sep 29, 2016 · US
US2016292201A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016292201-A1 |
| Application number | US-201514672630-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 30, 2015 |
| Priority date | Mar 30, 2015 |
| Publication date | Oct 6, 2016 |
| Grant date | — |
Official abstract text for this publication.
Techniques are provided for data filtering using hardware accelerators. An apparatus comprises a processor, a memory and a plurality of hardware accelerators. The processor is configured to stream data from the memory to a first one of the hardware accelerators and to receive filtered data from a second one of the hardware accelerators. The plurality of hardware accelerators are configured to filter the streamed data utilizing at least one bit vector partitioned across the plurality of hardware accelerators. The hardware accelerators may be field-programmable gate arrays.
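The abstract says the bit vector is partitioned across the accelerators but does not say how. As a minimal sketch, assuming contiguous, near-equal ranges (one plausible scheme, not one confirmed by the publication), the following Python assigns each accelerator a half-open slice of the bit vector and looks up which accelerator owns a given index. The names `partition_ranges` and `owner_of` are hypothetical.

```python
def partition_ranges(num_bits, num_accelerators):
    """Split [0, num_bits) into contiguous half-open ranges, one per accelerator.

    Assumption: contiguous, near-equal partitions. Any leftover bits are
    spread over the first few accelerators.
    """
    base, extra = divmod(num_bits, num_accelerators)
    ranges, start = [], 0
    for i in range(num_accelerators):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def owner_of(index, ranges):
    """Return the position of the accelerator whose range contains index."""
    for position, (lo, hi) in enumerate(ranges):
        if lo <= index < hi:
            return position
    raise ValueError(f"index {index} is outside the bit vector")
```

With ranges like these, each accelerator only needs local storage for its own slice, so the total bit vector can grow with the number of accelerators in the chain.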
Opening claim text (preview).
1. An apparatus comprising: a processor; a memory; and a plurality of hardware accelerators; wherein the processor is configured to stream data from the memory to a first one of the hardware accelerators and to receive filtered data from a second one of the hardware accelerators; wherein the plurality of hardware accelerators are configured to filter the streamed data utilizing at least one bit vector partitioned across the plurality of hardware accelerators.
2. The apparatus of claim 1, wherein at least one of the hardware accelerators comprises a field-programmable gate array.
3. The apparatus of claim 1, wherein the plurality of hardware accelerators form a Bloom filter.
4. The apparatus of claim 1, wherein the plurality of hardware accelerators are daisy-chain connected to one another.
5. The apparatus of claim 4, wherein the streamed data is forwarded between respective ones of the plurality of hardware accelerators.
6. The apparatus of claim 4, wherein each of the plurality of hardware accelerators is configured to utilize one or more hash functions to compute bit vector indices for its corresponding partitioned portion of the at least one bit vector.
7. The apparatus of claim 4, wherein a given one of the plurality of hardware accelerators is configured: to utilize one or more hash functions to compute bit vector indices for the at least one bit vector; and to forward the bit vector indices for the at least one bit vector to other ones of the plurality of hardware accelerators.
8. The apparatus of claim 4, wherein each of the plurality of hardware accelerators is configured to perform a build phase and a probe phase.
9. The apparatus of claim 8, wherein build phase comprises: computing one or more hashes of the streamed data; and updating the at least one bit vector if the computed hashes are within a range of a corresponding partitioned portion of the at least one bit vector.
10. The apparatus of claim 9, wherein the probe phase comprises: probing the at least one bit vector if the computed hashes are within the range of the corresponding partitioned portion of the at least one bit vector; generating one or more probed bit values responsive to the probing; and passing the probed bit values to a next hardware accelerator in the daisy chain.
11. The apparatus of claim 10, wherein a last one of the hardware accelerators in the daisy chain is configured to filter the streamed data utilizing the probed bit values.
12. The apparatus of claim 4, wherein each of the plurality of hardware accelerators is configured: to receive one or more packets each comprising a set of flags and a value, the set of flags comprising a phase flag and two or more match flags; perform one of a build phase and a probe phase responsive to the value of the phase flag.
13. The apparatus of claim 12, wherein the build phase comprises programming each of the plurality of hardware accelerators with a corresponding range of said at least one bit vector.
14. The apparatus of claim 13, wherein the probe phase comprises: for a first hardware accelerator in the daisy-chain, setting each of the match flags for a given packet to a first value; for each hardware accelerator in the daisy-chain: hashing the value of the given packet using two or more hash functions to compute two or more indices, each index corresponding to a respective one of the match flags; verifying whether each of the two or more indices are within the corresponding range of a current hardware accelerator of the daisy-chain; and for each index within the corresponding range of the current hardware accelerator, modifying the corresponding match flag to a second value; for a last hardware accelerator in the daisy-chain: determining whether each match flag for the given packet is set to the second value; if each match flag for the given packet is set to the second value, streaming the value of the given packet to the processor as filtered data; and if one or more match flags for the given packet is set to the first value, dropping the value of the given packet.
15. A Bloom filter comprising: a plurality of hardware accelerators; wherein at least one bit vector for the Bloom filter is partitioned across the plurality of hardware accelerators.
16. The Bloom filter of claim 15, wherein at least one of the plurality of hardware accelerators comprises a field-programmable gate array.
17. The Bloom filter of claim 15, wherein the plurality of hardware accelerators are daisy-chain connected to one another.

18-20. (canceled)
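The build and probe phases in claims 8 to 14 can be illustrated with a small software model. This is a sketch under stated assumptions, not the claimed hardware: the `Packet`, `Accelerator`, and `DaisyChain` names, the `BUILD`/`PROBE` encodings, the salted SHA-256 hashing, and the contiguous range assignment are all hypothetical choices made for illustration. It also assumes a match flag is changed only when the indexed bit is actually set, which is how the probing in claim 10 is read here.

```python
import hashlib
from dataclasses import dataclass, field

BUILD, PROBE = 0, 1  # hypothetical encodings of the phase flag


def bit_indices(value, num_hashes, num_bits):
    """Compute one bit-vector index per hash function (salted SHA-256, an assumption)."""
    indices = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
        indices.append(int.from_bytes(digest[:8], "big") % num_bits)
    return indices


@dataclass
class Packet:
    phase: int                                  # phase flag: BUILD or PROBE
    value: str                                  # streamed value
    match: list = field(default_factory=list)   # one match flag per hash function


class Accelerator:
    """Software stand-in for one accelerator holding bits [lo, hi) of the vector."""

    def __init__(self, lo, hi, num_hashes, num_bits, first=False):
        self.lo, self.hi = lo, hi
        self.num_hashes, self.num_bits = num_hashes, num_bits
        self.first = first
        self.bits = bytearray(hi - lo)  # one byte per bit, for readability

    def process(self, pkt):
        if pkt.phase == PROBE and self.first:
            # First stage resets every match flag to the "first value".
            pkt.match = [False] * self.num_hashes
        indices = bit_indices(pkt.value, self.num_hashes, self.num_bits)
        for slot, idx in enumerate(indices):
            if not (self.lo <= idx < self.hi):
                continue  # index belongs to another stage's partition
            if pkt.phase == BUILD:
                self.bits[idx - self.lo] = 1
            elif self.bits[idx - self.lo]:
                pkt.match[slot] = True  # "second value": bit was set
        return pkt  # forwarded to the next stage in the chain


class DaisyChain:
    def __init__(self, num_bits, num_accelerators, num_hashes=2):
        self.stages = [
            Accelerator(i * num_bits // num_accelerators,
                        (i + 1) * num_bits // num_accelerators,
                        num_hashes, num_bits, first=(i == 0))
            for i in range(num_accelerators)
        ]

    def _stream(self, phase, value):
        pkt = Packet(phase, value)
        for stage in self.stages:
            pkt = stage.process(pkt)
        return pkt

    def build(self, values):
        for value in values:
            self._stream(BUILD, value)

    def probe(self, values):
        # Last stage keeps a value only if every match flag was set.
        return [v for v in values if all(self._stream(PROBE, v).match)]
```

A chain built with `DaisyChain(1024, 4, num_hashes=3)` and populated through `build(...)` returns every built value from `probe(...)`. Values that were never built are usually dropped, although false positives remain possible, as with any Bloom filter.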
Ensuring data consistency and integrity · CPC title
Query processing with adaptation to specific hardware, e.g. adapted for using GPUs or SSDs · CPC title
Unary operations; Data partitioning operations · CPC title
Physics · mapped topic