Data compression using jointly trained encoder, decoder, and prior neural networks
US-2021004677-A1 · Jan 7, 2021 · US
US12443343B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12443343-B2 |
| Application number | US-202418663060-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 13, 2024 |
| Priority date | Jan 5, 2002 |
| Publication date | Oct 14, 2025 |
| Grant date | Oct 14, 2025 |
Official abstract text for this publication.
Data compaction using codebooks, that applies delta encoding methods to entropy encoding methods to improve data compaction of entropy encoding methods under certain conditions and when compacting data having certain characteristics. Delta encoding may be applied to entropy encoding methods to further compact data sets by reducing the number of sourceblocks included in a codebook to those most commonly encountered in data to be encoded and, where mismatches occur during encoding, using delta encoding of bit differences with existing sourceblocks in the codebook rather than adding new sourceblocks to the codebook.
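The idea in the abstract can be illustrated with a minimal sketch, which is not the patented implementation. It assumes that sourceblocks are small integers, that the codebook keeps only the most frequent blocks, and that a mismatch is encoded as the nearest codebook entry (by Hamming distance) plus an XOR of the differing bits. The function names `build_codebook`, `encode`, and `decode` are hypothetical.

```python
from collections import Counter

def build_codebook(blocks, size):
    """Keep only the `size` most common sourceblocks (assumed policy)."""
    return [block for block, _ in Counter(blocks).most_common(size)]

def hamming(a, b):
    """Number of differing bits between two integer sourceblocks."""
    return bin(a ^ b).count("1")

def encode(blocks, codebook):
    """Emit (codeword, delta) pairs; an exact match yields delta == 0."""
    encoded = []
    for block in blocks:
        # Pick the closest existing entry instead of growing the codebook.
        idx = min(range(len(codebook)), key=lambda i: hamming(codebook[i], block))
        encoded.append((idx, codebook[idx] ^ block))
    return encoded

def decode(pairs, codebook):
    """Re-apply each bit delta to its codebook entry."""
    return [codebook[idx] ^ delta for idx, delta in pairs]

blocks = [0b1010, 0b1010, 0b1100, 0b1100, 0b1011]
codebook = build_codebook(blocks, size=2)
pairs = encode(blocks, codebook)
restored = decode(pairs, codebook)
```

In this sketch the rare block `0b1011` never enters the codebook; it is stored as the index of `0b1010` plus a one-bit difference, and `decode` reproduces the original blocks exactly.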
Opening claim text (preview).
What is claimed is:
1. A computing system for data compaction using delta encoding, comprising: one or more hardware processors configured for: receiving a data stream for compaction, the data stream comprising sourceblocks of data; for each sourceblock of the data stream: calculating a hash of the sourceblock as a sourceblock approximation using the hash function; retrieving an approximation codeword for the sourceblock approximation by looking up the calculated hash of the sourceblock; inserting the approximation codeword into a primary data stream; and sending the sourceblock and the approximation codeword to a delta encoder; and for each sourceblock and approximation codeword received from the primary encoder: retrieving a sourceblock hash using the approximation codeword; unhashing the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; calculating a delta value between the sourceblock and the sourceblock approximation; retrieving a delta codeword for the delta value; and inserting the delta codeword into a delta data stream.
2. The computing system of claim 1, wherein the hash function is a MinHash function.
3. The computing system of claim 1, wherein the processors are further configured for: receiving the primary data stream comprising approximation codewords and the delta data stream comprising corresponding delta codewords; for each approximation codeword in the encoded data stream: retrieving the sourceblock hash for that codeword; and unhashing the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; for each delta codeword in the encoded data stream, retrieving the delta value for that delta codeword; and for each sourceblock approximation and its corresponding delta value, applying the delta value to the sourceblock approximation to obtain the sourceblock approximated by the sourceblock approximation, and inserting the sourceblock into a decoded data stream.
4. A method for multiple pass data compaction using delta encoding, comprising the steps of: receiving a data stream for compaction, the data stream comprising sourceblocks of data; for each sourceblock of the data stream: calculating a hash of the sourceblock as a sourceblock approximation using the hash function; retrieving an approximation codeword for the sourceblock approximation by looking up the calculated hash of the sourceblock; inserting the approximation codeword into a primary data stream; and sending the sourceblock and the approximation codeword to a delta encoder; and for each sourceblock and approximation codeword received from the primary encoder: retrieving a sourceblock hash using the approximation codeword; unhashing the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; calculating a delta value between the sourceblock and the sourceblock approximation; retrieving a delta codeword for the delta value; and inserting the delta codeword into a delta data stream.
5. The method of claim 4, wherein the hash function is a MinHash function.
6. The method of claim 4, further comprising the steps of: receiving the primary data stream comprising approximation codewords and the delta data stream comprising corresponding delta codewords; for each approximation codeword in the encoded data stream: retrieving the sourceblock hash for that codeword; and unhashing the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; for each delta codeword in the encoded data stream, retrieving the delta value for that delta codeword; and for each sourceblock approximation and its corresponding delta value, applying the delta value to the sourceblock approximation to obtain the sourceblock approximated by the sourceblock approximation, and inserting the sourceblock into a decoded data stream.
7. Non-transitory, computer-readable storage media having computer-executable instructions embodied thereon that, when executed by one or more processors of a computing system for multiple pass data compaction using delta encoding, cause the computing system to: receive a data stream for compaction, the data stream comprising sourceblocks of data; for each sourceblock of the data stream: calculate a hash of the sourceblock as a sourceblock approximation using the hash function; retrieve an approximation codeword for the sourceblock approximation by looking up the calculated hash of the sourceblock; insert the approximation codeword into a primary data stream; and send the sourceblock and the approximation codeword to a delta encoder; and for each sourceblock and approximation codeword received from the primary encoder: retrieve a sourceblock hash using the approximation codeword; unhash the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; calculate a delta value between the sourceblock and the sourceblock approximation; retrieve a delta codeword for the delta value; and insert the delta codeword into a delta data stream.
8. The non-transitory, computer-readable storage media of claim 7, wherein the hash function is a MinHash function.
9. The non-transitory, computer-readable storage media of claim 7, wherein the computing system is further caused to: receive the primary data stream comprising approximation codewords and the delta data stream comprising corresponding delta codewords; for each approximation codeword in the encoded data stream: retrieve the sourceblock hash for that codeword; and unhash the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; for each delta codeword in the encoded data stream, retrieve the delta value for that delta codeword; and for each sourceblock approximation and its corresponding delta value, apply the delta value to the sourceblock approximation to obtain the sourceblock approximated by the sourceblock approximation, and insert the sourceblock into a decoded data stream.
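The two-stream flow recited in the independent claims and in claims 3, 6, and 9 can be sketched as follows. This is a toy illustration under stated assumptions, not the claimed implementation: the "hash" here is a simple right shift that drops low-order bits (chosen only because it has an obvious approximate inverse, and not a MinHash function), the delta is an XOR, and the delta value itself stands in for the delta codeword. The names `SHIFT`, `toy_hash`, `toy_unhash`, `compact`, and `expand` are hypothetical.

```python
SHIFT = 2  # assumed number of low-order bits discarded by the toy hash

def toy_hash(block: int) -> int:
    """Lossy stand-in for the claimed hash function."""
    return block >> SHIFT

def toy_unhash(h: int) -> int:
    """Approximate inverse: recovers the sourceblock approximation."""
    return h << SHIFT

def compact(blocks):
    """Produce a primary stream of approximation codewords and a delta stream."""
    hash_to_code = {}   # hash -> approximation codeword
    code_to_hash = []   # approximation codeword -> hash
    primary, delta_stream = [], []
    for block in blocks:
        # Primary encoder: hash the block and look up (or assign) a codeword.
        h = toy_hash(block)
        code = hash_to_code.get(h)
        if code is None:
            code = len(code_to_hash)
            hash_to_code[h] = code
            code_to_hash.append(h)
        primary.append(code)
        # Delta encoder: unhash the codeword's hash and record the difference.
        approximation = toy_unhash(code_to_hash[code])
        delta_stream.append(block ^ approximation)
    return primary, delta_stream, code_to_hash

def expand(primary, delta_stream, code_to_hash):
    """Decoder: unhash each codeword and apply its corresponding delta."""
    return [toy_unhash(code_to_hash[code]) ^ delta
            for code, delta in zip(primary, delta_stream)]

blocks = [0b10110101, 0b10110110, 0b00001111]
primary, deltas, table = compact(blocks)
decoded = expand(primary, deltas, table)
```

With this choice of hash, the first two blocks share one approximation codeword and differ only in their delta values, which are bounded by `SHIFT` bits; a real system would pick the hash and delta codebook to suit the data.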
Encoder aspects · CPC title
Decoder aspects · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title
in relation to content · CPC title