Data compression using jointly trained encoder, decoder, and prior neural networks
US-2021004677-A1 · Jan 7, 2021 · US
US12061794B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12061794-B2 |
| Application number | US-202318453335-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 22, 2023 |
| Priority date | Oct 30, 2017 |
| Publication date | Aug 13, 2024 |
| Grant date | Aug 13, 2024 |
Official abstract text for this publication.
The inventor has conceived, and reduced to practice, a system and method for data compaction that applies delta encoding methods to entropy encoding methods to improve data compaction of entropy encoding methods under certain conditions and when compacting data having certain characteristics. Delta encoding may be applied to entropy encoding methods to further compact data sets by reducing the number of sourceblocks included in a codebook to those most commonly encountered in data to be encoded and, where mismatches occur during encoding, using delta encoding of bit differences with existing sourceblocks in the codebook rather than adding new sourceblocks to the codebook.
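The idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the 8-bit integer sourceblocks, the frequency-based `build_codebook`, the Hamming-distance match, and the XOR delta are all assumptions chosen for illustration. The codebook keeps only the most common sourceblocks, and a block that is missing from it is stored as the index of the closest existing entry plus its bit differences, rather than being added to the codebook.

```python
from collections import Counter


def build_codebook(blocks, size):
    """Keep only the `size` most common sourceblocks (hypothetical policy).

    The position of a block in the returned list serves as its codeword.
    """
    return [block for block, _ in Counter(blocks).most_common(size)]


def hamming(a, b):
    """Number of differing bits between two integer sourceblocks."""
    return bin(a ^ b).count("1")


def encode(blocks, codebook):
    """Encode each block as (codeword, delta).

    The codeword points at the closest codebook entry; the delta is the
    XOR of the block with that entry, so an exact match has delta 0.
    """
    encoded = []
    for block in blocks:
        idx = min(range(len(codebook)), key=lambda i: hamming(block, codebook[i]))
        encoded.append((idx, block ^ codebook[idx]))
    return encoded


def decode(pairs, codebook):
    """Reapply each delta to its codebook entry to recover the block."""
    return [codebook[idx] ^ delta for idx, delta in pairs]


# Ten 8-bit sourceblocks; 0b10101011 is rare and stays out of the codebook.
data = [0b10101010] * 5 + [0b11110000] * 4 + [0b10101011]
codebook = build_codebook(data, size=2)
encoded = encode(data, codebook)
restored = decode(encoded, codebook)
```

In this toy run the rare block is encoded against the entry `0b10101010` with a one-bit delta, so the codebook stays at two entries while the round trip remains lossless.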
Opening claim text (preview).
What is claimed is:

1. A system for data compaction using delta encoding, comprising: a computing device comprising a processor, a memory, and a non-volatile data storage device; a primary encoder comprising a first plurality of programming instructions stored in the memory which, when operating on the processor, causes the computing device to: receive a data stream for compaction, the data stream comprising sourceblocks of data; for each sourceblock of the data stream: calculate a hash of the sourceblock as a sourceblock approximation using the hash function; retrieve an approximation codeword for the sourceblock approximation by looking up the calculated hash of the sourceblock; insert the approximation codeword into a primary data stream; and send the sourceblock and the approximation codeword to a delta encoder; and the delta encoder comprising a second plurality of programming instructions stored in the memory which, when operating on the processor, causes the computing device to: for each sourceblock and approximation codeword received from the primary encoder: retrieve a sourceblock hash using the approximation codeword; unhash the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; calculate a delta value between the sourceblock and the sourceblock approximation; retrieve a delta codeword for the delta value; and insert the delta codeword into a delta data stream.

2. The system of claim 1, wherein the hash function is a MinHash function.

3. The system of claim 1, further comprising a decoder comprising a third plurality of programming instructions stored in the memory which, when operating on the processor, causes the computing device to: receive the primary data stream comprising approximation codewords and the delta data stream comprising corresponding delta codewords; for each approximation codeword in the encoded data stream: retrieve the sourceblock hash for that codeword; and unhash the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; for each delta codeword in the encoded data stream, retrieve the delta value for that delta codeword; for each sourceblock approximation and its corresponding delta value, apply the delta value to the sourceblock approximation to obtain the sourceblock approximated by the sourceblock approximation, and insert the sourceblock into a decoded data stream.

4. A method for multiple pass data compaction using delta encoding, comprising the steps of: a computing device comprising a processor, a memory, and a non-volatile data storage device; using a primary encoder operating on the computing device to: receive a data stream for compaction, the data stream comprising sourceblocks of data; for each sourceblock of the data stream: calculate a hash of the sourceblock as a sourceblock approximation using the hash function; retrieve an approximation codeword for the sourceblock approximation by using a calculated hash of the sourceblock; insert the approximation codeword into a primary data stream; and send the sourceblock and the approximation codeword to a delta encoder operating on the computing device; and using the delta encoder to: for each sourceblock and approximation codeword received from the primary encoder: retrieve a sourceblock hash using the approximation codeword; unhash the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; calculate a delta value between the sourceblock and the sourceblock approximation; retrieve a delta codeword for the delta value; and insert the delta codeword into a delta data stream.

5. The method of claim 4, wherein the hash function is a MinHash function.

6. The method of claim 4, further comprising the steps of using a decoder operating on the computing device to: receive the primary data stream comprising approximation codewords and the delta data stream comprising corresponding delta codewords; for each approximation codeword in the encoded data stream: retrieve the sourceblock hash for that codeword; and unhash the sourceblock hash using an inverse of the hash function to obtain a sourceblock approximation; for each delta codeword in the encoded data stream, retrieve the delta value for that delta codeword; for each sourceblock approximation and its corresponding delta value, apply the delta value to the sourceblock approximation to obtain the sourceblock approximated by the sourceblock approximation, and insert the sourceblock into a decoded data stream.
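The data flow recited in the claims (primary encoder, delta encoder, decoder, and the two output streams) can be traced with a small Python sketch. Everything concrete here is an assumption made for illustration: `toy_hash` and `toy_unhash` are hypothetical stand-ins that drop and restore the two low bits of an 8-bit block (the claims name MinHash as one option, which this sketch does not implement), the delta value is taken to be an XOR, and `build_codebook` simply numbers values in order of first appearance.

```python
def toy_hash(block):
    """Hypothetical lossy 'hash': drop the two low bits of an 8-bit block."""
    return block >> 2


def toy_unhash(h):
    """Hypothetical inverse: rebuild an approximation with the low bits zeroed."""
    return h << 2


def build_codebook(values):
    """Map each distinct value to an integer codeword in first-seen order."""
    book = {}
    for value in values:
        book.setdefault(value, len(book))
    return book


def encode(blocks, approx_book, delta_book):
    """Produce a primary stream and a delta stream, one entry per sourceblock."""
    hash_for = {cw: h for h, cw in approx_book.items()}
    primary_stream, delta_stream = [], []
    for block in blocks:
        # Primary encoder: hash the block and look up its approximation codeword.
        codeword = approx_book[toy_hash(block)]
        primary_stream.append(codeword)
        # Delta encoder: recover the approximation from the codeword, then
        # encode the difference (XOR here, as an assumption) as a delta codeword.
        approximation = toy_unhash(hash_for[codeword])
        delta_stream.append(delta_book[block ^ approximation])
    return primary_stream, delta_stream


def decode(primary_stream, delta_stream, approx_book, delta_book):
    """Rebuild each sourceblock from its approximation and its delta value."""
    hash_for = {cw: h for h, cw in approx_book.items()}
    delta_for = {cw: d for d, cw in delta_book.items()}
    return [
        toy_unhash(hash_for[p]) ^ delta_for[d]
        for p, d in zip(primary_stream, delta_stream)
    ]


blocks = [0b10101001, 0b10101011, 0b11110000, 0b10101001]
approx_book = build_codebook(toy_hash(b) for b in blocks)
delta_book = build_codebook(range(4))  # every possible 2-bit delta
primary, deltas = encode(blocks, approx_book, delta_book)
decoded = decode(primary, deltas, approx_book, delta_book)
```

Three of the four blocks share one approximation codeword and differ only in the delta stream, which is the separation into a primary stream and a delta stream that the claims describe.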
Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title
Decoder aspects · CPC title
Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS] · CPC title
Encoder aspects · CPC title
in relation to content · CPC title