Methods and apparatus for rational compression and decompression of numbers
US-9337863-B1 · May 10, 2016 · US
US9660666B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9660666-B1 |
| Application number | US-201414579130-A |
| Country | US |
| Kind code | B1 |
| Filing date | Dec 22, 2014 |
| Priority date | Dec 22, 2014 |
| Publication date | May 23, 2017 |
| Grant date | May 23, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Lossless content-aware compression and decompression techniques are provided for floating point data, such as seismic data. A minimum-length compression technique exploits an association between an exponent and a length of the significand, which corresponds to the position of the least significant bit of the significand. A reduced number of bits from the significand can then be stored. A prediction method is also optionally previously applied, so that residual values with shorter lengths are compressed instead of the original values. An alignment compression technique exploits repetition patterns in the floating point numbers when they are aligned to the same exponent. Floating point numbers are then split into integral and fractional parts. The fractional part is separately encoded using a dictionary-based compression method, while the integral part is compressed using a delta-encoding method. A prediction method is also optionally previously applied to the integral part, thereby increasing the compression ratio.
Opening claim text (preview).
What is claimed is: 1. A method for compressing at least one floating point number, said method comprising: obtaining said at least one floating point number represented using one or more bits to indicate a sign of said at least one floating point number and one or more additional exponent bits to indicate an exponent at a given base and a significand of said at least one floating point number; determining a length of said significand equal to a number of bits between a most significant bit of said significand and a least significant bit of said significand having a predefined binary value; and encoding said at least one floating point number by encoding, as a single code, an exponent/length pair comprising said exponent and said length, wherein one or more significant bits following said least significant bit of said significand having said predefined binary value are not included in said encoding. 2. The method of claim 1 , further comprising the step of selecting a plurality of said exponent/length pairs based on a total number of said additional significant bits, wherein the total number of said additional significant bits for a particular exponent/length pair is obtained as a function of a frequency of occurrence of said particular exponent/length pair and the number of said additional significant bits for said particular exponent/length pair. 3. The method of claim 2 , further comprising the step of encoding additional exponent/length pairs not in said plurality of exponent/length pairs using additional bits relative to an encoding of said plurality of exponent/length pairs. 4. The method of claim 1 , wherein said one or more additional significant bits following said least significant bit of said significand having said predefined binary value are discarded as unnecessary data. 5. The method of claim 1 , further comprising the step of decompressing said compressed at least one floating point number by restoring discarded bits based on said encoded length. 6. The method of claim 1 , wherein said compressed at least one floating point number is one or more of stored, transmitted and processed. 7. The method of claim 1 , wherein said at least one floating point number comprise seismic data. 8. The method of claim 1 , further comprising the steps of, prior to said steps of obtaining, determining and encoding, applying a linear prediction algorithm to said at least one floating point number to generate a prediction, aligning the prediction to said exponent of said at least one floating point number, truncating the prediction to end at a bit immediately before a least significant bit having said predefined binary value of said at least one floating point number, generating a corresponding prediction error with a binary length potentially equal to or lower than a length of said at least one floating point number; and then employing said steps of obtaining, determining and encoding to selectively compress said prediction error or the corresponding at least one floating point number based on a number of bits to be represented and providing a differentiation bit to indicate whether said prediction error or the corresponding at least one floating point number is encoded. 9. The method of claim 1 , wherein said method of claim 1 is selectively applied based on a data analysis of said obtained at least one floating point number. 10. The method of claim 1 , wherein a first portion of said encoded at least one floating point number is decoded without decoding additional portions of said encoded at least one floating point number. 11. A computer program product comprising a tangible machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed perform the steps of the method of claim 1 . 12. A method for compressing at least one floating point number, said method comprising: obtaining said at least one floating point number represented using one or more bits to indicate a sign of said at least one floating point number and one or more additional exponent bits to indicate an exponent at a given base and a significand of said at least one floating point number; aligning said at least one floating point number by moving a radix point of said at least one floating point number and updating said exponent so that said at least one floating point number is separated into an integral portion and a fractional portion, wherein a plurality of said fractional portions of said at least one floating point number comprises at least one repeating pattern; encoding said integral portion separately; and encoding said fractional portion of said at least one floating point number using a dictionary-based compression method. 13. The method of claim 12 , wherein said dictionary-based compression method comprises a run-length encoding method. 14. The method of claim 12 , wherein a given one of said at least one repeating pattern is encoded using a block code that replaces said at least one repeating pattern with a symbol indicating said given repeating pattern and a number of repetitions of said given repeating pattern. 15. The method of claim 12 , wherein a plurality of said fractional portions of said at least one floating point number comprises at least one repeating pattern comprised of at least two alternating symbols and wherein a given one of said at least one repeating pattern is encoded using a mega block code that replaces said at least two alternating symbols with a first reference to a location of a first symbol in said dictionary, a second reference to a location of a second symbol in said dictionary, a total number of blocks with a same repeating said first and second symbol, and a sequence of block sizes representing a number of times each of said first and second symbol appear in a same order as said first and second symbol appear in said repeating pattern, along with metadata indicating a number of bits necessary to store the first and second references and said block sizes. 16. The method of claim 12 , wherein said encoding of said integral portion comprises the steps of applying said integral portion of said at least one floating point number to a delta encoder that generates differences between a current integral portion and a previous integral portion; and encoding said differences using an adaptive Elias Gama entropy encoder. 17. The method of claim 16 , wherein said adaptive Elias Gama entropy encoder transforms said differences into codes of variable length by fitting said integral portion of said at least one floating point number into a plurality of encoding bins, wherein an inferior limit of the encoding bins is increased such that said encoding bins start at a point in which the integral portion of said at least one floating point number substantially fits a geometric distribution and wherein the superior limit is reduced to a point from which a number of said integral portion of said at least one floating point that is represented in an original form, leaving one bin of said encoding bins for the encoding of exceptions, also stored in original form. 18. The method of claim 12 , wherein said encoding of said integral portion comprises the steps of applying said integral portion of said at least one floating point number to a linear prediction algorithm to generate a prediction to said integral portion of said at least one floating point number, aligning the prediction to said exponent of said integral portion of said at least one floating point number, truncating the aligned
Compression (speech analysis-synthesis for redundancy reduction G10L19/00; for image communication H04N); Expansion; Suppression of unnecessary data, e.g. redundancy reduction · CPC title
Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title
Decoder aspects · CPC title
Type of the data to be coded, other than image and sound · CPC title
for the decoding process only · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.