Content-aware lossless compression and decompression of floating point data

US9660666B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-9660666-B1
Application numberUS-201414579130-A
CountryUS
Kind codeB1
Filing dateDec 22, 2014
Priority dateDec 22, 2014
Publication dateMay 23, 2017
Grant dateMay 23, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Lossless content-aware compression and decompression techniques are provided for floating point data, such as seismic data. A minimum-length compression technique exploits an association between an exponent and a length of the significand, which corresponds to the position of the least significant bit of the significand. A reduced number of bits from the significand can then be stored. A prediction method is also optionally previously applied, so that residual values with shorter lengths are compressed instead of the original values. An alignment compression technique exploits repetition patterns in the floating point numbers when they are aligned to the same exponent. Floating point numbers are then split into integral and fractional parts. The fractional part is separately encoded using a dictionary-based compression method, while the integral part is compressed using a delta-encoding method. A prediction method is also optionally previously applied to the integral part, thereby increasing the compression ratio.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for compressing at least one floating point number, said method comprising: obtaining said at least one floating point number represented using one or more bits to indicate a sign of said at least one floating point number and one or more additional exponent bits to indicate an exponent at a given base and a significand of said at least one floating point number; determining a length of said significand equal to a number of bits between a most significant bit of said significand and a least significant bit of said significand having a predefined binary value; and encoding said at least one floating point number by encoding, as a single code, an exponent/length pair comprising said exponent and said length, wherein one or more significant bits following said least significant bit of said significand having said predefined binary value are not included in said encoding. 2. The method of claim 1 , further comprising the step of selecting a plurality of said exponent/length pairs based on a total number of said additional significant bits, wherein the total number of said additional significant bits for a particular exponent/length pair is obtained as a function of a frequency of occurrence of said particular exponent/length pair and the number of said additional significant bits for said particular exponent/length pair. 3. The method of claim 2 , further comprising the step of encoding additional exponent/length pairs not in said plurality of exponent/length pairs using additional bits relative to an encoding of said plurality of exponent/length pairs. 4. The method of claim 1 , wherein said one or more additional significant bits following said least significant bit of said significand having said predefined binary value are discarded as unnecessary data. 5. The method of claim 1 , further comprising the step of decompressing said compressed at least one floating point number by restoring discarded bits based on said encoded length. 6. The method of claim 1 , wherein said compressed at least one floating point number is one or more of stored, transmitted and processed. 7. The method of claim 1 , wherein said at least one floating point number comprise seismic data. 8. The method of claim 1 , further comprising the steps of, prior to said steps of obtaining, determining and encoding, applying a linear prediction algorithm to said at least one floating point number to generate a prediction, aligning the prediction to said exponent of said at least one floating point number, truncating the prediction to end at a bit immediately before a least significant bit having said predefined binary value of said at least one floating point number, generating a corresponding prediction error with a binary length potentially equal to or lower than a length of said at least one floating point number; and then employing said steps of obtaining, determining and encoding to selectively compress said prediction error or the corresponding at least one floating point number based on a number of bits to be represented and providing a differentiation bit to indicate whether said prediction error or the corresponding at least one floating point number is encoded. 9. The method of claim 1 , wherein said method of claim 1 is selectively applied based on a data analysis of said obtained at least one floating point number. 10. The method of claim 1 , wherein a first portion of said encoded at least one floating point number is decoded without decoding additional portions of said encoded at least one floating point number. 11. A computer program product comprising a tangible machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed perform the steps of the method of claim 1 . 12. A method for compressing at least one floating point number, said method comprising: obtaining said at least one floating point number represented using one or more bits to indicate a sign of said at least one floating point number and one or more additional exponent bits to indicate an exponent at a given base and a significand of said at least one floating point number; aligning said at least one floating point number by moving a radix point of said at least one floating point number and updating said exponent so that said at least one floating point number is separated into an integral portion and a fractional portion, wherein a plurality of said fractional portions of said at least one floating point number comprises at least one repeating pattern; encoding said integral portion separately; and encoding said fractional portion of said at least one floating point number using a dictionary-based compression method. 13. The method of claim 12 , wherein said dictionary-based compression method comprises a run-length encoding method. 14. The method of claim 12 , wherein a given one of said at least one repeating pattern is encoded using a block code that replaces said at least one repeating pattern with a symbol indicating said given repeating pattern and a number of repetitions of said given repeating pattern. 15. The method of claim 12 , wherein a plurality of said fractional portions of said at least one floating point number comprises at least one repeating pattern comprised of at least two alternating symbols and wherein a given one of said at least one repeating pattern is encoded using a mega block code that replaces said at least two alternating symbols with a first reference to a location of a first symbol in said dictionary, a second reference to a location of a second symbol in said dictionary, a total number of blocks with a same repeating said first and second symbol, and a sequence of block sizes representing a number of times each of said first and second symbol appear in a same order as said first and second symbol appear in said repeating pattern, along with metadata indicating a number of bits necessary to store the first and second references and said block sizes. 16. The method of claim 12 , wherein said encoding of said integral portion comprises the steps of applying said integral portion of said at least one floating point number to a delta encoder that generates differences between a current integral portion and a previous integral portion; and encoding said differences using an adaptive Elias Gama entropy encoder. 17. The method of claim 16 , wherein said adaptive Elias Gama entropy encoder transforms said differences into codes of variable length by fitting said integral portion of said at least one floating point number into a plurality of encoding bins, wherein an inferior limit of the encoding bins is increased such that said encoding bins start at a point in which the integral portion of said at least one floating point number substantially fits a geometric distribution and wherein the superior limit is reduced to a point from which a number of said integral portion of said at least one floating point that is represented in an original form, leaving one bin of said encoding bins for the encoding of exceptions, also stored in original form. 18. The method of claim 12 , wherein said encoding of said integral portion comprises the steps of applying said integral portion of said at least one floating point number to a linear prediction algorithm to generate a prediction to said integral portion of said at least one floating point number, aligning the prediction to said exponent of said integral portion of said at least one floating point number, truncating the aligned

Assignees

Inventors

Classifications

  • H03M7/30Primary

    Compression (speech analysis-synthesis for redundancy reduction G10L19/00; for image communication H04N); Expansion; Suppression of unnecessary data, e.g. redundancy reduction · CPC title

  • Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title

  • Decoder aspects · CPC title

  • Type of the data to be coded, other than image and sound · CPC title

  • for the decoding process only · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9660666B1 cover?
Lossless content-aware compression and decompression techniques are provided for floating point data, such as seismic data. A minimum-length compression technique exploits an association between an exponent and a length of the significand, which corresponds to the position of the least significant bit of the significand. A reduced number of bits from the significand can then be stored. A predic…
Who is the assignee on this patent?
Emc Ip Holding Co Llc
What technology area does this patent fall under?
Primary CPC classification H03M7/30. Mapped technology areas include Electricity.
When was this patent published?
Publication date Tue May 23 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).