Method and apparatus for compressing and decompressing sparse data sets

US11989414B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11989414-B2
Application numberUS-202318340464-A
CountryUS
Kind codeB2
Filing dateJun 23, 2023
Priority dateMar 4, 2022
Publication dateMay 21, 2024
Grant dateMay 21, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure include a digital circuit and method for multi-stage compression. Digital data values are compressed using a multi-stage compression algorithm and stored in a memory. A decompression circuit receives the values and performs a partial decompression. The partially compressed values are provided to a processor, which performs the final decompression. In one embodiment, a vector of N length compressed values are decompressed using a first bit mask into two N length sets having non-zero values. The two N length sets are further decompressed using two M length bit masks into M length sparse vectors, each having non-zero values.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer system comprising: a processor; and memory, wherein the computer system is configured to perform data compression, the data compression comprising: receiving a first set of data and a second set of data; selecting non-zero data values from the first set of data and the second set of data; generating a first sparsified data set and a second sparsified data set comprising the selected non-zero data values, wherein other data values in the first and second sparsified data sets are zero values; generating a first bit mask and a second bit mask, the first and second bit masks specifying positions of the non-zero data values in the first and second sparsified data sets; generating a third sparsified data set and a fourth sparsified data set comprising the selected non-zero data values, wherein the third sparsified data set has a number of data values less than the first sparsified data set and comprises the non-zero values from the first sparsified data set, and wherein the fourth sparsified data set has a number of data values less than the second sparsified data set and comprises the non-zero values from the second sparsified data set; generating a third bit mask specifying positions of the non-zero values in the third and fourth sparsified data sets; generating a compressed data set comprising the non-zero values from the third and fourth sparsified data sets; and storing the compressed data set and the first, second, and third bit masks in the memory. 2. The circuit of claim 1 , wherein selecting non-zero data values comprises selecting adjacent pairs of non-zero data values. 3. The circuit of claim 1 , wherein selecting non-zero of data values comprises selecting a subset of largest values from the first and second data sets. 4. The circuit of claim 3 , wherein first and second bit masks have a length equal to a number of zero and non-zero data values in the first and second sparsified data sets. 5. The circuit of claim 1 , wherein the first and second sparsified data sets comprise a multibit mantissa, a sign bit, and a shared exponent. 6. The circuit of claim 1 , wherein a number of values in the third sparsified data set and a number of values in the fourth sparsified data set is two times the number of selected non-zero of data values. 7. The circuit of claim 1 , wherein the third bit mask has a length equal to two times a number of data values in the third and fourth sparsified data sets. 8. A method of compressing data comprising: receiving a first set of data and a second set of data; selecting non-zero data values from the first set of data and the second set of data; generating a first sparsified data set and a second sparsified data set comprising the selected non-zero data values, wherein other data values in the first and second sparsified data sets are zero values; generating a first bit mask and a second bit mask, the first and second bit masks specifying positions of the non-zero data values in the first and second sparsified data sets; generating a third sparsified data set and a fourth sparsified data set comprising the selected non-zero data values, wherein the third sparsified data set has a number of data values less than the first sparsified data set and comprises the non-zero values from the first sparsified data set, and wherein the fourth sparsified data set has a number of data values less than the second sparsified data set and comprises the non-zero values from the second sparsified data set; generating a third bit mask specifying positions of the non-zero values in the third and fourth sparsified data sets; generating a compressed data set comprising the non-zero values from the third and fourth sparsified data sets; and storing the compressed data set and the first, second, and third bit masks in a memory. 9. The method of claim 8 , wherein selecting non-zero data values comprises selecting adjacent pairs of non-zero data values. 10. The method of claim 8 , wherein selecting non-zero of data values comprises selecting a subset of largest values from the first and second data sets. 11. The method of claim 8 , wherein the first and second sparsified data sets comprise a multibit mantissa, a sign bit, and a shared exponent. 12. The method of claim 8 , wherein a number of values in the third sparsified data set and a number of values in the fourth sparsified data set is two times the number of selected non-zero of data values. 13. The method of claim 8 , wherein the third bit mask has a length equal to two times a number of data values in the third and fourth sparsified data sets. 14. A non-transitory machine-readable medium storing a program executable by a computer, the program comprising sets of instructions for: receiving a first set of data and a second set of data; selecting non-zero data values from the first set of data and the second set of data; generating a first sparsified data set and a second sparsified data set comprising the selected non-zero data values, wherein other data values in the first and second sparsified data sets are zero values; generating a first bit mask and a second bit mask, the first and second bit masks specifying positions of the non-zero data values in the first and second sparsified data sets; generating a third sparsified data set and a fourth sparsified data set comprising the selected non-zero data values, and wherein the third sparsified data set has a number of data values less than the first sparsified data set and comprises the non-zero values from the first sparsified data set, and wherein the fourth sparsified data set has a number of data values less than the second sparsified data set and comprises the non-zero values from the second sparsified data set; generating a third bit mask specifying positions of the non-zero values in the third and fourth sparsified data sets; generating a compressed data set comprises the non-zero values from the third and fourth sparsified data sets; and storing the compressed data set and the first, second, and third bit masks in a memory. 15. The non-transitory machine-readable medium of claim 14 , wherein selecting non-zero data values comprises selecting adjacent pairs of non-zero data values. 16. The non-transitory machine-readable medium of claim 14 , wherein selecting non-zero of data values comprises selecting a subset of largest values from the first and second data sets. 17. The non-transitory machine-readable medium of claim 16 , wherein first and second bit masks have a length equal to a number of zero and non-zero data values in the first and second sparsified data sets. 18. The non-transitory machine-readable medium of claim 14 , wherein the first and second sparsified data sets comprise a multibit mantissa, a sign bit, and a shared exponent. 19. The non-transitory machine-readable medium of claim 14 , wherein a number of values in the third sparsified data set and a number of values in the fourth sparsified data set is two times the number of selected non-zero of data values. 20. The non-transitory machine-readable medium of claim 14 , wherein the third bit mask has a length equal to two times a number of data values in the third and fourth sparsified data sets.

Assignees

Inventors

Classifications

  • G06F3/0608Primary

    Saving storage space on storage systems · CPC title

  • Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices · CPC title

  • Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP] · CPC title

  • H03M7/3066Primary

    by means of a mask or a bit-map · CPC title

  • Decoder aspects · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11989414B2 cover?
Embodiments of the present disclosure include a digital circuit and method for multi-stage compression. Digital data values are compressed using a multi-stage compression algorithm and stored in a memory. A decompression circuit receives the values and performs a partial decompression. The partially compressed values are provided to a processor, which performs the final decompression. In one em…
Who is the assignee on this patent?
Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F3/0608. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue May 21 2024 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).