US12182018B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12182018-B2 |
| Application number | US-202017133615-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 23, 2020 |
| Priority date | Dec 23, 2020 |
| Publication date | Dec 31, 2024 |
| Grant date | Dec 31, 2024 |
Official abstract text for this publication.
Methods and apparatus relating to an instruction and/or micro-architecture support for decompression on core are described. In an embodiment, decode circuitry decodes a decompression instruction into a first micro operation and a second micro operation. The first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core. Decompression Engine (DE) circuitry decompresses the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation. Other embodiments are also disclosed and claimed.
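The split into two micro operations can be pictured with a small software model. The Python below is only an illustrative sketch, not the patented hardware: the names `MicroOp`, `decode`, and `execute` are hypothetical, a dict stands in for the core's cache, a `bytes` object stands in for memory, and zlib stands in for whatever compression format the DE circuitry handles, which the abstract does not specify.

```python
import zlib
from dataclasses import dataclass

CACHELINE = 64  # bytes per cacheline (assumed here for illustration)


@dataclass
class MicroOp:
    kind: str   # "load" or "decompress" (hypothetical names)
    addr: int   # start address of the compressed data
    size: int   # size of the compressed data in bytes


def line_addrs(addr, size):
    """Addresses of every cacheline that overlaps [addr, addr + size)."""
    first = addr - addr % CACHELINE
    return range(first, addr + size, CACHELINE)


def decode(src_addr, src_size):
    """Model of decode: one decompression instruction becomes two micro-ops."""
    return [MicroOp("load", src_addr, src_size),
            MicroOp("decompress", src_addr, src_size)]


def execute(uops, memory, cache):
    """Run the micro-ops in order; return the decompressed bytes."""
    result = None
    for uop in uops:
        if uop.kind == "load":
            # First micro-op: fetch the compressed data into cachelines.
            for line in line_addrs(uop.addr, uop.size):
                cache[line] = memory[line:line + CACHELINE]
        elif uop.kind == "decompress":
            # Second micro-op: the engine reads only from the cache.
            lines = list(line_addrs(uop.addr, uop.size))
            raw = b"".join(cache[line] for line in lines)
            offset = uop.addr - lines[0]
            # zlib is a stand-in for the engine's real algorithm.
            result = zlib.decompress(raw[offset:offset + uop.size])
    return result


# Usage: place a compressed blob at address 10 and "execute" the instruction.
payload = b"abc" * 100
blob = zlib.compress(payload)
memory = bytes(10) + blob + bytes(5)
cache = {}
output = execute(decode(10, len(blob)), memory, cache)
```

In this model the decompress step never touches `memory` directly, which mirrors the abstract's point that the engine works on data already fetched into the core's cache.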
Opening claim text (preview).
The invention claimed is:
1. An apparatus comprising: decode circuitry to decode a decompression instruction into a first micro operation and a second micro operation, wherein the first micro operation is to cause one or more load operations to fetch data into one or more cachelines of a cache of a processor core; and Decompression Engine (DE) circuitry to decompress the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation, wherein the DE circuitry is to signal the processor core on completion of decompression of every cacheline.
2. The apparatus of claim 1, wherein the decompression instruction comprises a first operand to indicate a location of compressed data to be decompressed by the DE circuitry and a second operand to indicate a size of the compressed data to be decompressed by the DE circuitry.
3. The apparatus of claim 2, wherein the decompression instruction comprises a third operand to indicate a location to which decompressed data by the DE circuitry is to be stored and a fourth operand to indicate a size of the decompressed data.
4. The apparatus of claim 3, wherein one or more of the first operand and the third operand comprise a virtual memory address.
5. The apparatus of claim 1, wherein the second micro operation comprises a macro store operation to store the decompressed fetched data into the cache.
6. The apparatus of claim 1, wherein the cache of the processor core comprises a Level 2 (L2) cache.
7. The apparatus of claim 1, wherein a consumer bitmap is to indicate which cacheline of the cache corresponds to consumer instructions after completion of decompression of the cacheline.
8. The apparatus of claim 1, wherein the processor core, DE circuitry, and the cache are on a single integrated circuit die.
9. The apparatus of claim 8, wherein the processor core comprises a Graphics Processing Unit (GPU) core.
10. The apparatus of claim 1, wherein the decompression instruction is to cause the DE circuitry to perform an out-of-order decompression of the one or more cachelines.
11. The apparatus of claim 1, wherein each of the one or more cachelines are 64 Bytes.
12. One or more non-transitory computer-readable media comprising one or more instructions that when executed on at least one processor configure the at least one processor to perform one or more operations to: decode a decompression instruction into a first micro operation and a second micro operation, wherein the first micro operation is to cause one or more load operations to fetch data into one or more cachelines of a cache of a processor core; and cause Decompression Engine (DE) circuitry to decompress the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation, wherein the DE circuitry is to signal the processor core on completion of decompression of every cacheline.
13. The one or more computer-readable media of claim 12, wherein the decompression instruction comprises a first operand to indicate a location of compressed data to be decompressed by the DE circuitry and a second operand to indicate a size of the compressed data to be decompressed by the DE circuitry.
14. The one or more computer-readable media of claim 13, wherein the decompression instruction comprises a third operand to indicate a location to which decompressed data by the DE circuitry is to be stored and a fourth operand to indicate a size of the decompressed data.
15. The one or more computer-readable media of claim 14, wherein one or more of the first operand and the third operand comprise a virtual memory address.
16. The one or more computer-readable media of claim 12, wherein the second micro operation comprises a macro store operation to store the decompressed fetched data into the cache.
17. The one or more computer-readable media of claim 12, wherein the cache of the processor core comprises a Level 2 (L2) cache.
18. The one or more computer-readable media of claim 12, further comprising one or more instructions that when executed on the at least one processor configure the at least one processor to perform one or more operations to cause a consumer bitmap to indicate which cacheline of the cache corresponds to consumer instructions after completion of decompression of the cacheline.
19. The one or more computer-readable media of claim 12, further comprising one or more instructions that when executed on the at least one processor configure the at least one processor to perform one or more operations to cause the DE circuitry to perform an out-of-order decompression of the one or more cachelines.
20. The one or more computer-readable media of claim 12, wherein each of the one or more cachelines are 64 Bytes.
21. A method comprising: decoding a decompression instruction into a first micro operation and a second micro operation, wherein the first micro operation causes one or more load operations to fetch data into one or more cachelines of a cache of a processor core; and causing Decompression Engine (DE) circuitry to decompress the fetched data from the one or more cachelines of the cache of the processor core in response to the second micro operation, wherein the DE circuitry signals the processor core on completion of decompression of every cacheline.
22. The method of claim 21, wherein the decompression instruction comprises a first operand to indicate a location of compressed data to be decompressed by the DE circuitry and a second operand to indicate a size of the compressed data to be decompressed by the DE circuitry.
23. The method of claim 22, wherein the decompression instruction comprises a third operand to indicate a location to which decompressed data by the DE circuitry is to be stored and a fourth operand to indicate a size of the decompressed data.
24. The method of claim 23, wherein one or more of the first operand and the third operand comprise a virtual memory address.
25. The method of claim 21, wherein the second micro operation comprises a macro store operation to store the decompressed fetched data into the cache.
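The per-cacheline completion signal, the consumer bitmap, and out-of-order decompression recited in the claims can be modelled together in software. The Python below is a rough sketch under loud assumptions: `Core`, `DecompressionEngine`, and `compress_lines` are hypothetical names, each 64-byte output line is assumed to be compressed independently so that lines can finish in any order, zlib stands in for the unspecified compression format, and the bitmap is read as "bit i is set once line i is decompressed and may be used by consumer instructions", which is one plausible interpretation of the claim language rather than a confirmed one.

```python
import zlib

CACHELINE = 64  # bytes, matching the 64-byte cachelines in the claims


def compress_lines(data):
    """Hypothetical layout: compress each 64-byte output line on its own."""
    return [zlib.compress(data[i:i + CACHELINE])
            for i in range(0, len(data), CACHELINE)]


class Core:
    """Tracks which decompressed lines consumer instructions may use."""

    def __init__(self):
        self.consumer_bitmap = 0

    def line_done(self, index):
        # Called once per cacheline: the engine's completion signal.
        self.consumer_bitmap |= 1 << index

    def can_consume(self, index):
        return bool((self.consumer_bitmap >> index) & 1)


class DecompressionEngine:
    def __init__(self, signal):
        self.signal = signal  # callback standing in for the hardware signal

    def run(self, chunks, cache, dst, order):
        """Decompress the lines in the given (possibly out-of-order) order."""
        for index in order:
            cache[dst + index * CACHELINE] = zlib.decompress(chunks[index])
            self.signal(index)  # signal the core after every cacheline


# Usage: four output lines, completed out of order.
data = bytes(range(256))
chunks = compress_lines(data)
core = Core()
cache = {}
engine = DecompressionEngine(core.line_done)
engine.run(chunks, cache, dst=0x1000, order=[2, 0])     # lines 2 and 0 first
ready_early = [core.can_consume(i) for i in range(4)]
engine.run(chunks, cache, dst=0x1000, order=[3, 1])     # then the rest
```

The point of signalling per line rather than once at the end is that a consumer waiting only on line 2 can proceed as soon as that bit is set, without waiting for the whole buffer.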
Compressed data · CPC title
of parts of caches, e.g. directory or tag array · CPC title
with prefetch · CPC title
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title
Instruction prefetching · CPC title