Method and apparatus to use DRAM as a cache for slow byte-addressible memory for efficient cloud applications
US-12174739-B2 · Dec 24, 2024 · US
US9529719B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9529719-B2 |
| Application number | US-201213567066-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 5, 2012 |
| Priority date | Aug 5, 2012 |
| Publication date | Dec 27, 2016 |
| Grant date | Dec 27, 2016 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Apparatus and method embodiments for dynamically allocating cache space in a multi-threaded execution environment are disclosed. In some embodiments, a processor includes a cache shared by each of a plurality of processor cores and/or each of a plurality of threads executing on the processor. The processor further includes a cache allocation circuit configured to dynamically allocate space in the cache provided to each of the plurality of processor cores based on their respective usage patterns. The cache allocation unit may track cache usage by each of the processor cores/threads using subsets of usage bits and counters configured to update states of the usage bits. The cache allocation circuit may track the usage of cache space by the processor cores/threads and may allocate more space to those that exhibit more usage of the cache.
Opening claim text (preview).
What is claimed is: 1. A processor comprising: a cache shared by a plurality of processor cores; and a cache allocation circuit configured to dynamically allocate space in the cache provided to each of the plurality of processor cores based on usage patterns for each of the plurality of processor cores; wherein the cache includes a plurality of indexes each associated with a plurality of usage bits, wherein the plurality of usage bits for each index includes, for each of the plurality of processor cores, a corresponding subset of usage bits indicative of usage of the cache at that index by that processor core; wherein the cache further includes a plurality of ways in each of the plurality of indexes, wherein the cache allocation circuit is configured to allocate an equal number of ways in each index to each of the plurality of processor cores, and to allocate a predetermined number of ways in each index as being usable by any of the plurality of processor cores; and wherein the cache allocation circuit includes a plurality of counters, wherein each of the plurality of counters corresponds to a unique one of the plurality of processor cores, wherein each of the plurality of counters is configured to, responsive to an access of the cache by its corresponding processor core at a given index, update a value indicated by the subset of usage bits corresponding to that processor core and the given index. 2. The processor as recited in claim 1 , wherein each of the plurality of counters is configured to increment a value of its corresponding subset of usage bits at the given index responsive to its respective processor core accessing the cache at the given index. 3. The processor as recited in claim 2 , wherein each of the plurality of counters is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and minimum value equal to the minimum count, and wherein the cache allocation circuit is configured to, responsive to a cache access by a processor core at a given index in which a count of a corresponding subset of usage bits is saturated at its maximum count, decrement a count for each remaining subset of usage bits corresponding to that index. 4. The processor as recited in claim 3 , wherein the cache allocation circuit is configured to replace information in a cache line within a given index and allocated to a given processor core responsive to a cache miss within the given index when a count of a subset of usage bits corresponding to the given processor core and given index is saturated at the minimum count. 5. The processor as recited in claim 1 , wherein each of the plurality of counters is configured to increment a count of a subset of usage bits responsive to its respective processor core registering a hit during an access of the cache at an index corresponding to the subset of usage bits. 6. The processor as recited in claim 5 , wherein each of the plurality of counters is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and minimum value equal to the minimum count, and wherein the cache allocation circuit is configured to, responsive to a cache access by a processor core registering a hit at a given index in which a count of a corresponding subset of usage bits is saturated at its maximum count, decrement a count for each remaining subset of usage bits corresponding to that index. 7. A method comprising: executing, on each of a plurality of processor cores, a corresponding one of plurality of threads; accessing a cache shared by each of the plurality of processor cores during execution of each of the plurality of threads; and dynamically allocating space in the cache provided to each of the plurality processor cores based on usage patterns by each of the plurality of processor cores; wherein the cache includes a plurality of indexes each associated with a corresponding plurality of usage bits, and for a given one of the plurality of indexes, indicating usage of the cache by a given one of the plurality of processor cores using a corresponding subset of the plurality of usage bits; wherein the cache includes a plurality of ways in each of the plurality of indexes, a cache allocation circuit allocating an equal number of ways in each index to each of the plurality of processor cores and the cache allocation unit designating a predetermined number of ways in each index as being usable by any of the plurality of processor cores; and wherein the cache allocation circuit includes a plurality of counters, each corresponding to a unique one of the plurality of processor cores, each of the plurality of counters, responsive to an access of the cache by its corresponding processor core at a given index, updating a value indicated by the subset of usage bits corresponding to that processor core and the given index. 8. The method as recited in claim 7 , further comprising the counter incrementing the subset of the plurality of usage bits responsive to the given one of the plurality of processor cores accessing the cache at the given one of the plurality of indexes. 9. The method as recited in claim 8 , wherein the counter is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and a minimum value equal to the minimum count, and wherein the method further comprises decrementing counts for each subset of usage bits associated with each of the other cores responsive to a cache access by the given one of the processor cores when the count of the usage bits at that given index and associated with the given one of the processor cores is saturated at its maximum count. 10. The method as recited in claim 9 , further comprising replacing information in a cache line within the given one of the plurality of indexes and allocated to the given one of the plurality of processor cores responsive to a cache miss in that index when the count of the subset of usage bits is saturated at the minimum count. 11. The method as recited in claim 7 , further comprising incrementing the count of the subset of usage bits associated with the given one of the plurality of processor cores responsive to the given one of the plurality of processor cores registering a hit during an access of the cache at the given one of the plurality of indexes. 12. The method as recited in claim 11 , wherein the counter is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and a minimum value equal to the minimum count, and wherein the method further comprises decrementing counts for each subset of usage bits at the given index associated with each of the other cores responsive to a cache access by the given one of the processor cores registering a hit when the count of its corresponding subset of usage bits is saturated at its maximum count. 13. The method as recited in claim 7 , wherein said dynamically allocating is performed by a cache allocation circuit. 14. A processor comprising: a cache having a plurality of ways; and a cache controller circuit configured to dynamically allocate ways in the cache to each of a plurality of threads executing on the processor based on usage of the cache by each of the plurality of threads; and wherein: the cache further includes a plurality of indexes each associated with a corresponding plurality of usage bits, the plurality of usage bits is divided into subsets each correspond
with a shared cache · CPC title
Free address space management · CPC title
Allocation of cache space to multiple users or processors · CPC title
Correctness of operation, e.g. memory ordering · CPC title
the resource being the memory · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.