Dynamic multithreaded cache allocation

US9529719B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9529719-B2
Application numberUS-201213567066-A
CountryUS
Kind codeB2
Filing dateAug 5, 2012
Priority dateAug 5, 2012
Publication dateDec 27, 2016
Grant dateDec 27, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatus and method embodiments for dynamically allocating cache space in a multi-threaded execution environment are disclosed. In some embodiments, a processor includes a cache shared by each of a plurality of processor cores and/or each of a plurality of threads executing on the processor. The processor further includes a cache allocation circuit configured to dynamically allocate space in the cache provided to each of the plurality of processor cores based on their respective usage patterns. The cache allocation unit may track cache usage by each of the processor cores/threads using subsets of usage bits and counters configured to update states of the usage bits. The cache allocation circuit may track the usage of cache space by the processor cores/threads and may allocate more space to those that exhibit more usage of the cache.

First claim

Opening claim text (preview).

What is claimed is: 1. A processor comprising: a cache shared by a plurality of processor cores; and a cache allocation circuit configured to dynamically allocate space in the cache provided to each of the plurality of processor cores based on usage patterns for each of the plurality of processor cores; wherein the cache includes a plurality of indexes each associated with a plurality of usage bits, wherein the plurality of usage bits for each index includes, for each of the plurality of processor cores, a corresponding subset of usage bits indicative of usage of the cache at that index by that processor core; wherein the cache further includes a plurality of ways in each of the plurality of indexes, wherein the cache allocation circuit is configured to allocate an equal number of ways in each index to each of the plurality of processor cores, and to allocate a predetermined number of ways in each index as being usable by any of the plurality of processor cores; and wherein the cache allocation circuit includes a plurality of counters, wherein each of the plurality of counters corresponds to a unique one of the plurality of processor cores, wherein each of the plurality of counters is configured to, responsive to an access of the cache by its corresponding processor core at a given index, update a value indicated by the subset of usage bits corresponding to that processor core and the given index. 2. The processor as recited in claim 1 , wherein each of the plurality of counters is configured to increment a value of its corresponding subset of usage bits at the given index responsive to its respective processor core accessing the cache at the given index. 3. The processor as recited in claim 2 , wherein each of the plurality of counters is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and minimum value equal to the minimum count, and wherein the cache allocation circuit is configured to, responsive to a cache access by a processor core at a given index in which a count of a corresponding subset of usage bits is saturated at its maximum count, decrement a count for each remaining subset of usage bits corresponding to that index. 4. The processor as recited in claim 3 , wherein the cache allocation circuit is configured to replace information in a cache line within a given index and allocated to a given processor core responsive to a cache miss within the given index when a count of a subset of usage bits corresponding to the given processor core and given index is saturated at the minimum count. 5. The processor as recited in claim 1 , wherein each of the plurality of counters is configured to increment a count of a subset of usage bits responsive to its respective processor core registering a hit during an access of the cache at an index corresponding to the subset of usage bits. 6. The processor as recited in claim 5 , wherein each of the plurality of counters is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and minimum value equal to the minimum count, and wherein the cache allocation circuit is configured to, responsive to a cache access by a processor core registering a hit at a given index in which a count of a corresponding subset of usage bits is saturated at its maximum count, decrement a count for each remaining subset of usage bits corresponding to that index. 7. A method comprising: executing, on each of a plurality of processor cores, a corresponding one of plurality of threads; accessing a cache shared by each of the plurality of processor cores during execution of each of the plurality of threads; and dynamically allocating space in the cache provided to each of the plurality processor cores based on usage patterns by each of the plurality of processor cores; wherein the cache includes a plurality of indexes each associated with a corresponding plurality of usage bits, and for a given one of the plurality of indexes, indicating usage of the cache by a given one of the plurality of processor cores using a corresponding subset of the plurality of usage bits; wherein the cache includes a plurality of ways in each of the plurality of indexes, a cache allocation circuit allocating an equal number of ways in each index to each of the plurality of processor cores and the cache allocation unit designating a predetermined number of ways in each index as being usable by any of the plurality of processor cores; and wherein the cache allocation circuit includes a plurality of counters, each corresponding to a unique one of the plurality of processor cores, each of the plurality of counters, responsive to an access of the cache by its corresponding processor core at a given index, updating a value indicated by the subset of usage bits corresponding to that processor core and the given index. 8. The method as recited in claim 7 , further comprising the counter incrementing the subset of the plurality of usage bits responsive to the given one of the plurality of processor cores accessing the cache at the given one of the plurality of indexes. 9. The method as recited in claim 8 , wherein the counter is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and a minimum value equal to the minimum count, and wherein the method further comprises decrementing counts for each subset of usage bits associated with each of the other cores responsive to a cache access by the given one of the processor cores when the count of the usage bits at that given index and associated with the given one of the processor cores is saturated at its maximum count. 10. The method as recited in claim 9 , further comprising replacing information in a cache line within the given one of the plurality of indexes and allocated to the given one of the plurality of processor cores responsive to a cache miss in that index when the count of the subset of usage bits is saturated at the minimum count. 11. The method as recited in claim 7 , further comprising incrementing the count of the subset of usage bits associated with the given one of the plurality of processor cores responsive to the given one of the plurality of processor cores registering a hit during an access of the cache at the given one of the plurality of indexes. 12. The method as recited in claim 11 , wherein the counter is a saturating counter having a maximum count and a minimum count and wherein each subset of usage bits has a maximum value equal to the maximum count and a minimum value equal to the minimum count, and wherein the method further comprises decrementing counts for each subset of usage bits at the given index associated with each of the other cores responsive to a cache access by the given one of the processor cores registering a hit when the count of its corresponding subset of usage bits is saturated at its maximum count. 13. The method as recited in claim 7 , wherein said dynamically allocating is performed by a cache allocation circuit. 14. A processor comprising: a cache having a plurality of ways; and a cache controller circuit configured to dynamically allocate ways in the cache to each of a plurality of threads executing on the processor based on usage of the cache by each of the plurality of threads; and wherein: the cache further includes a plurality of indexes each associated with a corresponding plurality of usage bits, the plurality of usage bits is divided into subsets each correspond

Assignees

Inventors

Classifications

  • G06F12/084Primary

    with a shared cache · CPC title

  • G06F12/023Primary

    Free address space management · CPC title

  • Allocation of cache space to multiple users or processors · CPC title

  • Correctness of operation, e.g. memory ordering · CPC title

  • the resource being the memory · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9529719B2 cover?
Apparatus and method embodiments for dynamically allocating cache space in a multi-threaded execution environment are disclosed. In some embodiments, a processor includes a cache shared by each of a plurality of processor cores and/or each of a plurality of threads executing on the processor. The processor further includes a cache allocation circuit configured to dynamically allocate space in t…
Who is the assignee on this patent?
Walker William L, Advanced Micro Devices Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/084. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Dec 27 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).