Apparatus and method for determining a sector division ratio of a shared cache memory
US-2015339229-A1 · Nov 26, 2015 · US
US12056059B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12056059-B2 |
| Application number | US-202217590362-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 1, 2022 |
| Priority date | Mar 15, 2019 |
| Publication date | Aug 6, 2024 |
| Grant date | Aug 6, 2024 |
Official abstract text for this publication.
Systems and methods for cache utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory is configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
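The aging behavior summarized in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The class name `AgingCache`, the 2-bit width of the aging field, the `age_hint` parameter, and the convention that a higher starting age means earlier eviction are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class AgingCache:
    """Toy fully associative cache in which each line carries an aging field."""
    capacity: int
    initial_age: int = 0   # hypothetical initial aging policy set by the controller
    max_age: int = 3       # assumed 2-bit aging field, so ages saturate at 3
    ages: Dict[int, int] = field(default_factory=dict)  # tag -> current age

    def access(self, tag: int, age_hint: Optional[int] = None) -> Optional[int]:
        """Touch `tag` and return the evicted tag, or None if nothing was evicted."""
        evicted = None
        if tag not in self.ages and len(self.ages) >= self.capacity:
            # Evict the line with the highest age; ties go to the earliest insert.
            evicted = max(self.ages, key=self.ages.get)
            del self.ages[evicted]
        # Every other resident line grows older, saturating at max_age.
        for other in self.ages:
            if other != tag:
                self.ages[other] = min(self.ages[other] + 1, self.max_age)
        # A hint, when present, overrides the initial aging policy for this line.
        start = self.initial_age if age_hint is None else age_hint
        self.ages[tag] = max(0, min(start, self.max_age))
        return evicted
```

In this sketch, a line filled with `age_hint=3` (for example, streaming data that will not be reused) becomes the first eviction candidate even though it was touched most recently, while lines filled under the initial policy persist longer.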
Opening claim text (preview).
What is claimed is:
1. A graphics processing unit comprising: a graphics multiprocessor comprising: a plurality of processing resources including a first set of processing cores and a second set of processing cores, wherein the first set of processing cores includes dedicated tensor processing circuitry to perform a plurality of matrix operations in response to an instruction and the second set of processing cores includes circuitry to execute instructions to perform integer and floating-point operations; and a cache memory coupled with the plurality of processing resources, wherein the cache memory is a graphics processor cache memory that is configured to be partitioned into multiple cache regions, the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence, and the first cache region is to be configured to a specified level of data persistence in response to an input including the specified level of data persistence.
2. The graphics processing unit as in claim 1, wherein the input includes a hint associated with an instruction.
3. The graphics processing unit as in claim 1, wherein cache memory is configured to be partitioned based on workloads processed by the plurality of processing resources.
4. The graphics processing unit as in claim 1, wherein the cache memory is a level 2 (L2) cache memory.
5. The graphics processing unit as in claim 1, wherein the multiple cache regions include a second cache region having a default cache eviction policy.
6. The graphics processing unit as in claim 5, wherein the first cache region is associated with a first region of memory and the second cache region is associated with a second region of memory, the first cache region to be configured in response to the input including the specified level of data persistence.
7. The graphics processing unit as in claim 1, wherein the circuitry to execute instructions to perform integer and floating-point operations includes first circuitry to perform integer operations, second circuitry to perform 32-bit floating-point operations, and third circuitry to perform 64-bit floating-point operations.
8. A method comprising: executing a first instruction on a graphics multiprocessor of a graphics processing unit to perform a plurality of matrix operations on dedicated tensor processing circuitry of a first set of processing cores of the graphics multiprocessor; executing a second instruction on the graphics multiprocessor to perform an integer operation on a second set of processing cores; executing a third instruction on the graphics multiprocessor to perform a floating-point operation on the second set of processing cores; partitioning a cache memory of the graphics processing unit into multiple cache regions, wherein the cache memory is a graphics processor cache memory that is coupled with the first set of processing cores and the second set of processing cores, and the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence; caching data associated with the first instruction, second instruction, or third instruction within the first cache region of the cache memory based on a workload associated with the respective instruction; and configuring the first cache region to a specified level of data persistence in response to an input including the specified level of data persistence.
9. The method as in claim 8, wherein the input includes a hint associated with a fourth instruction.
10. The method as in claim 8, wherein the cache memory is a level 2 (L2) cache memory.
11. The method as in claim 8, wherein the multiple cache regions include a second cache region having a default cache eviction policy.
12. The method as in claim 11, further comprising configuring the first cache region to cache data associated with a first region of memory and configuring the second cache region to cache data associated with a second region of memory, the first cache region configured in response to the input including the specified level of data persistence.
13. The method as in claim 8, further comprising: executing the second instruction via first circuitry of the second set of processing cores that is configured to perform integer operations; executing the third instruction via second circuitry of the second set of processing cores that is configured to perform 32-bit floating-point operations; and executing a fourth instruction via third circuitry of the second set of processing cores that is configured to perform 64-bit floating-point operations.
14. A data processing system comprising: a memory device; and a graphics processing unit (GPU) coupled with the memory device, the GPU including a graphics multiprocessor, the graphics multiprocessor comprising: a plurality of processing resources including a first set of processing cores and a second set of processing cores, wherein the first set of processing cores includes dedicated tensor processing circuitry perform a plurality of matrix operations in response to an instruction and the second set of processing cores includes circuitry to execute instructions to perform integer and floating-point operations; and a cache memory coupled with the plurality of processing resources, wherein the cache memory is a graphics processor cache memory that is configured to be partitioned into multiple cache regions, the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence, and the first cache region is configurable to a specified level of data persistence in response to an input including the specified level of data persistence.
15. The data processing system as in claim 14, wherein the first cache region is configurable to the specified level of data persistence input includes a hint associated with an instruction.
16. The data processing system as in claim 14, wherein the cache memory is configured to be partitioned based on workloads processed by the plurality of processing resources.
17. The data processing system as in claim 14, wherein the cache memory is a level 2 (L2) cache memory.
18. The data processing system as in claim 14, wherein the multiple cache regions include a second cache region having a default cache eviction policy.
19. The data processing system as in claim 18, wherein the first cache region is associated with a first region of memory and the second cache region is associated with a second region of memory, the first cache region to be configured in response to the input including the specified level of data persistence.
20. The data processing system as in claim 14, wherein the circuitry to execute instructions to perform integer and floating-point operations includes first circuitry to perform integer operations, second circuitry to perform 32-bit floating-point operations, and third circuitry to perform 64-bit floating-point operations.
21. The data processing system as in claim 14, wherein the input includes an instruction.
22. The graphics processing unit as in claim 1, wherein the input includes an instruction.
23. The method as in claim 8, wherein the input includes a fourth instruction.
Memory management · CPC title
In image processor or graphics adapter · CPC title
using clearing, invalidating or resetting means · CPC title
with dedicated cache, e.g. instruction or stack · CPC title
with special data handling, e.g. priority of data or instructions, handling errors or pinning · CPC title