Systems and methods for cache optimization

US12056059B2 · US · B2

Patent metadata
Publication number: US-12056059-B2
Application number: US-202217590362-A
Country: US
Kind code: B2
Filing date: Feb 1, 2022
Priority date: Mar 15, 2019
Publication date: Aug 6, 2024
Grant date: Aug 6, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for cache utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received. In one embodiment, the cache memory configured to be partitioned into multiple cache regions, wherein the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence.
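The aging behavior described in the abstract can be illustrated with a small sketch. This is not the patented mechanism: the abstract does not say how the aging field is encoded or how eviction uses it, so the code below swaps in a simple age-counter scheme as an assumption. The names `AgingCache`, `MAX_AGE`, `initial_age`, and `age_hint` are hypothetical, as is the 2-bit field width.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

MAX_AGE = 3  # assumption: 2-bit aging field, 0 = youngest, 3 = evict first


@dataclass
class AgingCache:
    """Toy fully associative cache; capacity is assumed to be >= 1."""
    capacity: int
    initial_age: int = 2  # assumption: the "initial aging policy" for new lines
    ages: Dict[int, int] = field(default_factory=dict)  # line tag -> aging field

    def _evict(self) -> int:
        # Age all lines until one reaches MAX_AGE, then evict that line.
        while True:
            victim = next((t for t, a in self.ages.items() if a >= MAX_AGE), None)
            if victim is not None:
                del self.ages[victim]
                return victim
            for tag in self.ages:
                self.ages[tag] += 1

    def access(self, tag: int, age_hint: Optional[int] = None) -> Optional[int]:
        """Touch a line and return the evicted tag, if any.

        age_hint models a hint or instruction indicating a level of aging;
        when absent, the initial policy applies.
        """
        hinted = None if age_hint is None else max(0, min(MAX_AGE, age_hint))
        if tag in self.ages:
            # Hit: the line becomes youngest unless a hint says otherwise.
            self.ages[tag] = 0 if hinted is None else hinted
            return None
        evicted = self._evict() if len(self.ages) >= self.capacity else None
        self.ages[tag] = self.initial_age if hinted is None else hinted
        return evicted
```

In this sketch, a line filled with `age_hint=0` outlives a line filled under the default policy, because the default line reaches `MAX_AGE` first when the cache needs a victim.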

First claim

Opening claim text (preview).

What is claimed is:

1. A graphics processing unit comprising: a graphics multiprocessor comprising: a plurality of processing resources including a first set of processing cores and a second set of processing cores, wherein the first set of processing cores includes dedicated tensor processing circuitry to perform a plurality of matrix operations in response to an instruction and the second set of processing cores includes circuitry to execute instructions to perform integer and floating-point operations; and a cache memory coupled with the plurality of processing resources, wherein the cache memory is a graphics processor cache memory that is configured to be partitioned into multiple cache regions, the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence, and the first cache region is to be configured to a specified level of data persistence in response to an input including the specified level of data persistence.

2. The graphics processing unit as in claim 1, wherein the input includes a hint associated with an instruction.

3. The graphics processing unit as in claim 1, wherein cache memory is configured to be partitioned based on workloads processed by the plurality of processing resources.

4. The graphics processing unit as in claim 1, wherein the cache memory is a level 2 (L2) cache memory.

5. The graphics processing unit as in claim 1, wherein the multiple cache regions include a second cache region having a default cache eviction policy.

6. The graphics processing unit as in claim 5, wherein the first cache region is associated with a first region of memory and the second cache region is associated with a second region of memory, the first cache region to be configured in response to the input including the specified level of data persistence.

7. The graphics processing unit as in claim 1, wherein the circuitry to execute instructions to perform integer and floating-point operations includes first circuitry to perform integer operations, second circuitry to perform 32-bit floating-point operations, and third circuitry to perform 64-bit floating-point operations.

8. A method comprising: executing a first instruction on a graphics multiprocessor of a graphics processing unit to perform a plurality of matrix operations on dedicated tensor processing circuitry of a first set of processing cores of the graphics multiprocessor; executing a second instruction on the graphics multiprocessor to perform an integer operation on a second set of processing cores; executing a third instruction on the graphics multiprocessor to perform a floating-point operation on the second set of processing cores; partitioning a cache memory of the graphics processing unit into multiple cache regions, wherein the cache memory is a graphics processor cache memory that is coupled with the first set of processing cores and the second set of processing cores, and the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence; caching data associated with the first instruction, second instruction, or third instruction within the first cache region of the cache memory based on a workload associated with the respective instruction; and configuring the first cache region to a specified level of data persistence in response to an input including the specified level of data persistence.

9. The method as in claim 8, wherein the input includes a hint associated with a fourth instruction.

10. The method as in claim 8, wherein the cache memory is a level 2 (L2) cache memory.

11. The method as in claim 8, wherein the multiple cache regions include a second cache region having a default cache eviction policy.

12. The method as in claim 11, further comprising configuring the first cache region to cache data associated with a first region of memory and configuring the second cache region to cache data associated with a second region of memory, the first cache region configured in response to the input including the specified level of data persistence.

13. The method as in claim 8, further comprising: executing the second instruction via first circuitry of the second set of processing cores that is configured to perform integer operations; executing the third instruction via second circuitry of the second set of processing cores that is configured to perform 32-bit floating-point operations; and executing a fourth instruction via third circuitry of the second set of processing cores that is configured to perform 64-bit floating-point operations.

14. A data processing system comprising: a memory device; and a graphics processing unit (GPU) coupled with the memory device, the GPU including a graphics multiprocessor, the graphics multiprocessor comprising: a plurality of processing resources including a first set of processing cores and a second set of processing cores, wherein the first set of processing cores includes dedicated tensor processing circuitry perform a plurality of matrix operations in response to an instruction and the second set of processing cores includes circuitry to execute instructions to perform integer and floating-point operations; and a cache memory coupled with the plurality of processing resources, wherein the cache memory is a graphics processor cache memory that is configured to be partitioned into multiple cache regions, the multiple cache regions include a first cache region having a cache eviction policy with a configurable level of data persistence, and the first cache region is configurable to a specified level of data persistence in response to an input including the specified level of data persistence.

15. The data processing system as in claim 14, wherein the first cache region is configurable to the specified level of data persistence input includes a hint associated with an instruction.

16. The data processing system as in claim 14, wherein the cache memory is configured to be partitioned based on workloads processed by the plurality of processing resources.

17. The data processing system as in claim 14, wherein the cache memory is a level 2 (L2) cache memory.

18. The data processing system as in claim 14, wherein the multiple cache regions include a second cache region having a default cache eviction policy.

19. The data processing system as in claim 18, wherein the first cache region is associated with a first region of memory and the second cache region is associated with a second region of memory, the first cache region to be configured in response to the input including the specified level of data persistence.

20. The data processing system as in claim 14, wherein the circuitry to execute instructions to perform integer and floating-point operations includes first circuitry to perform integer operations, second circuitry to perform 32-bit floating-point operations, and third circuitry to perform 64-bit floating-point operations.

21. The data processing system as in claim 14, wherein the input includes an instruction.

22. The graphics processing unit as in claim 1, wherein the input includes an instruction.

23. The method as in claim 8, wherein the input includes a fourth instruction.
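As a rough software model of the partitioning in claims 1, 5, and 6, the sketch below splits a cache into regions tied to memory address ranges, gives one region a configurable persistence level, and leaves the other on a default policy. The claims do not define how a persistence level changes eviction, so the code makes an assumption: a line is protected from eviction until more than `persistence` region-local accesses have passed since it was last touched, and a fill bypasses the region when every resident line is still protected. Plain LRU stands in for the default policy. All names (`CacheRegion`, `PartitionedCache`, `configure`) are hypothetical.

```python
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import List


@dataclass
class CacheRegion:
    addr_range: range          # memory region served by this cache region
    capacity: int              # number of lines the region can hold
    persistence: int = 0       # 0 = default eviction policy (plain LRU)
    clock: int = 0             # region-local access counter
    # address -> tick of last touch, ordered least recently used first
    lines: "OrderedDict[int, int]" = field(default_factory=OrderedDict)

    def access(self, addr: int) -> bool:
        """Return True on a hit, False on a miss (fill or bypass)."""
        self.clock += 1
        if addr in self.lines:
            self.lines[addr] = self.clock
            self.lines.move_to_end(addr)
            return True
        if len(self.lines) >= self.capacity:
            # Pick the least recently used line whose protection has expired.
            victim = next(
                (a for a, t in self.lines.items()
                 if self.clock - t > self.persistence),
                None,
            )
            if victim is None:
                return False   # all lines still protected: bypass the region
            del self.lines[victim]
        self.lines[addr] = self.clock
        return False


class PartitionedCache:
    def __init__(self, regions: List[CacheRegion]) -> None:
        self.regions = regions

    def configure(self, region_index: int, persistence: int) -> None:
        # Models the "input including the specified level of data persistence".
        self.regions[region_index].persistence = max(0, persistence)

    def access(self, addr: int) -> bool:
        for region in self.regions:
            if addr in region.addr_range:
                return region.access(addr)
        raise KeyError(f"address {addr} is not mapped to any cache region")
```

With this model, raising the persistence of the first region keeps its resident lines in place under a burst of new addresses, while the second region continues to evict in LRU order.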

Assignees

Intel Corp

Inventors

Classifications

  • Memory management · CPC title

  • In image processor or graphics adapter · CPC title

  • using clearing, invalidating or resetting means · CPC title

  • with dedicated cache, e.g. instruction or stack · CPC title

  • with special data handling, e.g. priority of data or instructions, handling errors or pinning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12056059B2 cover?
Systems and methods for cache utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determi…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/123. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 6, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).