Cache memory control in electronic device
US-2015261683-A1 · Sep 17, 2015 · US
US12572997B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12572997-B2 |
| Application number | US-202318305904-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 24, 2023 |
| Priority date | Nov 15, 2019 |
| Publication date | Mar 10, 2026 |
| Grant date | Mar 10, 2026 |
Official abstract text for this publication.
Embodiments described herein are generally directed to improvements relating to power, latency, bandwidth and/or performance issues relating to GPU processing/caching. According to one embodiment, a state of multiple intellectual property (IP) cores that have access to a common cache via a central fabric is observed. Responsive to the observed state being indicative of performance of a standalone workload by a first IP core of the multiple IP cores, the common cache is treated as a local cache of the first IP core by powering off the central fabric and causing the first IP core to access the common cache via a low power access path between the first IP core and the common cache that is outside of the central fabric.
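The decision described in the abstract can be sketched as a small policy function. This is only an illustrative model, not the patented implementation: the names `IPCore`, `AccessPath`, `FabricDecision`, and `decide_fabric_state`, along with the `standalone_workload` flag, are hypothetical and do not appear in the publication. The sketch assumes the fabric may be powered off either when the first IP core is the only active core or when its workload is standalone.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class AccessPath(Enum):
    FABRIC = "fabric"        # normal path through the central fabric
    LOW_POWER = "low_power"  # direct path outside the fabric


@dataclass
class IPCore:
    name: str
    active: bool
    # Hypothetical flag: True when the core's workload needs no
    # communication with any other IP core.
    standalone_workload: bool = False


@dataclass
class FabricDecision:
    fabric_powered: bool
    path: AccessPath


def decide_fabric_state(first: IPCore, others: Sequence[IPCore]) -> FabricDecision:
    """Choose whether the fabric stays on, based on the observed core state."""
    only_first_active = first.active and not any(c.active for c in others)
    standalone = first.active and first.standalone_workload
    if only_first_active or standalone:
        # Treat the common cache as a local cache of the first IP core.
        return FabricDecision(fabric_powered=False, path=AccessPath.LOW_POWER)
    return FabricDecision(fabric_powered=True, path=AccessPath.FABRIC)


media = IPCore("media", active=True, standalone_workload=True)
idle_cores = [IPCore("render", active=False), IPCore("compute", active=False)]
decision = decide_fabric_state(media, idle_cores)
```

With a media core decoding video while the other cores are idle, `decision` selects the low-power path with the fabric powered off. If the first core is active on a non-standalone workload and another core is also active, the fabric stays on.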
Opening claim text (preview).
What is claimed is:
1. A graphics processing unit (GPU) comprising: a fabric; a cache coupled to the fabric; a plurality of intellectual property (IP) cores coupled to the cache via the fabric; a low-power access path outside of the fabric coupling a first IP core of the plurality of IP cores to the cache; and wherein the fabric is operable to be selectively powered off or powered on depending upon (i) a status of each of the plurality of IP cores or (ii) a workload being processed by the first IP core.
2. The GPU of claim 1, the fabric is powered off when the status of the first IP core is active and the status of all other of the plurality of IP cores is inactive or when the workload comprises a standalone workload that does not involve communication between the first IP core and any other of the plurality of IP cores.
3. The GPU of claim 2, wherein first IP core comprises a media IP core.
4. The GPU of claim 3, wherein the standalone workload comprises media decoding.
5. The GPU of claim 3, wherein the standalone workload comprises media encoding.
6. The GPU of claim 3, wherein the standalone workload comprises media transcoding.
7. A method comprising: determining, by a graphics processing unit (GPU), a state of a plurality of intellectual property (IP) cores that have access to a common cache via a central fabric, wherein the state is indicative of performance of a standalone workload by a first IP core of the plurality of IP cores; and treating the common cache as a local cache of the first IP core by powering off the central fabric and causing the first IP core to access the common cache via a low-power access path between the first IP core and the common cache, wherein the low-power access path is outside of the central fabric.
8. The method of claim 7, wherein the standalone workload does not involve communication between the first IP core and any other of the plurality of IP cores.
9. The method of claim 7, wherein the first IP core comprises a media IP core and wherein the standalone workload comprises media decoding, media encoding or media transcoding.
10. The method of claim 7, wherein the state comprises the first IP core being active and all other IP cores of the plurality of IP cores being inactive.
11. A graphics processing unit (GPU) comprising: a plurality of cache banks; a plurality of shader modules coupled to the plurality of cache banks; and wherein each cache bank of the plurality of cache banks is reconfigurable to operate as part of a global last level cache or as a local cache for a particular shader module of the plurality of shader modules based on at least one of a workload demand on the particular shader module and a distance between the cache bank and the particular shader module.
12. The GPU of claim 11, wherein the GPU comprises a die-stacked GPU including one or more top dies and a base die.
13. The GPU of claim 12, wherein the one or more top dies include a plurality of chiplets containing the shader modules.
14. The GPU of claim 12, wherein the base die includes the plurality of cache banks.
15. The GPU of claim 11, wherein the plurality of cache banks comprise level 2 (L2) cache banks.
16. The GPU of claim 11, wherein the plurality of cache banks comprise level 3 (L3) cache banks.
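The bank reconfiguration recited in claim 11 could be modeled, very roughly, as a per-bank choice between "global last level cache" and "local cache for one shader module". The sketch below is an assumption-laden illustration, not the claimed mechanism: the one-dimensional `position` coordinate, the normalized `demand` value, the `demand_threshold` and `max_distance` parameters, and the function name `configure_banks` are all hypothetical, and the claims do not specify how demand or distance are measured or combined.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass(frozen=True)
class ShaderModule:
    name: str
    position: int   # hypothetical 1-D placement coordinate
    demand: float   # hypothetical normalized workload demand in [0, 1]


def configure_banks(
    bank_positions: Sequence[int],
    shaders: Sequence[ShaderModule],
    demand_threshold: float = 0.75,  # assumed cutoff for "high demand"
    max_distance: int = 1,           # assumed cutoff for "nearby"
) -> Dict[int, Optional[str]]:
    """Map each bank index to a shader name (local cache) or None (global LLC)."""
    config: Dict[int, Optional[str]] = {}
    for idx, pos in enumerate(bank_positions):
        # A bank becomes local only for a nearby shader with high demand.
        candidates = [
            s for s in shaders
            if s.demand >= demand_threshold and abs(s.position - pos) <= max_distance
        ]
        if candidates:
            # Prefer the closest shader; break ties by higher demand.
            best = min(candidates, key=lambda s: (abs(s.position - pos), -s.demand))
            config[idx] = best.name
        else:
            config[idx] = None
    return config


shaders = [ShaderModule("A", position=0, demand=0.9),
           ShaderModule("B", position=5, demand=0.2)]
bank_config = configure_banks([0, 1, 4, 5], shaders)
```

In this example, the two banks near the heavily loaded shader `A` become its local cache, while the banks near the lightly loaded shader `B` remain part of the global last level cache.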
Inference or reasoning models · CPC title
Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches · CPC title
Local memory within processor subsystem · CPC title
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Machine learning · CPC title