System cache optimizations for deep learning compute engines
US-2021349835-A1 · Nov 11, 2021 · US
US11914525B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11914525-B2 |
| Application number | US-202318168703-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 14, 2023 |
| Priority date | Apr 24, 2017 |
| Publication date | Feb 27, 2024 |
| Grant date | Feb 27, 2024 |
Official abstract text for this publication.
In an example, an apparatus comprises a plurality of compute engines; and logic, at least partially including hardware logic, to detect a cache line conflict in a last-level cache (LLC) communicatively coupled to the plurality of compute engines; and implement context-based eviction policy to determine a cache way in the cache to evict in order to resolve the cache line conflict. Other embodiments are also disclosed and claimed.
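The abstract does not spell out how the context-based eviction policy picks a way. The sketch below shows one possible reading, not the patented method. It assumes that each cached line records the context ID of the client that filled it, that a hypothetical `context_priority` table ranks contexts, and that least-recently-used order breaks ties. The names `Way`, `CacheSet`, and `context_priority` are illustrative and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Way:
    tag: Optional[int] = None         # None means the way is empty
    context_id: Optional[int] = None  # context that filled this way
    last_use: int = 0                 # logical timestamp used for LRU tie-breaking


class CacheSet:
    """One set of a set-associative cache with a context-aware victim choice."""

    def __init__(self, num_ways: int, context_priority: Dict[int, int]):
        self.ways: List[Way] = [Way() for _ in range(num_ways)]
        # Assumption: a higher number means the context's lines are kept longer.
        self.context_priority = context_priority
        self.clock = 0

    def access(self, tag: int, context_id: int) -> Optional[int]:
        """Access a line; return the evicted tag, or None if nothing was evicted."""
        self.clock += 1
        # Hit: refresh the timestamp and return.
        for way in self.ways:
            if way.tag == tag:
                way.last_use = self.clock
                return None
        # Miss with a free way: fill it without evicting anything.
        for way in self.ways:
            if way.tag is None:
                way.tag, way.context_id, way.last_use = tag, context_id, self.clock
                return None
        # Conflict: every way is occupied. Evict from the lowest-priority
        # context first, and the least recently used line within that context.
        victim = min(
            self.ways,
            key=lambda w: (self.context_priority.get(w.context_id, 0), w.last_use),
        )
        evicted = victim.tag
        victim.tag, victim.context_id, victim.last_use = tag, context_id, self.clock
        return evicted
```

With two ways, a high-priority context 1 and a low-priority context 2, filling line A (context 1) and then line B (context 2) and then requesting a third line evicts B, whereas plain LRU would have evicted the older line A.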
Opening claim text (preview).
The invention claimed is:
1. An apparatus comprising: deep learning (DL) hardware circuitry communicatively coupled to a last-level cache (LLC) by an interconnect, wherein the DL hardware circuitry to execute one or more layers of a DL network using the LLC; and a system cache controller communicably coupled to the DL hardware circuitry and the LLC, the system cache controller to: receive, from the DL hardware circuitry, a cache access request; responsive to a cache line conflict in the LLC for the cache access request, implement a context-based eviction policy to determine a cache way in the LLC to evict in order to resolve the cache line conflict; and responsive to a cache miss for the cache access request, select a variable cache line size to allocate and fill from the LLC, wherein the variable cache line size is selected based on a total size of data indicated in metadata received with the cache access request and based on data utilization of the LLC by the DL network.
2. The apparatus of claim 1, wherein the cache access request comprises a context identifier (ID) corresponding to the DL hardware circuitry that originated the cache access request and the metadata that indicates the total size of data to be accessed in one or more subsequent data access transactions; and wherein the system cache controller is further to: assign one or more of the DL hardware circuitry as clients of the LLC; and assign the context ID to the clients of the LLC.
3. The apparatus of claim 2, wherein the context-based eviction policy is a function of the context identifier.
4. The apparatus of claim 2, wherein the LLC may be reconfigured dynamically into a plurality of individually addressable caches.
5. The apparatus of claim 4, wherein the LLC may be reconfigured with a variable cache size.
6. A system comprising: a last-level cache (LLC); a processor having deep learning (DL) hardware circuitry communicatively coupled to the LLC by an interconnect, wherein the DL hardware circuitry execute one or more layers of a DL network using the LLC; and a system cache controller communicably coupled to the DL hardware circuitry and the LLC, the system cache controller to: receive, from the DL hardware circuitry, a cache access request; responsive to a cache line conflict in the LLC for the cache access request, implement a context-based eviction policy to determine a cache way in the LLC to evict in order to resolve the cache line conflict; and responsive to a cache miss for the cache access request, select a variable cache line size to allocate and fill from the LLC, wherein the variable cache line size is selected based on a total size of data indicated in metadata received with the cache access request and based on data utilization of the LLC by the DL network.
7. The system of claim 6, wherein the cache access request comprises a context identifier (ID) corresponding to the DL hardware circuitry that originated the cache access request and the metadata that indicates the total size of data to be accessed in one or more subsequent data access transactions; and wherein the system cache controller is further to: assign one or more of the DL hardware circuitry as clients of the LLC; and assign the context ID to the clients of the LLC.
8. The system of claim 7, wherein the Context-based eviction policy is a function of the context identifier.
9. The system of claim 7, wherein the LLC may be reconfigured dynamically into a plurality of individually addressable caches.
10. The system of claim 9, wherein the LLC may be reconfigured with a variable cache size.
11. A method comprising: receive, from deep learning (DL) hardware circuitry communicatively coupled to a last-level cache (LLC) by an interconnect, a plurality of cache access requests; responsive to a cache line conflict in the LLC for a cache access request of the plurality of cache access requests, implementing a context-based eviction policy to determine a cache way in the LLC to evict in order to resolve the cache line conflict; and responsive to a cache miss for the cache access request, selecting a variable cache line size to allocate and fill from the LLC, wherein the variable cache line size is selected based on a total size of data indicated in metadata received with the cache access request and based on data utilization of the LLC by a DL network.
12. The method of claim 11, wherein the cache access request comprises a context identifier (ID) corresponding to the DL hardware circuitry that originated the cache access request and the metadata that indicates the total size of data to be accessed in one or more subsequent data access transactions.
13. The method of claim 12, further comprising: assigning one or more of the DL hardware circuitry as clients of the LLC; and assigning the context ID to the clients of the LLC.
14. The method of claim 12, wherein the context-based eviction policy is a function of the context identifier.
15. The method of claim 12, wherein the LLC may be reconfigured dynamically into a plurality of individually addressable caches.
16. The method of claim 15, wherein the LLC may be reconfigured with a variable cache size.
17. A non-transitory computer-readable medium comprising one or more instructions that when executed on at least one processor configure the at least one processor to perform one or more operations to: receive, from a deep learning (DL) hardware circuitry communicatively coupled to a last-level cache (LLC) by an interconnect, a plurality of cache access requests; responsive to a cache line conflict in the LLC for a cache access request of the plurality of cache access requests, implementing a context-based eviction policy to determine a cache way in the LLC to evict in order to resolve the cache line conflict; and responsive to a cache miss for the cache access request, selecting a variable cache line size to allocate and fill from the LLC, wherein the variable cache line size is selected based on a total size of data indicated in metadata received with the cache access request and based on data utilization of the LLC by a DL network.
18. The non-transitory computer-readable medium of claim 17, wherein the cache access request comprises a context identifier (ID) corresponding to the DL hardware circuitry that originated the cache access request and the metadata that indicates the total size of data to be accessed in one or more subsequent data access transactions.
19. The non-transitory computer-readable medium of claim 18, the one or more operations are further to: assign one or more of the DL hardware circuitry as clients of the LLC; and assign the context ID to the clients of the LLC.
20. The non-transitory computer-readable medium of claim 18, wherein the context-based eviction policy is a function of the context identifier.
21. The non-transitory computer-readable medium of claim 18, wherein the LLC may be reconfigured dynamically into a plurality of individually addressable caches.
22. The non-transitory computer-readable medium of claim 18, wherein the LLC may be reconfigured with a variable cache size.
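The independent claims recite selecting a variable cache line size on a miss from two inputs: the total data size carried in the request metadata and the data utilization of the LLC by the DL network. The claims do not give a selection rule, so the following is only a sketch under stated assumptions: the supported line sizes, the 0.5 utilization threshold, and the function name `select_line_size` are all hypothetical, and utilization is modeled as the fraction of previously filled bytes that were actually used.

```python
from typing import Sequence


def select_line_size(
    total_bytes: int,
    utilization: float,
    supported_sizes: Sequence[int] = (64, 128, 256, 512),  # hypothetical sizes
    low_utilization: float = 0.5,                          # hypothetical threshold
) -> int:
    """Pick a cache line size in bytes for a fill after a miss."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")

    sizes = sorted(supported_sizes)
    # Start from the largest line that the announced transfer can fill
    # completely; fall back to the smallest line for tiny transfers.
    index = 0
    for i, size in enumerate(sizes):
        if size <= total_bytes:
            index = i
    # If earlier fills were poorly used, step down one size to waste less.
    if utilization < low_utilization and index > 0:
        index -= 1
    return sizes[index]
```

Under these assumptions, a request announcing 1024 bytes gets a 512-byte line when utilization is high and a 256-byte line when utilization falls below the threshold, while a 100-byte request always gets the 64-byte minimum.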
Supervised learning · CPC title
Distributed learning, e.g. federated learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
adapted to multidimensional cache systems, e.g. set-associative, multicache, multiset or multilevel · CPC title