Network-aware cache coherence protocol enhancement
US-2018143905-A1 · May 24, 2018 · US
US10657056B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10657056-B2 |
| Application number | US-201816024773-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 30, 2018 |
| Priority date | Jun 30, 2018 |
| Publication date | May 19, 2020 |
| Grant date | May 19, 2020 |
Official abstract text for this publication.
Technologies for demoting cache lines to a shared cache include a compute device with at least one processor having multiple cores, a cache memory with a core-local cache and a shared cache, and a cache line demote device. A processor core of a processor of the compute device is configured to retrieve at least a portion of data of a received network packet and move the data into one or more core-local cache lines of the core-local cache. The processor core is further configured to perform a processing operation on the data and transmit a cache line demotion command to the cache line demote device subsequent to having completed the processing operation. The cache line demote device is configured to perform a cache line demotion operation to demote the data from the core-local cache lines to shared cache lines of the shared cache. Other embodiments are described herein.
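The sequence in the abstract (retrieve into core-local lines, process, then demote to the shared cache) can be illustrated with a toy software model. This is only a sketch under stated assumptions: `CacheModel`, `retrieve`, `process`, `demote`, and the 64-byte `LINE_SIZE` are hypothetical names and values chosen for illustration, and the byte-sum stands in for whatever processing operation the core performs. Real demotion happens in hardware, not in a dictionary.

```python
from dataclasses import dataclass, field

LINE_SIZE = 64  # bytes per cache line; an assumed, typical value


@dataclass
class CacheModel:
    """Toy model: maps a cache line ID to its payload, per cache level."""
    core_local: dict = field(default_factory=dict)
    shared: dict = field(default_factory=dict)


def retrieve(cache: CacheModel, packet: bytes) -> list:
    """Move packet data into core-local cache lines; return the line IDs used."""
    line_ids = []
    for offset in range(0, len(packet), LINE_SIZE):
        line_id = offset // LINE_SIZE
        cache.core_local[line_id] = packet[offset:offset + LINE_SIZE]
        line_ids.append(line_id)
    return line_ids


def process(cache: CacheModel, line_ids: list) -> int:
    """Stand-in processing operation (hypothetical): sum all payload bytes."""
    return sum(sum(cache.core_local[i]) for i in line_ids)


def demote(cache: CacheModel, line_ids: list) -> None:
    """Stand-in for the cache line demote device: move lines to the shared cache."""
    for i in line_ids:
        cache.shared[i] = cache.core_local.pop(i)


cache = CacheModel()
ids = retrieve(cache, bytes(range(200)))  # a 200-byte packet spans 4 lines
checksum = process(cache, ids)            # processing completes first
demote(cache, ids)                        # demotion is requested only afterwards
```

After the last call, the core-local dictionary is empty and all four lines sit in the shared dictionary, mirroring the ordering the abstract describes: demotion follows completion of the processing operation.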
Opening claim text (preview).
The invention claimed is:
1. A compute device for demoting cache lines to a shared cache, the compute device comprising: one or more processors, wherein each of the one or more processors includes a plurality of processor cores; a cache memory, wherein the cache memory includes a core-local cache and a shared cache, wherein the core-local cache includes a plurality of core-local cache lines, and wherein the shared cache includes a plurality of shared cache lines; a cache line demote device; and a host fabric interface (HFI) to receive a network packet, wherein a processor core of a processor of the one or more processors is to: retrieve at least a portion of data of the received network packet, wherein to retrieve the data comprises to move the data into one or more core-local cache lines of the plurality of core-local cache lines; perform one or more processing operations on the data; and transmit, subsequent to having completed the one or more processing operations on the data and in response to a determination by the processor core that a size of the received network packet is greater than a packet size threshold, a cache line demotion command to the cache line demote device, and wherein the cache line demote device is to perform, in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core-local cache lines to one or more shared cache lines of the shared cache.
2. The compute device of claim 1, wherein the processor core is further to transmit, subsequent to having determined that the size of the received network packet is less than or equal to the packet size threshold, a cache line demote instruction to a cache manager of the cache memory, and wherein the cache manager is to demote the data from the one or more core-local cache lines to the one or more shared cache lines of the shared cache based on the cache line demote instruction, wherein the cache line demote instruction bypasses the cache line demote device.
3. The compute device of claim 2, wherein to transmit the cache line demotion instruction includes to transmit one or more cache line identifiers corresponding to the one or more shared cache lines.
4. The compute device of claim 1, wherein to perform the cache line demotion operation comprises to perform a read request or a direct memory access.
5. The compute device of claim 1, wherein the cache line demotion command includes an indication of the core-local cache lines associated with the received network packet that are to be demoted to the shared cache.
6. The compute device of claim 1, wherein the cache line demote device comprises one of a copy engine, a direct memory access (DMA) device usable to copy data, or an offload device usable to perform a read operation.
7. The compute device of claim 1, wherein to transmit the cache line demotion command includes to transmit one or more cache line identifiers corresponding to the one or more shared cache lines.
8. One or more machine-readable storage media comprising a plurality of instructions stored thereon that, in response to being executed, cause a compute device to: retrieve, by a processor of the compute device, at least a portion of data of a network packet received by a host fabric interface (HFI) of the compute device, wherein to retrieve the data comprises to move the data into one or more core-local cache lines of a plurality of core-local cache lines of a core-local cache of the compute device, and wherein the processor includes a plurality of processor cores; perform, by a processor core of the plurality of processor cores, one or more processing operations on the data; transmit, by the processor subsequent to having completed the one or more processing operations on the data and in response to a determination by the processor core that a size of the received network packet is greater than a packet size threshold, a cache line demotion command to a cache line demote device of the compute device; and perform, by the cache line demote device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core-local cache lines to one or more shared cache lines of a shared cache of the compute device.
9. The one or more machine-readable storage media of claim 8, wherein the processor core is further to transmit, subsequent to having determined that the size of the received network packet is less than or equal to the packet size threshold, a cache line demote instruction to a cache manager of a cache memory that includes the core-local cache and the shared cache, and wherein the cache manager is to demote the data from the one or more core-local cache lines to the one or more shared cache lines of the shared cache based on the cache line demote instruction.
10. The one or more machine-readable storage media of claim 9, wherein to transmit the cache line demotion instruction includes to transmit one or more cache line identifiers corresponding to the one or more shared cache lines.
11. The one or more machine-readable storage media of claim 8, wherein to perform the cache line demotion operation comprises to perform a read request or a direct memory access.
12. The one or more machine-readable storage media of claim 8, wherein to transmit the cache line demotion command includes to transmit one or more cache line identifiers corresponding to the one or more shared cache lines.
13. A method for demoting cache lines to a shared cache, the method comprising: retrieving, by a processor of the compute device, at least a portion of data of a network packet received by a host fabric interface (HFI) of the compute device, wherein to retrieve the data comprises to move the data into one or more core-local cache lines of a plurality of core-local cache lines of a core-local cache of the compute device, and wherein the processor includes a plurality of processor cores; performing, by a processor core of the plurality of processor cores, one or more processing operations on the data; transmitting, by the processor core and subsequent to having completed the one or more processing operations on the data and in response to a determination by the processor core that a size of the received network packet is greater than a packet size threshold, a cache line demotion command to a cache line demote device of the compute device; and performing, by the cache line demote device and in response to having received the cache line demotion command, a cache line demotion operation to demote the data from the one or more core-local cache lines to one or more shared cache lines of a shared cache of the compute device.
14. The method of claim 13, further comprising: transmitting, by the processor core and subsequent to having determined that the size of the received network packet is less than or equal to the packet size threshold, a cache line demote instruction to a cache manager of a cache memory that includes the core-local cache and the shared cache; and demoting, by the cache manager, the data from the one or more core-local cache lines to the one or more shared cache lines of the shared cache based on the cache line demote instruction.
15. The method of claim 14, wherein transmitting the cache line demotion instruction includes transmitting one or more cache line identifiers corresponding to the one or more shared cache lines.
16. The method of claim 13, wherein performing the cache line demotion operation comprises performing one of a read request or a direct memory
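The size-based routing recited in claims 1 and 2 (and mirrored in claims 8-9 and 13-14) reduces to a single comparison made after processing completes. The sketch below is a hedged illustration of that decision only: `DemotePath`, `choose_demote_path`, and the 256-byte `THRESHOLD` are hypothetical, since the claims name a packet size threshold without giving a value or an interface.

```python
from enum import Enum


class DemotePath(Enum):
    """The two demotion paths named in the claims."""
    DEMOTE_DEVICE = "cache line demotion command to the cache line demote device"
    CACHE_MANAGER = "cache line demote instruction to the cache manager"


def choose_demote_path(packet_size: int, packet_size_threshold: int) -> DemotePath:
    """Pick the demotion path once processing has completed.

    Claim 1: size > threshold  -> command to the cache line demote device.
    Claim 2: size <= threshold -> instruction to the cache manager,
             bypassing the cache line demote device.
    """
    if packet_size > packet_size_threshold:
        return DemotePath.DEMOTE_DEVICE
    return DemotePath.CACHE_MANAGER


THRESHOLD = 256  # bytes; hypothetical, the claims do not give a value
for size in (64, 256, 1500):
    print(size, choose_demote_path(size, THRESHOLD).name)
```

Note the boundary case: a packet whose size equals the threshold takes the cache manager path, because claim 1 requires "greater than" while claim 2 covers "less than or equal to".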
adapted to multidimensional cache systems, e.g. set-associative, multicache, multiset or multilevel · CPC title
State-only directory, i.e. not recording identity of sharing or owning nodes · CPC title
using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title
Latency reduction · CPC title
Decentralised address translation, e.g. in distributed shared memory systems · CPC title