Private memory table for reduced memory coherence traffic
US-9760490-B2 · Sep 12, 2017 · US
US11294809B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11294809-B2 |
| Application number | US-201816115067-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 28, 2018 |
| Priority date | Dec 12, 2016 |
| Publication date | Apr 5, 2022 |
| Grant date | Apr 5, 2022 |
Official abstract text for this publication.
Embodiments of an invention for a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.
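The decision made by the issuing cache can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the patented implementation: the class and method names (`CoherentCache`, `zero_line`, `fill`), the 64-byte line size, and the transition to the modified state after a local zeroing are all hypothetical choices made for illustration. Invalidating a line that hits in a state other than modified or exclusive follows claim 2 of this publication.

```python
from enum import Enum

LINE_SIZE = 64  # assumed line size in bytes; the source does not fix a size


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class CoherentCache:
    """Toy model of the issuing cache (hypothetical names throughout)."""

    def __init__(self):
        self.lines = {}      # address -> [State, data bytes]
        self.forwarded = []  # write commands issued toward the interconnect

    def fill(self, addr, state, data):
        """Install a line so the example has something to hit on."""
        self.lines[addr] = [state, data]

    def zero_line(self, addr):
        """Handle the write command produced by a cache line zeroing instruction."""
        entry = self.lines.get(addr)
        hit = entry is not None and entry[0] is not State.INVALID
        if hit and entry[0] in (State.MODIFIED, State.EXCLUSIVE):
            # The cache already holds the line in a modified or exclusive
            # state, so it can be configured to all zeros locally.
            entry[1] = bytes(LINE_SIZE)
            entry[0] = State.MODIFIED  # assumption: usual MESI write transition
            return "zeroed_locally"
        if hit:
            # Hit in another state: invalidate the local copy (per claim 2).
            entry[0] = State.INVALID
        # Miss, or hit without ownership: issue the command toward the interconnect.
        self.forwarded.append(addr)
        return "forwarded"
```

In this model, a line held in the exclusive state is zeroed without any interconnect traffic, while a shared hit or a miss ends up in `forwarded`, which stands in for the write command travelling toward the interconnect.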
Opening claim text (preview).
What is claimed is:
1. A processor comprising: a first coherent agent, coupled to an interconnect through a first cache, to decode and execute an instruction whose execution is to cause a cache line sized write of zeros at a memory address and to issue, to the first cache, a write command to initiate the cache line sized write of zeros at the memory address; the first cache, when there is a hit for a cache line responsive to receiving the write command and that cache line's cache coherency protocol state is a modified state or an exclusive state, to configure that cache line to indicate all zeros, and, when there is a cache miss responsive to receiving the write command, is to issue the write command toward the interconnect; and a second coherent agent coupled to the interconnect through a second cache; the interconnect, responsive to receiving the write command, to issue a snoop to the second cache; wherein the interconnect, or the first coherent agent responsive to a message from the interconnect, is to cause an other cache line in the first cache to be configured to indicate all zeros when the write command and the snoop did not cause the cache line write of zeros to be performed.
2. The processor of claim 1, wherein the first cache, when there is a hit for the cache line responsive to receiving the write command and that cache line's cache coherency protocol state is not the modified state or the exclusive state, is to make that cache line's cache coherency protocol state be an invalid state and to issue the write command toward the interconnect.
3. A processor comprising: a first core to issue a write command responsive to execution of a cache line zeroing instruction, the first core also comprising a level 1 (L1) cache; the L1 cache coupled to receive the write command, to determine whether there is a hit or a miss in the L1 cache responsive to the write command, to determine responsive to the hit whether a cache coherency protocol state of a cache line that hit is one that grants the L1 cache authority to modify that cache line without a broadcast to at least one other cache, to configure that cache line to indicate all zeros responsive to the hit when the cache coherency protocol state of the cache line that hit is one that grants the L1 cache authority to modify the first cache line without the broadcast to the at least one other cache, and to issue the write command toward an interconnect responsive to the miss; and the interconnect, coupled to the first core, to issue, responsive to the write command, a snoop to those of the at least one other caches for which it must be determined if there is an other hit; wherein the first core, the interconnect, or the first core responsive to a message from the interconnect, is to cause an other cache line in the cache or one of the at least one other caches to be configured to indicate all zeros when the write command and the snoop did not cause the cache line write of zeros to be performed.
4. The processor of claim 3, wherein the first cache is to make the cache coherency protocol state be invalid and issue the write command toward the interconnect responsive to the hit when the cache coherency protocol state of the cache line that hit is not one that grants the cache authority to modify the first cache line without the broadcast to the at least one other cache.
5. The processor of claim 3, wherein each of the at least one other caches, responsive to the snoop, is to determine whether there is the other hit or an other miss in that other cache, and to determine responsive to the other hit whether a cache coherency protocol state of the cache line that hit in that other cache is one that grants that other cache authority to modify that cache line without a broadcast to other caches.
6. The processor of claim 5, wherein each of the at least one other caches is to: configure the cache line in that other cache to indicate all zeros and issue a response message indicating zeroed responsive to the other hit when the cache coherency protocol state of the cache line that hit in that other cache is one that grants that other cache authority to modify the cache line that hit in that cache without a broadcast to other caches; and issue a response message indicating not zeroed responsive to the miss or responsive to the other hit when the cache coherency protocol state of the cache line that hit in that other cache is not one that grants the cache authority to modify the cache line that hit in that other cache.
7. The processor of claim 5, wherein the interconnect is to track receipt of the response messages to determine if the snoop caused one of the other caches to be configured to indicate all zeros.
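The snoop handling and response tracking recited in the dependent claims can be sketched in Python as follows. This is an illustrative model only, and several details are assumptions rather than anything stated in the claim text shown here: the names `SnoopedCache`, `snoop_zero`, and `interconnect_zero` are hypothetical, every other cache is snooped (the claims allow snooping only those caches for which a hit must be determined), a non-owned copy is invalidated on a snoop, a zeroed line ends in the modified state, and the fallback places the zeroed line in the issuing cache, which is one of the alternatives the independent claims allow.

```python
LINE_SIZE = 64        # assumed line size in bytes
OWNED = {"M", "E"}    # states that grant authority to modify without a broadcast


class SnoopedCache:
    """Toy model of one of the other caches (hypothetical names)."""

    def __init__(self, lines=None):
        self.lines = dict(lines or {})  # address -> (state, data)

    def snoop_zero(self, addr):
        """Respond to a zeroing snoop with 'zeroed' or 'not_zeroed'."""
        state, _ = self.lines.get(addr, ("I", None))
        if state in OWNED:
            # Hit with authority to modify: zero the line and report it.
            self.lines[addr] = ("M", bytes(LINE_SIZE))
            return "zeroed"
        if state != "I":
            # Assumption: a non-owned copy is invalidated so it cannot go stale.
            self.lines[addr] = ("I", None)
        return "not_zeroed"


def interconnect_zero(addr, issuing_lines, other_caches):
    """Snoop the other caches, track their responses, and fall back if needed."""
    responses = [cache.snoop_zero(addr) for cache in other_caches]
    if "zeroed" in responses:
        return "zeroed_by_snoop"
    # Neither the write command nor the snoop performed the write of zeros,
    # so a line is configured to all zeros (here, in the issuing cache).
    issuing_lines[addr] = ("M", bytes(LINE_SIZE))
    return "zeroed_in_issuer"
```

Collecting the responses in a list is the simplest way to model the interconnect tracking receipt of the response messages; real hardware would more plausibly count outstanding responses per request.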
LOAD or STORE instructions; Clear instruction · CPC title
using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] · CPC title
Performance improvement · CPC title
using a bus scheme, e.g. with bus monitoring or watching means · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title