Private memory table for reduced memory coherence traffic
US-9760490-B2 · Sep 12, 2017 · US
US10282296B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10282296-B2 |
| Application number | US-201615376647-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 12, 2016 |
| Priority date | Dec 12, 2016 |
| Publication date | May 7, 2019 |
| Grant date | May 7, 2019 |
Official abstract text for this publication.
Embodiments of an invention relating to a processor architecture are disclosed. In an embodiment, a processor includes a decoder, an execution unit, a coherent cache, and an interconnect. The decoder is to decode an instruction to zero a cache line. The execution unit is to issue a write command to initiate a cache line sized write of zeros. The coherent cache is to receive the write command, to determine whether there is a hit in the coherent cache and whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros, and to issue the write command toward the interconnect. The interconnect is to, responsive to receipt of the write command, issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit.
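The decision made by the issuing core's coherent cache can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the `CoherentCache` class, the `handle_zero_write` method, the 64-byte line size, and the choice to mark a zeroed line as Modified are all hypothetical names and simplifications introduced here. The invalidate-then-forward path for a hit in a non-owning state follows claim 2.

```python
from enum import Enum

LINE_SIZE = 64  # assumed cache line size in bytes (not fixed by the text)


class State(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


class CoherentCache:
    """Toy model of one coherent cache: line address -> [state, data]."""

    def __init__(self):
        self.lines = {}

    def fill(self, addr, state, data):
        # Install a line so that later writes can hit on it.
        self.lines[addr] = [state, bytes(data)]

    def handle_zero_write(self, addr):
        """Return 'zeroed' if handled locally, or 'forward' if the write
        command must be issued toward the interconnect."""
        entry = self.lines.get(addr)
        if entry is not None and entry[0] in (State.MODIFIED, State.EXCLUSIVE):
            # Hit in an owning state: zero the line in place.
            # Assumption: the line ends up Modified because it is now dirty.
            entry[0] = State.MODIFIED
            entry[1] = bytes(LINE_SIZE)
            return "zeroed"
        if entry is not None and entry[0] is not State.INVALID:
            # Hit without ownership: invalidate the local copy (claim 2).
            entry[0] = State.INVALID
        # Miss, or hit in a non-owning state: forward the write command.
        return "forward"
```

In this model a hit in the modified or exclusive state completes without any traffic leaving the cache, which is the case where no snoops are needed.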
Opening claim text (preview).
What is claimed is:

1. A processor comprising: a decoder to decode an instruction to zero a cache line; an execution unit, coupled to the decoder and responsive to the decode of the instruction, to issue a write command to initiate a cache line sized write of zeros at a memory address; a coherent cache, coupled to the execution unit, to receive the write command, to determine whether there is a hit in the coherent cache responsive to the write command, to determine whether a cache coherency protocol state of the hit cache line is a modified state or an exclusive state, to configure a cache line to indicate all zeros when the cache coherency protocol state is the modified state or the exclusive state, and to issue the write command toward an interconnect when there is a miss responsive to receiving the write command; the interconnect, responsive to receipt of the write command, to issue a snoop to each of a plurality of other coherent caches for which it must be determined if there is a hit, wherein the interconnect, or the execution unit responsive to a message from the interconnect, to cause a cache line in one of the coherent caches to be configured to indicate all zeros when the write command and the snoop did not cause the cache line sized write of zeros to be performed.

2. The processor of claim 1, wherein the coherent cache is also to make that cache line's cache coherency protocol state be an invalid state and issue the write command toward the interconnect when the cache coherency protocol state of the hit cache line is not the modified state or the exclusive state.

3. The processor of claim 1, wherein the decoder and the execution unit are part of a first core, and wherein the plurality of coherent caches includes a coherent cache of a second core.

4. The processor of claim 1, wherein the cache line is to be configured to indicate all zeros by writing over data in the cache line with zeros.

5. The processor of claim 1, wherein the cache line is to be configured to indicate all zeros by invalidating the cache line in a way of the cache in which that cache line currently resides and write a cache line of zeros into a different way of the cache.

6. The processor of claim 1, wherein the cache line is to be configured to indicate all zeros by changing a tag state rather than data.

7. The processor of claim 1, wherein the cache line is to be triggered to be configured to indicate all zeros by sending of a zero line to an intermediate buffer.

8. The processor of claim 1, wherein the cache line is to be triggered to be configured to indicate all zeros by the write command indicating a cache line sized write of zeros but the write command does not carry a cache line of zeros.

9. The processor of claim 1, wherein the cache line is to be triggered to be configured to indicate all zeros by writing of chunks of zeros smaller than the cache line to an intermediate buffer and concurrently writing the chunks to the cache line.

10. The processor of claim 1, wherein execution of the instruction does not include a request for ownership operation.

11. The processor of claim 1, wherein execution of the instruction does not include a return of data from the cache line.

12. The processor of claim 1, wherein the instruction indicates a cache line size through a value in a register.

13. The processor of claim 1, wherein the instruction has a format including a field to indicate a cache line size.

14. The processor of claim 1, wherein a size parameter is to be associated with the instruction to indicate a multiple of a cache line size.

15. The processor of claim 1, wherein a size parameter is to be associated with the instruction to indicate a number of bytes.

16. The processor of claim 1, wherein execution of the instruction is to be atomic.

17. The processor of claim 16, wherein execution of the instruction is to be auto-evicting.

18. The processor of claim 1, wherein execution of the instruction is to be weakly ordered.

19. The processor of claim 1, wherein execution of the instruction is to be strongly ordered.

20. The processor of claim 1, wherein execution of the instruction is to cause a fault in response to an un-writable result of page table walk.

21. A processor comprising: a decoder to decode an instruction to zero a cache line; an execution unit, coupled to the decoder, to issue a command responsive to the decode of the instruction; an interconnect, responsive to receipt of the command, to issue a snoop to each of a plurality of coherent caches for which it must be determined if there is a hit, wherein the execution unit on its own, the interconnect, or the execution unit responsive to a message from the interconnect, to cause a cache line in one of the plurality of coherent caches coupled to the execution unit to be configured to indicate all zeros when the snoop did not cause the cache line write of zeros to be performed.

22. The processor of claim 21, wherein the execution unit is to make a hit cache line's cache coherency protocol state be an invalid state when the cache coherency protocol state of the hit cache line is not the modified state or the exclusive state.
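The interconnect side of claims 1 and 21 can be pictured with a short model of the snoop-then-fallback flow. It is a hedged sketch, not the claimed implementation: the `snoop` and `interconnect_zero` functions, the dictionary layout, and the 64-byte line size are hypothetical. It also assumes one particular snoop behavior (a snooped cache that owns the line zeroes it in place, and a cache holding a shared copy invalidates it), which the claims do not mandate, and it snoops every other cache rather than a filtered subset.

```python
LINE_SIZE = 64             # assumed cache line size in bytes
OWNED_STATES = {"M", "E"}  # modified or exclusive


def snoop(cache, addr):
    """Snoop one cache. Return True if the snoop itself performed the
    cache line sized write of zeros in that cache."""
    entry = cache.get(addr)
    if entry is None or entry["state"] == "I":
        return False  # miss: nothing to do in this cache
    if entry["state"] in OWNED_STATES:
        # Assumed behavior: the owning cache zeroes its line in place.
        entry["state"] = "M"
        entry["data"] = bytes(LINE_SIZE)
        return True
    # Assumed behavior: a shared copy is invalidated.
    entry["state"] = "I"
    return False


def interconnect_zero(issuer, others, addr):
    """Handle a zeroing write command that the issuing cache forwarded.

    Returns True if a snoop performed the write, False if the fallback did.
    """
    performed = False
    for cache in others:
        # Every other cache is snooped, even after one has done the write.
        performed = snoop(cache, addr) or performed
    if not performed:
        # Fallback: no snoop performed the write, so a zero line is
        # configured in one of the caches (here, the issuer's).
        issuer[addr] = {"state": "M", "data": bytes(LINE_SIZE)}
    return performed
```

For example, calling `interconnect_zero({}, [a, b], 0x40)` with two caches that both hold the line in the shared state invalidates both copies and installs a zeroed line in the issuer's cache. Note that no line data is read back and no separate request for ownership is issued, in keeping with claims 10 and 11.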
Coherency control relating to peripheral accessing, e.g. from DMA or I/O device · CPC title
using a bus scheme, e.g. with bus monitoring or watching means · CPC title
using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] · CPC title
Details of translation look-aside buffer [TLB] · CPC title
using page tables, e.g. page table structures · CPC title