Adaptive coherence for latency-bandwidth tradeoffs in emerging memory technologies
US-2019042429-A1 · Feb 7, 2019 · US
US11669454B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11669454-B2 |
| Application number | US-201916405691-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 7, 2019 |
| Priority date | May 7, 2019 |
| Publication date | Jun 6, 2023 |
| Grant date | Jun 6, 2023 |
Official abstract text for this publication.
A processor includes one or more cores having cache, a cache home agent (CHA), and a near memory controller coupled to the CHA, to near memory, and to a far memory controller. The near memory controller is to: receive a first memory read operation from the CHA directed at a memory address; detect a miss for the memory address at the near memory; issue a second memory read operation to the far memory controller to retrieve a cache line, having first data, from the memory address of far memory; receive the cache line from the far memory controller in response to the second memory read operation; and send the cache line to the CHA with a forced change to a directory state of the cache line at the CHA, the forced change to cause the CHA to snoop remote sockets to maintain data coherence for the cache line in an absence of directory state in the far memory.
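The read-miss flow in the abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the patented hardware design: the class names, the `opcode_state` parameter, and the `"I"` default are hypothetical, and only the "any" (A) state label is taken from the claims. It shows a near memory miss being served from far memory, the line being returned with a forced A state, and the CHA snooping in response.

```python
from dataclasses import dataclass

ANY = "A"  # the "any" directory state named in claim 3


@dataclass
class ReadResult:
    data: bytes
    directory_state: str
    from_far_memory: bool


class FarMemoryController:
    """Hypothetical far memory model: stores data only, no directory state."""

    def __init__(self, contents):
        self.contents = dict(contents)
        self.reads = 0

    def read(self, address):
        self.reads += 1
        return self.contents[address]


class NearMemoryController:
    """Hypothetical near memory model acting as a cache in front of far memory."""

    def __init__(self, far):
        self.far = far
        self.near = {}  # address -> (data, directory_state)

    def read(self, address, opcode_state="I"):
        # Hit: return the data together with the directory state kept in near memory.
        if address in self.near:
            data, state = self.near[address]
            return ReadResult(data, state, False)
        # Miss: issue a second read to the far memory controller.
        data = self.far.read(address)
        # Assumption: the state stored in near memory follows the read opcode,
        # modeled here by the hypothetical opcode_state argument.
        self.near[address] = (data, opcode_state)
        # Far memory has no directory state, so force the state seen by the CHA to A.
        return ReadResult(data, ANY, True)


class CacheHomeAgent:
    """Hypothetical CHA model: snoops remote sockets when the state is A."""

    def __init__(self, nmc):
        self.nmc = nmc
        self.snooped = []  # addresses for which remote sockets were snooped

    def read(self, address):
        result = self.nmc.read(address)
        if result.directory_state == ANY:
            self.snooped.append(address)
        return result.data
```

In this model, the first read of an address reaches far memory and triggers a snoop, while a later read of the same address is served from near memory using the directory state stored there.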
Opening claim text (preview).
What is claimed is:
1. A processor comprising: one or more cores, each comprising cache; a cache home agent (CHA) coupled to the cache; and a near memory controller coupled to the CHA, to near memory, and to a far memory controller, wherein the near memory controller is to: receive a first memory read operation from the CHA directed at a memory address; detect a miss for the memory address at the near memory; issue, in response to detection of the miss, a second memory read operation to the far memory controller to retrieve a cache line, comprising first data, from the memory address of far memory; receive the cache line from the far memory controller in response to completion of the second memory read operation; and in response to the cache line being retrieved from the far memory, send the cache line to the CHA with a change to a directory state of the cache line at the CHA, the changed directory state to cause the CHA to snoop remote sockets to maintain data coherence for the cache line, and write the cache line to the near memory with a set of metadata bits indicating that the cache line is clean and directory bits in the cache line are dirty, wherein based on the set of metadata bits, the cache line is not written to the far memory in response to the cache line being selected to be evicted from the near memory, wherein the near memory maintains the data coherence using directory state in an absence of directory state in the far memory.
2. The processor of claim 1, further comprising the far memory controller to: retrieve the cache line from the far memory in response to the second memory read operation; and send the cache line to the near memory controller.
3. The processor of claim 1, wherein the changed directory state comprises an “any” (A) state, and wherein the CHA is to snoop the remote sockets for the cache line to maintain data coherence for the cache line.
4. The processor of claim 1, wherein the near memory controller is further to: update the directory bits in the cache line to a directory state consistent with a read opcode of the first memory read operation.
5. The processor of claim 4, wherein the cache line written to the near memory further comprises the set of metadata bits comprising: a first bit to indicate whether the first data in the cache line is dirty; and a second bit to indicate whether the directory bits are dirty.
6. The processor of claim 5, wherein, to evict the cache line, the near memory controller is further to: determine that the first bit of the set of metadata bits indicates that the first data is clean; and not issue a memory write operation to write the first data back to the far memory.
7. The processor of claim 5, wherein the near memory controller is further to: receive a memory write operation from the CHA directed to the memory address; determine whether the memory write operation is a directory-only write or includes a data write; and in response to being a directory-only write, set the second bit, but not the first bit, of the set of metadata bits, which avoids a write-back to the far memory.
8. The processor of claim 7, wherein, to determine that the memory write operation is the directory-only write, the near memory controller is to determine that an update to the directory bits is necessary based on a current directory state read from the near memory and on an opcode received in the first memory read operation, and wherein the near memory controller is further to clear the second bit of the set of metadata bits to indicate the first data is clean.
9. The processor of claim 7, wherein, to determine that the memory write operation is the directory-only write, the near memory controller is to receive a direct indication of the directory-only write in the memory write operation from the CHA, wherein the direct indication is based in part on a previous indication received by the CHA from the near memory controller, in response to the first memory read operation, that the first bit indicated the first data was clean.
10. A method comprising: receiving, by a near memory controller of a processing system, a first memory read operation from a cache home agent (CHA) directed at a memory address; detecting, by the near memory controller, a miss for the memory address at near memory of a two-level memory system; issuing, by the near memory controller in response to detection of the miss, a second memory read operation to a far memory controller to retrieve a cache line, comprising first data, from the memory address of far memory; receiving, by the near memory controller, the cache line from the far memory controller in response to completion of the second memory read operation; and in response to the cache line being retrieved from the far memory, sending, by the near memory controller, the cache line to the CHA with a change to a directory state of the cache line at the CHA, the changed directory state to cause the CHA to snoop remote sockets to maintain data coherence for the cache line, and write the cache line to the near memory with a set of metadata bits indicating that the cache line is clean and directory bits in the cache line are dirty, wherein based on the set of metadata bits, the cache line is not written to the far memory in response to the cache line being selected to be evicted from the near memory, wherein the near memory maintains the data coherence using directory state in an absence of directory state in the far memory.
11. The method of claim 10, wherein the changed directory state comprises an “any” (A) state, the method further comprising: retrieving, by the far memory controller, the cache line from the far memory in response to the second memory read operation; sending, by the far memory controller, the cache line to the near memory controller; and snooping, by the CHA, the remote sockets for the cache line to maintain data coherence for the cache line.
12. The method of claim 10, further comprising: updating, by the near memory controller, the directory bits in the cache line to a directory state consistent with a read opcode of the first memory read operation.
13. The method of claim 12, wherein the cache line written to the near memory further comprises the set of metadata bits comprising: a first bit to indicate whether the first data in the cache line is dirty; and a second bit to indicate whether the directory bits are dirty.
14. The method of claim 13, further comprising evicting the cache line, wherein evicting comprises: determining, by the near memory controller, that the first bit of the set of metadata bits indicates that the first data is clean; and not issuing, by the near memory controller, a memory write operation to write the first data back to the far memory.
15. The method of claim 13, further comprising: receiving, by the near memory controller, a memory write operation from the CHA directed to the memory address; determining, by the near memory controller, whether the memory write operation is a directory-only write or includes a data write; and in response to being a directory-only write, setting, by the near memory controller, the second bit, but not the first bit, of the set of metadata bits, which avoids a write-back to the far memory.
16. The method of claim 15, wherein determining that the memory write operation is the directory-only write comprises determining, by the near memory controller, that an update to the directory bits is necessary based on a current directory state read from the near memory and on an opcode received in the first memory read operation, the met
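The two metadata bits described in claims 5 to 7 and 13 to 15 can be illustrated with a short model of near memory. This is a rough sketch, assuming a dictionary-backed near memory. The names `NearLine`, `NearMemory`, `fill_from_far`, and `far_writes` are hypothetical, the `"S"` state used in the usage note is only a placeholder label, and passing `data=None` stands in for the direct directory-only indication mentioned in claim 9.

```python
from dataclasses import dataclass


@dataclass
class NearLine:
    data: bytes
    directory_state: str
    data_dirty: bool = False  # "first bit": is the data dirty?
    dir_dirty: bool = False   # "second bit": are the directory bits dirty?


class NearMemory:
    """Hypothetical near memory model tracking the two metadata bits per line."""

    def __init__(self):
        self.lines = {}       # address -> NearLine
        self.far_writes = []  # (address, data) pairs written back to far memory

    def fill_from_far(self, address, data, directory_state):
        # A line fetched from far memory is stored with clean data
        # and dirty directory bits, as in claims 1 and 10.
        self.lines[address] = NearLine(data, directory_state, False, True)

    def write(self, address, directory_state, data=None):
        line = self.lines[address]
        line.directory_state = directory_state
        # Directory-only write: set the second bit but not the first.
        line.dir_dirty = True
        if data is not None:
            # Data write: the data now differs from far memory.
            # Assumption: the directory bits are also treated as dirty here.
            line.data = data
            line.data_dirty = True

    def evict(self, address):
        """Evict a line; return True only if it was written back to far memory."""
        line = self.lines.pop(address)
        if line.data_dirty:
            self.far_writes.append((address, line.data))
            return True
        # Clean data: drop the line without writing to far memory,
        # even though its directory bits were dirty.
        return False
```

For example, calling `fill_from_far(0x40, b"x", "A")`, then `write(0x40, "S")`, then `evict(0x40)` produces no far memory write in this model, whereas a `write` that carries data marks the line dirty and the later eviction writes it back.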
Performance improvement · CPC title
Hybrid memory, e.g. using both volatile and non-volatile memory · CPC title
Copy directories (local copy tags for implementing a bus snooping protocol G06F12/0831) · CPC title
for multiprocessing or multitasking · CPC title
with cache invalidating means (G06F12/0815 takes precedence) · CPC title