Method and apparatus to use DRAM as a cache for slow byte-addressible memory for efficient cloud applications
US-12174739-B2 · Dec 24, 2024 · US
US2016283374A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016283374-A1 |
| Application number | US-201514668831-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 25, 2015 |
| Priority date | Mar 25, 2015 |
| Publication date | Sep 29, 2016 |
| Grant date | — |
Official abstract text for this publication.
Resolving coherency issues inherent in sharing distributed cache is described. A chip multiprocessor may include at least first and second processing clusters, each having multiple cores of a processor, multiple cache slices co-located with the multiple cores, and a memory controller (MC). The processor stores directory information in a memory coupled to the processor to indicate cluster cache ownership of a first address space to the first cluster. In response to a request to change the cluster cache ownership of the first address space, the processor may remap first lines of first cache slices, corresponding to the first address space, to second lines in second cache slices of the second cluster, and update the directory information (e.g., a state of the first cache lines) to change the cluster cache ownership of the first address space to the second cluster. One of the MCs may manage such updating of the directory.
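The ownership change described in the abstract can be illustrated with a minimal sketch. Everything named here (`Cluster`, `Directory`, `change_ownership`, and the modulo slice hash) is a hypothetical illustration and does not come from the patent, which describes hardware behavior, not software.

```python
class Cluster:
    """Toy cluster: a set of cache slices, each mapping line address -> data."""

    def __init__(self, name, num_slices):
        self.name = name
        self.slices = [dict() for _ in range(num_slices)]

    def slice_for(self, addr):
        # Assumed toy address hash: choose a slice by address modulo slice count.
        return self.slices[addr % len(self.slices)]


class Directory:
    """Toy directory: records which cluster owns each named address space."""

    def __init__(self):
        self.owner = {}  # address-space id -> owning cluster name

    def change_ownership(self, space_id, addrs, old, new):
        # Remap cached lines of the address space from the old cluster's
        # slices to lines in the new cluster's slices.
        for addr in addrs:
            src = old.slice_for(addr)
            if addr in src:
                new.slice_for(addr)[addr] = src.pop(addr)
        # Update the directory so the new cluster owns the address space.
        self.owner[space_id] = new.name


first = Cluster("first", 4)
second = Cluster("second", 4)
first.slice_for(8)[8] = "x"
first.slice_for(9)[9] = "y"

directory = Directory()
directory.owner["space0"] = "first"
directory.change_ownership("space0", range(8, 16), first, second)
print(directory.owner["space0"])
```

In this sketch the remap is eager, moving every cached line at once. The claims instead describe lines being forwarded when a later data request triggers a snoop, so treat the eager loop as a simplification.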
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a processor to execute computer instructions, wherein a first plurality of cores of the processor, a first memory controller, and a first plurality of cache slices co-located with the first plurality of cores are grouped into a first cluster, wherein a second plurality of cores of the processor, a second memory controller, and a second plurality of cache lines co-located with the second plurality of cores are grouped into a second cluster; wherein the processor is to: store directory information in a memory coupled to the processor, the directory information to indicate cluster cache ownership of a first address space to the first cluster; in response to a request to change the cluster cache ownership of the first address space, remap first lines of the first plurality of cache slices, corresponding to the first address space, to second lines in the second plurality of cache slices of the second cluster; and update the directory information to change the cluster cache ownership of the first address space to the second cluster.
2. The apparatus of claim 1, further comprising a single quick path interconnect caching agent associated with the first and second clusters, wherein the first and second clusters comprise non-uniform memory address clusters or central processing unit clusters.
3. The apparatus of claim 1, wherein one memory controller of the first and second memory controllers to receive the request to change the cluster cache ownership and remap the first lines of the first plurality of cache slices to the second lines in the second plurality of cache slices, the one memory controller further to update a state in the directory of the first lines that are remapped based on a change to an address hash function corresponding to the change of cluster cache ownership of the first address space.
4. The apparatus of claim 3, wherein the directory comprises a plurality of states comprising: a first state to indicate a line cached in a local cluster of a local socket, a second state to indicate a line could be cached in any cluster in any socket, and a third state to indicate that a line is cached in a cache slice of a physically-closest cluster.
5. The apparatus of claim 4, wherein the second cluster is the cluster physically closest to the first cluster, and wherein to update the states of the directory of the first lines, the one memory controller further to: receive a data request for the first lines at the second cluster due to the change to the address hash function; determine that a state of the first lines in the directory is the first state; snoop the first cluster for the first lines based on the first state, causing the first cluster to forward data from the first lines to be stored as the second lines in the second plurality of cache slices of the second cluster; and change the state for the first lines in the directory to the third state based on a response from the snoop regarding the forwarded data.
6. The apparatus of claim 5, wherein the one memory controller further to read the data from the second lines in the second plurality of cache slices of the second cluster to respond to the data request.
7. The apparatus of claim 5, wherein, after the first cluster has regained cluster cache ownership of the second lines, the one memory controller further to: receive a data request for the second lines at the first cluster due to a change in the hash function giving ownership of the second lines to the first cluster; determine that a state of the second lines in the directory is the third state; snoop the second cluster for the second lines based on the third state, causing the second cluster to forward data from the second lines to be stored as third lines in the first plurality of cache slices of the first cluster; update the state of the second lines in the directory to the first state based on a response to the snoop regarding the forwarded data; and read data from the third lines in the cache slices of the first cluster to respond to the data request.
8. The apparatus of claim 1, wherein the processor further to, before changing cluster cache ownership of the first lines to the second cluster: block new read or write requests to the first and second clusters; drain read or write requests issued on the first and second clusters; and remove the block on new read or write requests.
9. The apparatus of claim 8, wherein the processor further to flush stale memory locations for the first lines in the first plurality of cache slices, wherein the flush occurs outside of a quiesce period to change the cluster cache ownership.
10. A method comprising: storing directory information in a directory stored in memory, the directory information to track cluster cache ownership of cache slices of at least first and second clusters, the first and second clusters each comprising a plurality of cores of a processor, a plurality of cache slices co-located with the plurality of cores, and a memory controller; remapping first lines of a first plurality of cache slices of the first cluster to second lines in a second plurality of cache slices of the second cluster in response to a request to change the cluster cache ownership of a first address space of the first cluster to a second address space of the second cluster; and updating the directory information to change the cluster cache ownership of the first address space to the second cluster.
11. The method of claim 10, further comprising updating a state in the directory of the first lines that are remapped based on a change to an address hash function corresponding to the change of cluster cache ownership of the first address space.
12. The method of claim 11, wherein the directory comprises a plurality of states comprising: a first state to indicate a line cached in a local cluster of a local socket, a second state to indicate a line could be cached in any cluster in any socket, and a third state to indicate that a line is cached in a cache slice of a physically-closest cluster.
13. The method of claim 12, wherein the second cluster is the cluster physically closest to the first cluster, and wherein updating the states of the directory of the first lines comprises: receiving a data request for the first lines at the second cluster due to the change to the address hash function; determining that a state of the first lines in the directory is the first state; snooping the first cluster for the first lines based on the first state, causing the first cluster to forward data from the first lines to be stored as the second lines in the second plurality of cache slices of the second cluster; and changing the state for the first lines in the directory to the third state based on a response from the snoop regarding the forwarded data.
14. The method of claim 13, further comprising reading the data from the second lines in the second plurality of cache slices of the second cluster to respond to the data request.
15. The method of claim 13, wherein, after the first cluster has regained cluster cache ownership of the second lines, the method further comprising: receiving a data request for the second lines at the first cluster due to a change in the hash function giving ownership of the second lines to the first cluster; determining that a state of the second lines in the directory is the third state; snooping the second cluster for the second lines based on the third state, causing the second cluster to forward data from the second lines to be stored
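The directory states and snoop-driven transitions recited in claims 4 to 7 (and mirrored in claims 12 to 15) can be sketched as a small state machine. This is a simplified model under stated assumptions: the names `DirState`, `MemoryController`, and `request` are hypothetical, each cluster's cache is a plain dictionary, and a single address key stands in for the "first", "second", and "third" lines of the claims.

```python
from enum import Enum


class DirState(Enum):
    LOCAL = 1    # first state: line cached in a local cluster of a local socket
    ANY = 2      # second state: line could be cached in any cluster in any socket
    CLOSEST = 3  # third state: line cached in a slice of the physically closest cluster


class MemoryController:
    """Toy memory controller managing directory state for two clusters."""

    def __init__(self):
        self.directory = {}                        # line address -> DirState
        self.caches = {"first": {}, "second": {}}  # cluster -> {address: data}

    def _snoop_forward(self, addr, src, dst):
        # Snoop the source cluster; if it holds the line, forward the data
        # so it is stored in the destination cluster's cache.
        data = self.caches[src].pop(addr, None)
        if data is not None:
            self.caches[dst][addr] = data
        return data is not None

    def request(self, addr, at_cluster):
        state = self.directory.get(addr)
        if at_cluster == "second" and state is DirState.LOCAL:
            # Request arrives at the second cluster after the hash change:
            # snoop the first cluster, then mark the line as CLOSEST.
            if self._snoop_forward(addr, "first", "second"):
                self.directory[addr] = DirState.CLOSEST
        elif at_cluster == "first" and state is DirState.CLOSEST:
            # First cluster regained ownership: snoop the second cluster,
            # then return the line to the LOCAL state.
            if self._snoop_forward(addr, "second", "first"):
                self.directory[addr] = DirState.LOCAL
        # Respond to the data request from the requesting cluster's cache.
        return self.caches[at_cluster].get(addr)


mc = MemoryController()
mc.caches["first"][0x40] = "payload"
mc.directory[0x40] = DirState.LOCAL
to_second = mc.request(0x40, "second")
state_after_move = mc.directory[0x40]
back_to_first = mc.request(0x40, "first")
state_after_return = mc.directory[0x40]
```

The sketch changes state only after the snoop reports forwarded data, matching the claim language that the state change is "based on a response from the snoop". The second state (`ANY`) is declared for completeness but has no transition here, because the quoted claims do not describe one.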
with a shared cache · CPC title
Non-uniform memory access [NUMA] architecture · CPC title
Non-uniform cache access [NUCA] architecture · CPC title