Configuration based cache coherency protocol selection
US-2016147658-A1 · May 26, 2016 · US
US9892043B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9892043-B2 |
| Application number | US-201715499591-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 27, 2017 |
| Priority date | Nov 20, 2014 |
| Publication date | Feb 13, 2018 |
| Grant date | Feb 13, 2018 |
Official abstract text for this publication.
A computer system comprising multiple nodes, each node comprising a plurality of processors and a local cache hierarchy, suppresses local cache coherency operations of a node or global cache coherency operations between nodes, based on whether the coherency request is a global or local request and on the state of the cache line at the node.
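The suppression decision summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: it only encodes two conditions recited in the claims (a remote node forwards a global request to its CP clusters only when it is the GIM for the line, and a local request is rejected while a global operation on the same line is active). The names `Scope`, `LineState`, `sc_decision`, and the returned action strings are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    LOCAL = "local"    # request arriving over the node's local fabric
    GLOBAL = "global"  # request arriving from another node over the global fabric


@dataclass
class LineState:
    """Hypothetical per-line state as seen by one node's SC function."""
    is_gim: bool = False            # this node is the global intervention master
    global_op_active: bool = False  # a global fabric operation is in flight for the line


def sc_decision(scope: Scope, state: LineState) -> str:
    """Return an illustrative action label for an incoming request."""
    if scope is Scope.GLOBAL:
        # Only the GIM node passes a global request down to its LIM CP cluster;
        # any other node does not disturb its CP clusters.
        return "forward_to_lim" if state.is_gim else "suppress_local"
    # A local request loses to an active global operation on the same line.
    return "reject_local" if state.global_op_active else "process_local"
```

For example, `sc_decision(Scope.GLOBAL, LineState(is_gim=False))` yields the suppression branch, which is the case where a non-GIM node answers from its directory without forwarding the request to any CP cluster.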
Opening claim text (preview).
What is claimed is:

1. A computer implemented method for maintaining cache coherency in a processor system comprising a first node, the first node comprising a first plurality of central processor (CP) clusters, each CP cluster comprising a second plurality of processors, each CP cluster having a respective local cache shared by processors of a respective CP cluster, the processor system configured to perform a multi-tiered cache coherency protocol, the multi-tiered cache coherency protocol consisting of any one of local cache coherency operations within a node and global cache coherency operations between nodes, wherein cache coherency operations consist of any one of finding a cache line, obtaining the cache line and updating coherency state of the cache line, the computer implemented method comprising:

for each cache line installed in a CP cluster of a node, designating one CP cluster of the node as a local intervention master (LIM) CP cluster for a respective cache line, the LIM CP cluster of the respective cache line being the CP cluster of the node having a most recently installed copy of the respective cache line, wherein the LIM CP cluster of the respective cache line is a designated point of coherency for the respective cache line;

initiating a first cache request for access to a first cache line, by a first processor of a first CP cluster of the first node;

based on the first cache line being not available in a local cache of the first CP cluster, broadcasting the first cache request, over a local fabric, to other CP clusters, the other CP clusters comprising a local intervention master (LIM) CP cluster of the first cache line;

based on the first cache request, performing a first cache coherency operation, between the first CP cluster and a designated LIM CP cluster of the first cache line;

based on the first cache coherency operation causing the first cache line to be installed in a local cache of the first CP cluster, designating, by the processor system, the first CP cluster as LIM CP cluster of the first cache line;

for each cache line installed in a node, designating one node as a global intervention master (GIM) node for a respective cache line, the GIM node of a respective cache line being the node having a most recently installed copy of the respective cache line, wherein the GIM node of the respective cache line is a global point of coherency for the respective cache line in a global fabric;

based on the first cache line not existing in any cache of the first node, broadcasting, by a first storage control (SC) function of the first node, the first cache request over a global fabric to other SC functions of other nodes, the other SC functions comprising a GIM SC function;

based on a second SC function of a second node being the GIM SC function of the first cache line, sending, by the second SC function, the broadcast first cache request to the LIM CP cluster of the first cache line of the second node; and

based on the second SC function not being the GIM SC function, not sending the broadcast first cache request to any CP cluster of the second node.

2. The computer implemented method according to claim 1, further comprising: receiving from a third CP cluster, by the first CP cluster, a second cache request for a second cache line; based on the first CP cluster being a LIM CP cluster of the second cache line, sending, by the first CP cluster, the second cache line to the third CP cluster; and based on the sending the second cache line to the third CP cluster, causing the third CP cluster to be the LIM CP cluster of the second cache line.

3. The computer implemented method according to claim 1, wherein each SC function comprises an all-inclusive directory and a fabric control interface (FCI) function, the all-inclusive directory having an indication of all valid cache lines of a respective node, wherein coherency operations utilize the SC function, wherein each SC function is configured to be a GIM of a cache of the respective node.

4. The computer implemented method according to claim 3, wherein the all-inclusive directory of the respective node is configured to identify all cache lines of all caches of the respective node, wherein the first node of the processor system determines whether a requested cache line is held in caches of the first node by interrogating only the all-inclusive directory of the SC function of the first node and not interrogating respective caches of CP clusters of the first node.

5. The computer implemented method according to claim 4, wherein the all-inclusive directory of a node is interrogated to determine whether to provide a cache request from another node to CP clusters of the node.

6. The computer implemented method according to claim 1, wherein the first cache request is broadcast to other CP clusters and to the SC function to locate a LIM CP of the cache line, wherein the SC function of a node is the point of coherency for all local fabric requests of the node, the method further comprising: rejecting by the SC function of the first node, a local fabric operation for the cache line, from a processor of the first node, based on a global fabric operation for the same cache line of the first node being active from another node.

7. The computer implemented method according to claim 1, wherein the first node comprises one or more CP clusters and an SC function interconnected by a local fabric, wherein cache requests from a CP cluster of the first node are broadcast to all other CP clusters of the first node and the SC function of the first node, wherein, in response to the broadcast cache request, each of said all other CP clusters of the first node send respective partial responses to other CP clusters and the SC function of the first node, and wherein, in response to the broadcast cache request, the SC function sends respective partial responses to the one or more CP clusters of the first node, wherein a global cache request is sent from the SC function of the first node to all other SC functions of other nodes, and wherein each of said all other SC functions respond with a partial response to the SC function of the first node to indicate whether they have the cache line without interrogating caches of CP clusters of the other nodes.

8. A computer system for maintaining cache coherency, the computer system comprising: a first node comprising a first plurality of processor (CP) clusters, each CP cluster comprising a second plurality of processors, each CP cluster having a respective local cache shared by processors of a respective CP cluster, the processor system configured to perform a multi-tiered cache coherency protocol, the multi-tiered cache coherency protocol consisting of any one of local cache coherency operations within a node and global cache coherency operations between nodes, wherein cache coherency operations consist of any one of finding a cache line, obtaining the cache line and updating coherency state of the cache line, the computer system configured to perform a method comprising: for each cache line installed in a CP cluster of a node, designating one CP cluster of the node as a local intervention master (LIM) CP cluster for a respective cache line, the LIM CP cluster of the respective cache line being the CP cluster of the node having a most recently installed copy of the respective cache line, wherein a LIM CP cluster of the respective cache line is a designated point of coherency for the respective cache line; initiating a first cache request for access to a first cache line, by a first processor of a first CP cluster of the first node; based on the first cache line being not available in a local cache of the first CP cluster, broadcasting the first cache request, over a local fabr
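The local-fabric steps of claim 1, the LIM handoff of claim 2, and the directory-only lookup of claim 4 can be sketched in a few lines. This is a simplified model under stated assumptions, not the claimed hardware: the `Node` class, its field names, and the returned strings are hypothetical, the `lim` map doubles as a stand-in for the all-inclusive directory, and the source cluster is assumed to keep its copy after supplying the line (the claim text shown here does not say either way).

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Node:
    """Hypothetical model of one node: CP cluster caches plus an SC directory."""
    clusters: Dict[str, Set[int]] = field(default_factory=dict)  # cluster -> installed lines
    lim: Dict[int, str] = field(default_factory=dict)            # line -> current LIM cluster

    def directory_has(self, line: int) -> bool:
        # Stand-in for the all-inclusive directory: answers for the whole node
        # without interrogating any CP cluster cache.
        return line in self.lim

    def local_request(self, requester: str, line: int) -> str:
        cache = self.clusters.setdefault(requester, set())
        if line in cache:
            return "hit"  # available in the requester's own local cache
        if not self.directory_has(line):
            return "miss_go_global"  # the SC function would broadcast on the global fabric
        source = self.lim[line]      # the current LIM is the point of coherency
        cache.add(line)              # the line is installed in the requester's cache
        self.lim[line] = requester   # most recent installer becomes the new LIM
        return f"from_{source}"


node = Node(clusters={"A": {0x100}}, lim={0x100: "A"})
print(node.local_request("B", 0x100))  # line supplied by cluster A; B becomes LIM
print(node.local_request("B", 0x200))  # not held anywhere in the node
```

Keeping a single LIM per line means exactly one cluster answers each broadcast with data, which is the role the claims assign to the designated point of coherency.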
Coherency control relating to peripheral accessing, e.g. from DMA or I/O device · CPC title
with multilevel cache hierarchies · CPC title
Cache consistency protocols · CPC title
Details of cache memory · CPC title
Multiuser, multiprocessor or multiprocessing cache systems · CPC title