US9298621B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9298621-B2 |
| Application number | US-201113288996-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 4, 2011 |
| Priority date | Nov 4, 2011 |
| Publication date | Mar 29, 2016 |
| Grant date | Mar 29, 2016 |
Official abstract text for this publication.
A chip multi-processor (CMP) with virtual domain management. The CMP has a plurality of tiles each including a core and a cache, a mapping storage, a plurality of memory controllers, a communication bus interconnecting the tiles and the memory controllers, and machine-executable instructions. The tiles and memory controllers are responsive to the instructions to group the tiles into a plurality of virtual domains, each virtual domain associated with at least one memory controller, and to store a mapping unique to each virtual domain in the mapping storage.
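The grouping step in the abstract can be pictured with a minimal Python sketch. It is illustrative only and not the patented implementation: the `Tile` and `VirtualDomain` classes, the `group_tiles` function, the rule that a tile belongs to exactly one domain, and the encoding of the per-domain mapping as (lowest tile id, tile count) are all assumptions made for this example. The record itself says only that each domain has at least one memory controller and a unique mapping held in mapping storage.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Tile:
    tile_id: int
    # Per-tile mapping storage. The claims name BCID and CFW registers;
    # their bit layout is not given, so they are plain integers here.
    bcid: Optional[int] = None
    cfw: Optional[int] = None


@dataclass
class VirtualDomain:
    domain_id: int
    tile_ids: List[int]
    memory_controllers: List[int]
    mapping: Tuple[int, int]  # (BCID, CFW), unique to this domain


def group_tiles(
    tiles: Dict[int, Tile],
    plan: Dict[int, Tuple[List[int], List[int]]],
) -> List[VirtualDomain]:
    """Group tiles into domains. plan: domain_id -> (tile ids, controller ids)."""
    domains: List[VirtualDomain] = []
    assigned: set = set()
    for domain_id, (tile_ids, controllers) in plan.items():
        if not tile_ids:
            raise ValueError("a virtual domain needs at least one tile")
        if not controllers:
            raise ValueError("a virtual domain needs at least one memory controller")
        if assigned & set(tile_ids):
            # Assumption for this sketch: domains do not overlap.
            raise ValueError("a tile was placed in more than one domain")
        assigned |= set(tile_ids)
        # Hypothetical encoding: BCID = lowest tile id, CFW = tile count.
        # Disjoint domains have different lowest ids, so the pair is unique.
        mapping = (min(tile_ids), len(tile_ids))
        for tid in tile_ids:
            tiles[tid].bcid, tiles[tid].cfw = mapping
        domains.append(VirtualDomain(domain_id, list(tile_ids), list(controllers), mapping))
    return domains


# Example: eight tiles split into two domains, one memory controller each.
tiles = {i: Tile(i) for i in range(8)}
domains = group_tiles(tiles, {0: ([0, 1, 2, 3], [0]), 1: ([4, 5, 6, 7], [1])})
```

Writing the mapping into every tile of a domain mirrors the claim language that places the BCID and CFW registers in each tile.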
Opening claim text (preview).
We claim:
1. A chip multi-processor with virtual domain management comprising: a plurality of tiles formed in an integrated circuit, each tile comprising a core and a cache; a mapping storage; a plurality of memory controllers formed in the integrated circuit; a communication bus interconnecting the tiles and the memory controllers; and machine-executable instructions, the tiles and memory controllers responsive to the instructions to: group the tiles into a plurality of virtual domains, each virtual domain associated with at least one memory controller and including a dynamic home tile storing a directory, each virtual domain permitting each tile of the tiles within the virtual domain to access data stored in the cache of any other tile of the tiles of the virtual domain without having to employ directory hardware global to the tiles of the chip multi-processor by referencing the directory of the dynamic home tile of the virtual domain; and store a mapping unique to each virtual domain in the mapping storage, wherein when a requesting tile of a current domain requests a data item, the dynamic home tile for the data item is determined, and if no tile of the current domain has the data item, the dynamic home tile for the data item is migrated from a different domain to the current domain.
2. The chip multi-processor of claim 1 wherein the mapping storage comprises a base core identifier word (BCID) register and a configuration word (CFW) register in each tile.
3. The chip multi-processor of claim 1 and further comprising: a router in each tile; a plurality of clock sources each connected to provide a router clock signal to at least one router and a core clock signal to at least one core, each clock source responsive to an activity signal indicative of relative activity levels of the at least one router and at least one core to increase and decrease frequencies of the router and core clock signals according to the relative activity levels; and a plurality of power supplies each connected to provide electrical power to at least one router and at least one core, each power supply responsive to an activity signal indicative of relative activity levels of the at least one router and at least one core to increase and decrease the power provided to the at least one router and at least one core according to the relative activity levels.
4. A chip multi-processor with virtual domain management comprising: a plurality of tiles formed in an integrated circuit, each tile comprising: a core, a cache, and mapping storage; a plurality of memory controllers formed in the integrated circuit; a communication bus interconnecting the tiles and the memory controllers; and machine-executable instructions, the tiles and memory controllers responsive to the instructions to: group the tiles into a plurality of virtual domains, each virtual domain associated with at least one of the memory controllers and including a dynamic home tile storing a directory, each virtual domain permitting each tile of the tiles within the virtual domain to access data stored in the cache of any other tile of the tiles of the virtual domain without having to employ directory hardware global to the tiles of the chip multi-processor by referencing the directory of the dynamic home tile of the virtual domain; and store a mapping unique to each virtual domain in the mapping storage of the tiles in that virtual domain, wherein when a requesting tile of a current domain requests a data item, the dynamic home tile for the data item is determined, and if no tile of the current domain has the data item, the dynamic home tile for the data item is migrated from a different domain to the current domain.
5. The chip multi-processor of claim 4 wherein the mapping storage in each tile comprises a base core identifier word (BCID) register and a configuration word (CFW) register.
6. The chip multi-processor of claim 5 wherein the machine instructions provide a page table having a plurality of entries, each entry comprising: a cross-reference between a virtual data item address and a physical data item address; a BCID field; and a CFW field.
7. The chip multi-processor of claim 6 wherein, responsive to a data item request from an application running in a tile in a virtual domain, the machine instructions provide: a look-up of the physical address of the requested item in a translation lookaside buffer (TLB), if the requested item is found in the TLB, a comparison of the BCID and CFW of the tile from which the request originated with the BCID and CFW of the requested item in the TLB; if the comparison indicates a match, a calculation of a dynamic home node to identify a directory that indicates which cache in that virtual domain contains the item and an access of the requested item; and if the comparison does not indicate a match, a calculation of a dynamic home node using the BCID and CFW of the requested item to identify a directory in a remote virtual domain that indicates which cache in the remote virtual domain contains the item, an access of the requested item, and delivery of the requested item across domains to the requesting application.
8. The chip multi-processor of claim 7 wherein, if the requested item is not found in the TLB, the machine instructions provide: a check for a page fault; if a page fault does not occur, placement of the page table entry of the requested item together with the BCID and CFW in the TLB; and if a page fault occurs, if there is no page table entry containing a BCID and CFW of the requested item, entry into the page table of the BCID and CFW of the tile from which the request originated; an update of page table entries of the requested item for any applications related to the requested item if the page is shared by more than one application; and a fetch of the requested item from storage; and if the page table has an entry containing the BCID and CFW of the requested item, a fetch of the requested item from storage.
9. The chip multi-processor of claim 6 wherein the page table includes an entry for a memory page having a dynamic home node that is the same as a static home node for that page.
10. The chip multi-processor of claim 4 wherein the machine instructions provide: monitoring of memory bandwidths of the virtual domains; and if the monitoring shows memory bandwidth starvation in a first one of the virtual domains, a determination of whether there is a second one of the virtual domains with unused memory bandwidth, and if so, a re-allocation of at least one memory page in the memory controllers from the second virtual domain to the first.
11. The chip multi-processor of claim 4 and further comprising: a router in each tile; a plurality of clock sources each connected to provide a router clock signal to at least one router and a core clock signal to at least one core, each clock source responsive to an activity signal indicative of relative activity levels of the at least one router and at least one core to increase and decrease frequencies of the router and core clock signals according to the relative activity levels; and a plurality of power supplies each connected to provide electrical power to at least one router and at least one core, each power supply responsive to an activity signal indicative of relative activity levels of the at least one router and at least one core to increase and decrease the power provided to the at least one router and at least one core according to the relative activity levels.
12. A method of managing a chip multi-processor (CMP) through virtual domains, the CMP having a plurality of tiles and a plurality of m
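The request path recited in claim 7 (TLB look-up, BCID/CFW comparison, dynamic home node calculation) can be sketched roughly as follows. This is a reading aid under stated assumptions, not the claimed implementation. The claims do not give the home node formula, so `dynamic_home` uses a hypothetical one in which BCID is the lowest tile id of a domain and CFW is its tile count. `TlbEntry` and `lookup` are likewise invented names, and the TLB-miss path of claim 8 is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class TlbEntry:
    physical_addr: int
    bcid: int  # BCID recorded for the requested item
    cfw: int   # CFW recorded for the requested item


def dynamic_home(physical_addr: int, bcid: int, cfw: int) -> int:
    # Hypothetical formula (not from the claims): spread addresses over the
    # cfw tiles of the domain that starts at tile id bcid.
    return bcid + (physical_addr % cfw)


def lookup(
    tlb: Dict[int, TlbEntry],
    virtual_addr: int,
    tile_bcid: int,
    tile_cfw: int,
) -> Optional[Tuple[int, bool]]:
    """Return (home tile id, crosses_domains), or None on a TLB miss."""
    entry = tlb.get(virtual_addr)
    if entry is None:
        return None  # TLB miss: page-fault handling (claim 8) is not modeled
    # Compare the requesting tile's BCID/CFW with those stored for the item.
    same_domain = (entry.bcid, entry.cfw) == (tile_bcid, tile_cfw)
    # On a match the two pairs are equal, so the item's own BCID/CFW
    # serves for both the local and the remote case.
    home = dynamic_home(entry.physical_addr, entry.bcid, entry.cfw)
    return home, not same_domain


# Example: the item belongs to a domain with BCID 4 and CFW 4.
tlb = {0x1000: TlbEntry(physical_addr=0x8003, bcid=4, cfw=4)}
remote = lookup(tlb, 0x1000, tile_bcid=0, tile_cfw=4)  # request from another domain
local = lookup(tlb, 0x1000, tile_bcid=4, tile_cfw=4)   # request from the same domain
miss = lookup(tlb, 0x2000, tile_bcid=0, tile_cfw=4)    # address not in the TLB
```

In both the local and the remote case the home tile would then consult its directory to find which cache in that domain holds the item. The sketch stops at identifying the home tile.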
using directory methods · CPC title
associated with a data cache · CPC title
Non-uniform memory access [NUMA] architecture · CPC title
Two dimensional, e.g. mesh, torus · CPC title
Address translation · CPC title