Embedding Rings on a Toroid Computer Network
US-2021349847-A1 · Nov 11, 2021 · US
US11645225B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11645225-B2 |
| Application number | US-202217818855-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 10, 2022 |
| Priority date | Mar 27, 2019 |
| Publication date | May 9, 2023 |
| Grant date | May 9, 2023 |
Official abstract text for this publication.
A computer, including a plurality of processing nodes arranged in two-dimensional arrays in respective front and rear layers. Each processing node has a set of activatable links. When activated, transmission of data items between the nodes connected via the activated link is enabled. When not activated, transmission of data items between the nodes is prevented. The set of activatable links including a respective link which connects the processing node to each adjacent node in the array, and to a facing processing node in the other layer. An allocation engine is configured to receive an allocation instruction and connected to the processing nodes to selectively activate the links in a configuration.
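The link-activation rule summarized in the abstract, and spelled out as conditions (i) to (iii) of claim 1, can be illustrated with a small sketch. Everything named here is hypothetical and not taken from the patent: the function `configure_links`, the `(layer, row, col)` node tuples, and the `row0`/`col0` offsets that place a p×q group inside the n×m array. Each link is modelled as an unordered pair of nodes, and only one facing link per node pair is modelled (the doubled facing links of claims 2 and 3 are left out).

```python
from itertools import product


def configure_links(n, m, p, q, row0=0, col0=0):
    """Return the set of activated links for a p x q group in two n x m layers.

    Nodes are (layer, row, col) tuples; a link is a frozenset of two nodes.
    This is an illustrative model, not the patent's allocation engine.
    """
    if p < 1 or q < 1 or row0 < 0 or col0 < 0 or row0 + p > n or col0 + q > m:
        raise ValueError("group does not fit inside the n x m array")

    rows = range(row0, row0 + p)
    cols = range(col0, col0 + q)
    active = set()

    # (i) Activate links between adjacent group nodes within each layer.
    for layer, r, c in product((0, 1), rows, cols):
        if r + 1 < row0 + p:
            active.add(frozenset({(layer, r, c), (layer, r + 1, c)}))
        if c + 1 < col0 + q:
            active.add(frozenset({(layer, r, c), (layer, r, c + 1)}))

    # (ii) Activate facing links only for nodes on the edge of the group.
    for r, c in product(rows, cols):
        on_edge = r in (row0, row0 + p - 1) or c in (col0, col0 + q - 1)
        if on_edge:
            active.add(frozenset({(0, r, c), (1, r, c)}))

    # (iii) Links to nodes outside the group are never added, so they
    # stay deactivated.
    return active


links = configure_links(n=8, m=8, p=4, q=4)
print(len(links))
```

For a 4×4 group this activates 24 in-layer links per layer plus 12 facing links at the group's edge nodes, while the four interior nodes of each layer keep their facing links off.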
Opening claim text (preview).
The invention claimed is:
1. A computer comprising a plurality of processing nodes arranged in respective front and rear layers, each layer comprising a two-dimensional array of processing nodes, each processing node having a set of activatable links which, when activated, enable a transmission of data items between the processing node and an adjacent processing node connected via an activated link and, when not activated, prevent the transmission of data items between the processing node and the adjacent processing node connected via an inactive link, the set of activatable links comprising for each processing node in a first layer and a second layer a respective link which connects the processing node to each adjacent node in the two-dimensional array, and to a facing processing node in the second layer or the first layer respectively; and an allocation engine configured to receive an allocation instruction and connected to the processing nodes to selectively activate the links to connect at least a group of the processing nodes in a configuration in which: (i) links between adjacent nodes within each of the first layer and the second layer respectively are activated; (ii) links between facing nodes are only activated for edge processing nodes of the group; and (iii) links between processing nodes outside the group and adjacent processing nodes in the group are deactivated.
2. The computer according to claim 1 wherein the set of activatable links comprises two such respective links connecting the processing node to its facing processing node.
3. The computer according to claim 2 wherein in the configuration two links are activated between corner facing nodes of the group.
4. The computer according to claim 1 wherein the links are bi-directional links.
5. The computer according to claim 1, wherein the two-dimensional array is an array of n by m processing nodes, and wherein the group comprises an array of p×q processing nodes in the first layer where at least one condition is satisfied: p is less than n or q is less than m.
6. The computer according to claim 5 where m equals n.
7. The computer according to claim 5 where p equals q.
8. The computer according to claim 1 wherein each link when activated has a fixed power requirement independent of data traffic.
9. The computer according to claim 1 wherein each link when deactivated consumes no power.
10. The computer according to claim 1 wherein the allocation engine comprises one or more processor configured to execute allocation computer code responsive to a user request.
11. A method of configuring a computer comprising a plurality of processing nodes arranged in respective front and rear layers, each layer comprising a two-dimensional array of processing node, each processing node having a set of activatable links which, when activated, enable a transmission of data items between the processing node and an adjacent processing node connected via an activated link and, when not activated, prevent the transmission of data items between the processing node and the adjacent processing node connected via an inactive link, the set of activatable links comprising for each processing node in a first layer and a second layer a respective link which connects the processing node to each adjacent node in the array, and to a facing processing node in the second layer or the first layer respectively, the method comprising: selectively activating the links of each processing node in at least a group of the processing nodes to generate a networked configuration of processing nodes in which: (i) links between adjacent nodes within each of the first layer and the second layer respectively activated; (ii) links between facing nodes are only activated for edge processing nodes of the group; and (iii) links between processing nodes outside the group and adjacent the processing nodes are deactivated.
12. The method according to claim 11 selectively activating a link comprises providing power to a link, wherein the links have a power requirement independent of transmitted traffic.
13. The method according to claim 12 wherein the links operate to transmit data by a variation in voltage from a powered voltage level on the link.
14. The method according to claim 11 comprising the further step of operating the group of the processing nodes in the networked configuration using m rings in each of two dimensions, where each ring is formed by n nodes, where n is a number of edge processing nodes in the networked configuration.
15. The method according to claim 14 comprising dividing a partial vector generated at each processing node of the networked configuration into fragments and implementing logical rings for the fragments in the partial vector to implement an Allreduce collective.
16. The method according to claim 15 wherein an Allreduce collective is implemented by a reduce-scatter collective followed by an Allgather collective in the logical rings.
17. The method according to claim 15 comprising implementing the logical rings in forwards and backwards directions in each dimension.
18. The method according to claim 16 comprising implementing the logical rings in forwards and backwards directions in each dimension.
19. A computer comprising a plurality of processing nodes arranged in respective front and rear layers, each layer comprising a two-dimensional array of processing nodes, each processing node having a set of activated links which enable a transmission of data items between the processing node and an adjacent processing node connected via an activated link, wherein the processing nodes are connected in a configuration in which: (i) adjacent nodes within each of a first layer and a second layer are connected by activated links; (ii) edge-processing nodes in each of the first layer and the second layer are connected by activated links to their facing node in a corresponding layer; and (iii) any links between additional processing nodes outside the configuration and the processing nodes connected in the configuration are deactivated such that the transmission of data items is prevented between the processing nodes connected in the configuration and the additional processing nodes outside the configuration.
20. The computer according to claim 19, wherein the processing nodes of the configuration form a set of connected rings in each of X and Y directions, wherein each ring comprises a same number of processing nodes.
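Claims 15 and 16 refer to an Allreduce implemented as a reduce-scatter followed by an Allgather over logical rings of fragments. The following is a minimal simulation of that general two-phase pattern on a single one-directional ring, offered as a sketch rather than the patent's implementation. The function name `ring_allreduce`, the list-based data layout, and the use of addition as the reduction operator are all assumptions. The two-dimensional, forwards-and-backwards ring embedding of claims 14, 17 and 18 is not modelled.

```python
def ring_allreduce(partials):
    """Simulate a sum-Allreduce over one ring of len(partials) nodes.

    partials[i] is the partial vector held by node i. Each vector is split
    into one fragment per node. Illustrative only: a real system would
    exchange fragments over links instead of shared Python lists.
    """
    n = len(partials)
    length = len(partials[0])
    if any(len(v) != length for v in partials) or length % n != 0:
        raise ValueError("vectors must share a length divisible by node count")
    size = length // n

    # frags[i][k] is node i's current copy of fragment k.
    frags = [[list(v[k * size:(k + 1) * size]) for k in range(n)]
             for v in partials]

    # Phase 1, reduce-scatter: in step s, node i sends fragment (i - s) mod n
    # to its ring successor, which adds it to its own copy.
    for step in range(n - 1):
        outgoing = [((i - step) % n, frags[i][(i - step) % n][:])
                    for i in range(n)]
        for i, (k, data) in enumerate(outgoing):
            dst = (i + 1) % n
            frags[dst][k] = [a + b for a, b in zip(frags[dst][k], data)]
    # Node i now holds the fully reduced fragment (i + 1) mod n.

    # Phase 2, Allgather: in step s, node i forwards fragment
    # (i + 1 - s) mod n, and the successor overwrites its stale copy.
    for step in range(n - 1):
        outgoing = [((i + 1 - step) % n, frags[i][(i + 1 - step) % n][:])
                    for i in range(n)]
        for i, (k, data) in enumerate(outgoing):
            frags[(i + 1) % n][k] = data

    # Reassemble each node's full reduced vector from its fragments.
    return [[x for frag in node for x in frag] for node in frags]


result = ring_allreduce([
    [1, 2, 3, 4, 5, 6],
    [10, 20, 30, 40, 50, 60],
    [100, 200, 300, 400, 500, 600],
])
print(result[0])
```

Each phase takes n - 1 steps, and in every step each node sends exactly one fragment to its ring successor, so all links of the ring carry traffic at the same time.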
Parallel communications techniques, e.g. gather, scatter, reduce, broadcast, multicast, all to all · CPC title
Three dimensional, e.g. hypercubes · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Intercommunication techniques · CPC title
One dimensional, e.g. linear array, ring · CPC title