Network computer with two embedded rings

US11625356B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11625356-B2
Application number: US-202117211232-A
Country: US
Kind code: B2
Filing date: Mar 24, 2021
Priority date: Mar 26, 2020
Publication date: Apr 11, 2023
Grant date: Apr 11, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer comprising a plurality of interconnected processing nodes arranged in a configuration in which multiple layers of interconnected nodes are arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by at least respective intralayer link between each pair of neighbouring processing nodes, wherein each of the at least four processing nodes in each layer is connected to a respective corresponding node in one or more adjacent layer by a respective interlayer link, the computer being programmed to provide in the configuration two embedded one dimensional paths and to transmit data around each of the two embedded one dimensional paths, each embedded one dimensional path using all processing nodes of the computer in such a manner that the two embedded one dimensional paths operate simultaneously without sharing links.
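
The two embedded paths described in the abstract can be illustrated with a small graph check. The sketch below assumes a toroid arrangement of 6 layers with 4 nodes each, where the last layer links back to the first, and builds two "staircase" paths that alternate intralayer and interlayer links. The function names `staircase_path` and `links` and the staircase construction itself are illustrative assumptions, not the routing disclosed in the patent.

```python
def staircase_path(layers, ring, start_with_intralayer):
    """Return a closed path as a list of (layer, position) nodes.

    The path alternates one intralayer hop (position + 1 around the ring)
    with one interlayer hop (layer + 1 along the axis), wrapping around in
    both directions. Hypothetical helper for illustration only.
    """
    z, p = 0, 0
    intralayer = start_with_intralayer
    path = []
    for _ in range(layers * ring):
        path.append((z, p))
        if intralayer:
            p = (p + 1) % ring
        else:
            z = (z + 1) % layers
        intralayer = not intralayer
    return path


def links(path):
    """Undirected links used by a closed path, including last -> first."""
    n = len(path)
    return {frozenset((path[i], path[(i + 1) % n])) for i in range(n)}


# Assumed shape: 6 layers of 4 nodes. This simple staircase only covers
# every node for some shapes; it is not a general construction.
path_a = staircase_path(6, 4, start_with_intralayer=True)
path_b = staircase_path(6, 4, start_with_intralayer=False)

print(len(set(path_a)), len(set(path_b)))        # nodes visited by each path
print(links(path_a).isdisjoint(links(path_b)))   # whether any link is shared
```

For this 6 x 4 shape each path visits all 24 nodes exactly once, and the two paths use disjoint sets of links, so between them they occupy all four links of every node. Other shapes, and the non-toroid arrangement with switched endmost layers in claim 1, would need different path constructions.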

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer comprising: a plurality of interconnected processing nodes arranged in a configuration in which multiple layers of interconnected nodes are arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by at least one respective intralayer links between each pair of neighbouring processing nodes, wherein each of the at least four processing nodes in each layer is connected to a respective corresponding node in one or more adjacent layer by a respective interlayer link, the computer being programmed to provide in the configuration two embedded one-dimensional paths and to transmit data around each of the two embedded one-dimensional paths, each embedded one-dimensional path using all processing nodes of the computer in such a manner that the two embedded one-dimensional paths operate simultaneously without sharing links, wherein the multiple layers comprise first and second endmost layers and at least one intermediate layer between the first and second endmost layers, wherein each processing node in the first endmost layer is connected to a non-neighbouring node in the first endmost layer in addition to its neighbouring node, and each processing node in the second endmost layer is connected to a non-neighbouring node in the second endmost layer in addition to its neighbouring node, and wherein at least one of the interlayer and intralayer links of processing nodes in the first endmost layer comprise switching circuitry operable to disconnect the processing node from its corresponding node in the second endmost layer and connect it to a non-neighbouring node in the first endmost layer.

2. The computer of claim 1, wherein the configuration is a toroid configuration in which respective connected corresponding nodes of the multiple layers form at least four axial rings.

3. The computer of claim 1 wherein at least one of the interlayer and intralayer links comprise switching circuitry operable to connect one of the processing nodes selectively to one of multiple other processing nodes.

4. The computer of claim 1, wherein each processing node is configured to output data on its respective intralayer and interlayer links with the same bandwidth utilisation on each of the intralayer and interlayer links of the processing node.

5. The computer of claim 1, wherein each layer of the multiple layers has exactly four nodes.

6. The computer of claim 1 which comprises a number of layers arranged along the axis which is greater than the number of processing nodes in each layer.

7. The computer of claim 1 which comprises a number of layers arranged along the axis which is the same as the number of nodes in each layer.

8. The computer of claim 1 wherein the intralayer and interlayer links comprise fixed connections between the processing nodes.

9. The computer of claim 1 wherein at least one of the interlayer links of processing nodes in the first endmost layer comprise switching circuitry operable to disconnect the processing node from its neighbouring node in the first endmost layer and connect it to a corresponding node in the second endmost layer.

10. The computer of claim 1 wherein each embedded one-dimensional path comprises alternating sequences of one of the interlayer links and one of the intralayer links.

11. The computer of claim 1 in which each one-dimensional embedded path comprises a sequence of processing nodes which are visited in a direction in each layer which is the same in all layers within each one-dimensional path.

12. The computer of claim 1 in which each one-dimensional embedded path comprises a sequence of processing nodes which are visited in a direction in each layer which is different in successive layers within each one-dimensional path.

13. The computer of claim 1 comprising six layers, each having four processing nodes connected in a non-axial ring.

14. The computer of claim 1 which comprises eight layers, each having eight processing nodes connected in a non-axial ring.

15. The computer of claim 1 which comprises eight layers each having four processing nodes connected in a ring.

16. The computer of claim 1 which comprises four layers, each having four processing nodes connected in a ring.

17. A computer comprising: a plurality of interconnected processing nodes arranged in a configuration in which multiple layers of interconnected nodes are arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by at least one respective intralayer links between each pair of neighbouring processing nodes, wherein each of the at least four processing nodes in each layer is connected to a respective corresponding node in one or more adjacent layer by a respective interlayer link, the computer being programmed to provide in the configuration two embedded one-dimensional paths and to transmit data around each of the two embedded one-dimensional paths, each embedded one-dimensional path using all processing nodes of the computer in such a manner that the two embedded one-dimensional paths operate simultaneously without sharing links, wherein each processing node is programmed to divide a respective partial vector of that processing node into fragments and to transmit the data in the form of successive fragments around each embedded one-dimensional path.

18. The computer of claim 17 which is programmed to operate each path as a set of logical rings, wherein the successive fragments are transmitted around each logical ring in simultaneous transmission steps.

19. The computer of claim 17, wherein each processing node is configured to output a respective fragment on each of two links simultaneously, wherein the fragment output on each of the links has approximately the same size.

20. The computer of claim 17, wherein each processing node is configured to reduce multiple incoming fragments with multiple respective corresponding locally stored fragments.

21. The computer of claim 20, wherein each processing node is configured to transmit fully reduced fragments on each of its intralayer and interlayer links simultaneously in an Allgather phase of an Allreduce collective.

22. The computer of claim 1, programmed to transmit the data in data transmission steps such that each link of a processing node is utilised with the same bandwidth as other links of that processing node in each data transmission step.

23. A method of generating a set of programs to be executed in parallel on a computer comprising a plurality of processing nodes connected in a configuration with multiple layers arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by a respective intralayer link between each pair of neighbouring processing nodes, wherein processing nodes in each layer are connected to respective corresponding nodes in each adjacent layer by an interlayer link, the method comprising: generating a first data transmission instruction for a first program to define a first data transmission stage in which data is transmitted from a first node executing the first program, wherein the first data transmission instruction comprises a first link identifier which defines a first outgoing link on which data is to be transmitted from the first node in the first data transmission stage; generating a second data transmission instruction for a second program to define a second data transmission stage in whi…
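
Claims 17 to 21 describe each node dividing its partial vector into fragments, reducing incoming fragments against locally stored ones, and then circulating fully reduced fragments in an Allgather phase. A minimal sketch of that pattern is the textbook ring Allreduce below, simulated in plain Python. The names `ring_allreduce` and `two_path_allreduce`, the even split of each vector across the two paths, and the example node orderings are assumptions made for illustration; the patent's own scheduling is not reproduced here.

```python
def ring_allreduce(partials, order):
    """Sum-Allreduce equal-length partial vectors around one logical ring.

    partials: dict mapping node id -> list of numbers (its partial vector)
    order:    node ids in the order the ring visits them
    """
    n = len(order)
    size = len(partials[order[0]])
    assert size % n == 0, "sketch assumes the vector splits evenly"
    flen = size // n
    # Each node divides its partial vector into n fragments.
    frags = {node: [list(partials[node][k * flen:(k + 1) * flen])
                    for k in range(n)] for node in order}

    def step(pick, combine):
        # All nodes transmit in the same step: snapshot outgoing data first.
        msgs = []
        for i, node in enumerate(order):
            k = pick(i)
            msgs.append((order[(i + 1) % n], k, list(frags[node][k])))
        for dest, k, data in msgs:
            frags[dest][k] = combine(frags[dest][k], data)

    # Reduce-scatter: add each incoming fragment to the local copy.
    for s in range(n - 1):
        step(lambda i: (i - s) % n,
             lambda mine, inc: [a + b for a, b in zip(mine, inc)])
    # Allgather: pass fully reduced fragments around, overwriting local ones.
    for s in range(n - 1):
        step(lambda i: (i + 1 - s) % n, lambda mine, inc: inc)

    return {node: [x for frag in frags[node] for x in frag] for node in order}


def two_path_allreduce(partials, order_a, order_b):
    """Send the first half of each vector around path A, the rest around B.

    Simulated one path after the other; on hardware with link-disjoint
    paths the two halves could travel at the same time.
    """
    half = len(next(iter(partials.values()))) // 2
    first = ring_allreduce({n: v[:half] for n, v in partials.items()}, order_a)
    second = ring_allreduce({n: v[half:] for n, v in partials.items()}, order_b)
    return {n: first[n] + second[n] for n in partials}


# Four hypothetical nodes, each holding an 8-element partial vector.
partials = {i: [i + j for j in range(8)] for i in range(4)}
result = two_path_allreduce(partials, [0, 1, 2, 3], [3, 2, 1, 0])
print(result[0])
```

After the call, every node holds the same element-wise sum of all four partial vectors. Splitting the vector in half is one simple way to give both paths about the same amount of data, in the spirit of the roughly equal fragment sizes mentioned in claim 19.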

Assignees

Graphcore Ltd

Inventors

Classifications

  • One dimensional, e.g. linear array, ring · CPC title

  • Two dimensional, e.g. mesh, torus · CPC title

  • Parallel communications techniques, e.g. gather, scatter, reduce, broadcast, multicast, all to all · CPC title

  • Three dimensional, e.g. hypercubes · CPC title

  • Electrical coupling · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11625356B2 cover?
A computer comprising a plurality of interconnected processing nodes arranged in a configuration in which multiple layers of interconnected nodes are arranged along an axis, each layer comprising at least four processing nodes connected in a non-axial ring by at least respective intralayer link between each pair of neighbouring processing nodes, wherein each of the at least four processing node…
Who is the assignee on this patent?
Graphcore Ltd
What technology area does this patent fall under?
Primary CPC classification G06F15/17381. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 11, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).