Compiler method

US11262787B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11262787-B2
Application number  US-202016744249-A
Country             US
Kind code           B2
Filing date         Jan 16, 2020
Priority date       Oct 20, 2017
Publication date    Mar 1, 2022
Grant date          Mar 1, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The invention relates to a computer implemented method of generating multiple programs to deliver a computerised function, each program to be executed in a processing unit of a computer comprising a plurality of processing units each having instruction storage for holding a local program, an execution unit for executing the local program and data storage for holding data, a switching fabric connected to an output interface of each processing unit and connectable to an input interface of each processing unit by switching circuitry controllable by each processing unit, and a synchronisation module operable to generate a synchronisation signal, the method comprising: generating a local program for each processing unit comprising a sequence of executable instructions; determining for each processing unit a relative time of execution of instructions of each local program whereby a local program allocated to one processing unit is scheduled to execute with a predetermined delay relative to a synchronisation signal a send instruction to transmit at least one data packet at a predetermined transmit time, relative to the synchronisation signal, destined for a recipient processing unit but having no destination identifier, and a local program allocated to the recipient processing unit is scheduled to execute at a predetermined switch time a switch control instruction to control the switching circuitry to connect its processing unit wire to the switching fabric to receive the data packet at a receive time.
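The timing relationship in the abstract can be illustrated with a minimal sketch. It is not the patented compiler: the `Delays` fields, the instruction strings, and the rule that the switch control instruction is issued as late as possible are all assumptions made for illustration. The sketch only shows how a transmit time and fixed delays, all measured from the synchronisation signal, determine when the recipient must switch and when the data arrives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delays:
    """Hypothetical fixed delays, in clock cycles, for one sender/recipient pair."""
    send_to_mux: int     # send instruction issued -> packet reaches recipient's multiplexer
    mux_to_input: int    # multiplexer -> recipient's input interface
    switch_control: int  # switch control instruction issued -> multiplexer reconfigured


def schedule_exchange(transmit_time: int, delays: Delays) -> tuple[int, int]:
    """Return (switch_time, receive_time), both relative to the synchronisation signal."""
    arrival_at_mux = transmit_time + delays.send_to_mux
    receive_time = arrival_at_mux + delays.mux_to_input
    # Assumption: issue the switch as late as possible while still having the
    # multiplexer connected when the packet reaches it.
    switch_time = arrival_at_mux - delays.switch_control
    if switch_time < 0:
        raise ValueError("transmit_time is too early for the recipient to switch in time")
    return switch_time, receive_time


def build_local_programs(sender: int, recipient: int, transmit_time: int, delays: Delays):
    """Build one timed instruction list per processing unit (illustrative format)."""
    switch_time, receive_time = schedule_exchange(transmit_time, delays)
    return {
        # The send carries only a payload: there is no destination identifier.
        sender: [(transmit_time, "SEND payload")],
        # The recipient picks the sender's wires on the fabric; DATA_ARRIVES is a
        # marker for the receive time, not an instruction.
        recipient: [(switch_time, f"SWITCH listen_to={sender}"),
                    (receive_time, "DATA_ARRIVES")],
    }


programs = build_local_programs(
    sender=0, recipient=5, transmit_time=10,
    delays=Delays(send_to_mux=4, mux_to_input=1, switch_control=2),
)
```

With these made-up delays, the sender transmits at cycle 10, the recipient issues its switch control instruction at cycle 12, and the data reaches its input interface at cycle 15. Routing is decided entirely by when each program executes, which is why the packet needs no address.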

First claim

Opening claim text (preview).

What is claimed is:
1. A computer implemented method of generating multiple programs to be executed in a computer comprising a plurality of processing units, the method comprising: generating a first local program for a first processing unit of the plurality of processing units, the first local program comprising a first sequence of executable instructions; configuring the first local program to transmit a data packet having no destination identifier at a transmit time, relative to a synchronisation signal, and destined for a second processing unit; generating a second local program for the second processing unit of the plurality of processing units, the second local program comprising a second sequence of executable instructions; and scheduling the second local program to control switching circuitry to receive the data packet at a receive time.
2. The method of claim 1, wherein the first processing unit and the second processing unit have a fixed positional relationship with respect to each other, and the configuring the first local program comprises determining a fixed delay based on the positional relationship between the first processing unit and the second processing unit.
3. The method of claim 2, wherein the fixed positional relationship comprises an array of rows and columns, wherein the first processing unit has a first identifier which identifies its position in the array, and wherein the second processing unit has a second identifier which identifies its position in the array.
4. The method of claim 1, wherein the switching circuitry comprises a multiplexer having an output set of wires connected to the second processing unit and multiple sets of input wires connectable to a switching fabric, the multiplexer located on the computer at a physical location with respect to the second processing unit, and wherein the configuring the first local program comprises determining a fixed delay for a switch control instruction to reach the multiplexer and the data packet to reach an input interface of the second processing unit from the multiplexer.
5. The method of claim 1, further comprising providing in the first local program a synchronisation instruction which indicates that a compute phase at the first processing unit has completed.
6. The method of claim 5, wherein the configuring the first local program comprises determining for the first processing unit a fixed delay between a synchronisation event on a chip and receiving back at the first processing unit an acknowledgement that the synchronisation event has occurred.
7. The method of claim 1, wherein the configuring the first local program comprises accessing a look-up table holding information about delays enabling the transmit time at the first processing unit and a switching time at the second processing unit to be determined.
8. The method of claim 1, wherein the first local program and the second local program deliver a machine learning function.
9. A compiler having a processor programmed to carry out a method of generating multiple programs to deliver a computerised function, each program to be executed in a computer comprising a plurality of processing units, the method comprising: generating a first local program for a first processing unit, the first local program comprising a first sequence of executable instructions; configuring the first local program to execute with a delay relative to a synchronisation signal a send instruction to transmit a data packet having no destination identifier at a transmit time and destined for a second processing unit; and generating a second local program for the second processing unit, including scheduling the second local program to execute at a switch time a switch control instruction to connect the second processing unit to a switching fabric to receive the data packet at a receive time.
10. The compiler of claim 9, wherein the compiler is configured to receive a fixed graph structure representing the computerised function and a table holding delays enabling the transmit time and the switch time to be determined.
11. The compiler of claim 10, wherein the fixed graph structure comprises a plurality of nodes, each node being represented by a codelet in the first local program.
12. The compiler of claim 10, wherein the fixed graph structure comprises a plurality of nodes represented by a codelet in the second local program.
13. The compiler of claim 9, wherein the configuring the first local program comprises determining a fixed delay based on a positional relationship between the first processing unit and the second processing unit.
14. The compiler of claim 9, the method further comprising providing in the first local program a synchronisation instruction which indicates that a compute phase at the first processing unit has completed.
15. A computer program recorded on non transmissible media and comprising computer readable instructions which when executed by a processor of a compiler implement a method, the method comprising: generating a first local program for a first processing unit, the first local program comprising a first sequence of executable instructions; configuring the first local program to transmit a data packet having no destination identifier at a transmit time, relative to a synchronisation signal, and destined for a second processing unit; generating a second local program for the second processing unit, the second local program comprising a second sequence of executable instructions; and scheduling the second local program to control switching circuitry to receive the data packet at a receive time.
16. The computer program of claim 15, wherein the configuring the first local program comprises determining a fixed delay based on a positional relationship between the first processing unit and the second processing unit.
17. The computer program of claim 15, the method further comprising: providing in the first local program a synchronisation instruction which indicates that a compute phase at the first processing unit has completed.
18. The computer program of claim 15, wherein the configuring the first local program comprises determining for the first processing unit a fixed delay between a synchronisation event on a chip and receiving back at the first processing unit an acknowledgement that the synchronisation event has occurred.
19. The computer program of claim 15, wherein the configuring the first local program comprises accessing a look-up table holding information about delays enabling the transmit time at the first processing unit and a switching time at the second processing unit to be determined.
20. The computer program of claim 15, wherein the method further comprises receiving at the compiler a fixed graph structure representing a computerised function and a table holding delays enabling the transmit time and a switch time to be determined.
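Claims 2, 3 and 7 describe delays that are fixed by the positions of the processing units in an array and held in a look-up table. The sketch below is one hypothetical way such a table could be built. The row-major identifier encoding, the Manhattan-distance delay formula, and the `base_cycles` and `cycles_per_hop` parameters are assumptions for illustration and are not taken from the claims.

```python
from itertools import product


def position(tile_id: int, columns: int) -> tuple[int, int]:
    """Decode an identifier into (row, column), assuming row-major numbering."""
    return divmod(tile_id, columns)


def build_delay_table(rows: int, columns: int,
                      base_cycles: int = 2, cycles_per_hop: int = 1) -> dict:
    """Map each (sender, recipient) pair to a fixed delay in clock cycles.

    Assumed model: a constant cost plus a per-hop cost over the Manhattan
    distance between the two positions in the array.
    """
    table = {}
    tiles = range(rows * columns)
    for src, dst in product(tiles, repeat=2):
        if src == dst:
            continue
        (r1, c1), (r2, c2) = position(src, columns), position(dst, columns)
        hops = abs(r1 - r2) + abs(c1 - c2)
        table[(src, dst)] = base_cycles + cycles_per_hop * hops
    return table


def receive_time(table: dict, src: int, dst: int, transmit_time: int) -> int:
    """Look up the fixed delay and add it to the transmit time."""
    return transmit_time + table[(src, dst)]


table = build_delay_table(rows=2, columns=4)
arrival = receive_time(table, src=0, dst=5, transmit_time=10)
```

Because every entry depends only on the two positions, the table can be computed once before compilation and consulted whenever a transmit time or switching time has to be fixed.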

Assignees

Graphcore Ltd

Inventors

Classifications

  • from multiple instruction streams, e.g. multistreaming · CPC title

  • using a plurality of independent parallel functional units · CPC title

  • using switching circuits, e.g. switching matrix, connection or expansion network (G06F13/4009 takes precedence) · CPC title

  • G06F8/451 · Primary

    Code distribution (considering CPU load at run-time G06F9/505; load rebalancing G06F9/5083) · CPC title

  • Synchronisation or serialisation instructions · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11262787B2 cover?
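A computer implemented method of generating multiple programs to deliver a computerised function on a computer with a plurality of processing units connected by a switching fabric. A local program is generated for each processing unit. One local program is scheduled to execute a send instruction at a predetermined transmit time relative to a synchronisation signal, sending a data packet that has no destination identifier. The recipient's local program is scheduled to execute a switch control instruction at a predetermined switch time, so that its switching circuitry is connected to the fabric to receive the packet at the receive time.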
Who is the assignee on this patent?
Graphcore Ltd
What technology area does this patent fall under?
Primary CPC classification G06F8/451. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 1, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).