What technology area does this patent fall under?

Primary CPC classification G06F8/41. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Aug 08 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Compiling a program from a graph

US11720332B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11720332-B2
Application number	US-201916527410-A
Country	US
Kind code	B2
Filing date	Jul 31, 2019
Priority date	Apr 2, 2019
Publication date	Aug 8, 2023
Grant date	Aug 8, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for generating an executable program to run on one or more processor modules. The method comprises: receiving a graph comprising a plurality of data nodes, compute vertices and edges; and compiling the graph into an executable program including one or more types of multi-access instruction each of which performs at least two memory access (load and/or store) operations in a single instruction. The memory on each processor module comprises multiple memory banks whereby the same bank cannot be accessed by different load or store operations in the same instruction. The compilation comprises assigning instances of the multi-access instructions to implement at least some of the graph edges, and allocating the data to memory addresses within different ones of the banks. The allocating is performed subject to one or more constraints, including at least that different load/store operations should not access the same memory bank in the same instruction.

First claim

Opening claim text (preview).

The invention claimed is: 1. A computer-implemented method for generating an executable program to run on a system of one or more processor chips each comprising one or more processor modules, each processor module comprising an execution unit and memory; the method comprising: receiving a graph comprising a plurality of data nodes, a plurality of compute vertices and a plurality of directional edges, each data node representing a data element, each edge representing an input to a compute vertex from a data node or an output from a compute vertex input to a data node or another compute vertex, and each compute vertex representing one or more computations to perform on its input or inputs in order to produce the output or outputs from that compute vertex; compiling the graph into said executable program, the executable program comprising a plurality of machine code instructions, including one or more types of multi-access instruction each of which performs at least two load operations, at least two store operations, or at least one load and one store operation in a single instruction; wherein the memory on each of the processor modules comprises a respective plurality of memory banks; and the compilation comprises assigning instances of said multi-access instructions to implement at least some of said edges, analyzing the graph, and in dependence upon a result of analyzing the graph, allocating the data elements to memory addresses within different ones of the banks, wherein the allocating applies one or more constraints including at least a first constraint that no two edges can access the same memory bank at the same time; wherein allocating the data elements comprises: grouping the data elements into equivalence classes, wherein a first equivalence class includes a first set of the data elements that interfere with a second set of other ones of the data elements of a second equivalence class; generating ordered equivalence classes at least in part by ordering the equivalence classes according to a metric; and stepping through the data elements within the ordered equivalence classes to allocate each of the data elements in turn, wherein the metric comprises a first metric, the method further comprising: ordering the data elements within each equivalence class according to a second metric that is different from the first metric; and for each equivalence class, stepping through the ordered data elements within a respective equivalence class to allocate each of the data elements in turn. 2. The method of claim 1 , wherein the constraints further comprise an additional constraint on each of one or more of the edges outputting from a compute vertex, specifying that the data output by the edge should be stored with a specified alignment with respect to the memory addresses. 3. The method of claim 1 , wherein the constraints further comprise an additional constraint on each of one or more of the edges outputting from a compute vertex, specifying that the data output by the edge should be stored in a specified subset of the memory banks. 4. The method of claim 3 , wherein the specified subset comprises a region of interleaved memory. 5. The method of claim 1 , wherein at least one of the vertices comprises a loop, and the constraints further comprise an additional constraint on at least one of the edges outputting from the loop, specifying that an overspill region is left beyond an end of the memory addresses in which the data output by the edge is to be stored. 6. The method of claim 1 , wherein the multi-access instructions include at least a load-store instruction which performs a load operation and a store operation in the same instruction. 7. The method of claim 1 , wherein the multi-access instructions include at least a double-load instruction which performs two load operations in the same instruction. 8. The method of claim 1 , wherein the multi-access instructions include at least a load-load-store instruction which performs two load operations and a store operation in a single instruction. 9. The method of claim 1 , wherein the metric comprises one or more items selected from a list consisting of: a total data size of each of the equivalence classes, a number of data elements in each of the equivalence classes, a time for which each of the equivalence classes will be live, and a total number of lines of machine code for which each of the equivalence classes will be live. 10. The method of claim 1 , wherein analyzing the graph comprises determining which of the data elements will conflict with one another when the executable program is run; and wherein satisfying the first constraint includes allocating the data elements to memory such that any of the data elements that are determined by analyzing the graph to conflict with one another are allocated to different ones of the memory banks. 11. The method of claim 1 , wherein analyzing the graph comprises determining from the graph which of the data elements will be live at the same time when the executable program is run, wherein each data element is live between being written to memory and being used by the program; and wherein satisfying the first constraint includes allocating the data elements to memory such that any of the data elements that are determined by the analysis of the graph to be live at the same time when the executable program is run are allocated to different ones of the memory banks. 12. The method of claim 1 , wherein the first metric comprises variable size, and wherein the second metric comprises liveness time. 13. The method of claim 1 , wherein the first metric comprises liveness time, and wherein the second metric comprises variable size. 14. A non-transitory machine readable medium having stored thereon instructions for performing a method comprising machine executable code which when executed by at least one computer, causes the computer to: compile a graph into an executable program having machine code instructions, wherein the graph comprises a plurality of data nodes representing data elements, a plurality of compute vertices and a plurality of directional edges, further wherein the executable program has a multi-access instruction performing at least one item selected from the list consisting of: a plurality of load operations, a plurality of store operations, and a load and a store operation; wherein the machine executable code for compiling the graph causes the computer to: assign instances of the multi-access instructions to implement the edges, and analyze the graph, and in dependence upon a result of analyzing the graph, allocate the data elements to memory addresses within different memory banks, wherein allocating the data elements applies a first constraint that no two edges can access a same memory bank at a same time; wherein allocating the data elements comprises: grouping the data elements into equivalence classes, wherein a first equivalence class includes a first set of the data elements that interfere with a second set of other ones of the data elements of a second equivalence class; generating ordered equivalence classes at least in part by ordering the equivalence classes according to a metric; and stepping through the data elements within the ordered equivalence classes to allocate each of the data elements in turn, wherein the metric comprises a first metric, and wherein the machine executable code further causes the computer to: order the data elements within each equivalence class according to a second metric that is different from the first metric; and for each equivalence class, step through the ordered data elements wi

Assignees

Graphcore Ltd

Inventors

Classifications

G06F8/41Primary
Compilation · CPC title
G06F8/453Primary
Data distribution · CPC title
G06F8/451
Code distribution (considering CPU load at run-time G06F9/505; load rebalancing G06F9/5083) · CPC title
G06F8/433
Dependency analysis; Data or control flow analysis · CPC title
G06F8/4441
Reducing the execution time required by the program code · CPC title

Patent family

Related publications grouped by family.

View patent family 66443021

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11720332B2 cover?: A method for generating an executable program to run on one or more processor modules. The method comprises: receiving a graph comprising a plurality of data nodes, compute vertices and edges; and compiling the graph into an executable program including one or more types of multi-access instruction each of which performs at least two memory access (load and/or store) operations in a single inst…
Who is the assignee on this patent?: Graphcore Ltd
What technology area does this patent fall under?: Primary CPC classification G06F8/41. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Aug 08 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

How to read this patent

Abstract

First claim

Assignees

Inventors

Classifications

Patent family

External sources

Related patents

Partitioning sparse matrices based on sparse matrix representations for crossbar-based architectures

Microprocessor system and method therefor

Method of communicating data packets within data communication systems

Processors Supporting Endian Agnostic SIMD Instructions and Methods

Collaborative filtering in directed graph

Scalable software stack

Database storage reclaiming program

Frequently asked questions