Max pooling in a matrix processing architecture
US-2018189238-A1 · Jul 5, 2018 · US
US10896039B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10896039-B2 |
| Application number | US-201916264483-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 31, 2019 |
| Priority date | Dec 30, 2016 |
| Publication date | Jan 19, 2021 |
| Grant date | Jan 19, 2021 |
Official abstract text for this publication.
In one embodiment, a matrix operation may be performed on one or more matrix operands. For example, matrix data may be received from a multi-dimensional memory, wherein the matrix data is associated with the one or more matrix operands. The one or more matrix operands may be extracted from the matrix data. A matrix routine associated with the matrix operation may be identified. The matrix routine may be executed on a matrix processor using the one or more matrix operands. A result of the matrix operation may be obtained based on the matrix routine executed by the matrix processor.
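The flow described in the abstract (receive matrix data, extract operands, identify a routine, execute it, obtain the result) can be sketched in software. This is only an illustration under stated assumptions, not the patented hardware: nested Python lists stand in for the multi-dimensional memory, plain functions stand in for the matrix processor, and every name here (`ROUTINES`, `extract_operand`, `run_matrix_operation`, the region tuple layout) is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]
# Hypothetical region format: (row_start, row_stop, col_start, col_stop).
Region = Tuple[int, int, int, int]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, standing in for the matrix processor."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Assumed routine table: maps an operation name to its "matrix routine".
ROUTINES: Dict[str, Callable[[Matrix, Matrix], Matrix]] = {
    "matmul": matmul,
    "add": add,
}


def extract_operand(matrix_data: Matrix, region: Region) -> Matrix:
    """Cut one operand out of a larger block of matrix data."""
    r0, r1, c0, c1 = region
    return [row[c0:c1] for row in matrix_data[r0:r1]]


def run_matrix_operation(op: str, matrix_data: Matrix, regions: List[Region]) -> Matrix:
    operands = [extract_operand(matrix_data, r) for r in regions]  # extract operands
    routine = ROUTINES[op]                                         # identify routine
    return routine(*operands)                                      # execute, get result


# A 4x4 block of "memory" holding the values 0..15 in row-major order.
memory = [[float(4 * r + c) for c in range(4)] for r in range(4)]
# Multiply the top-left 2x2 block by the bottom-right 2x2 block.
result = run_matrix_operation("matmul", memory, [(0, 2, 0, 2), (2, 4, 2, 4)])
print(result)
```

Keeping the routine lookup separate from operand extraction mirrors the abstract's ordering of steps; a real device would perform these stages in hardware rather than in interpreted code.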
Opening claim text (preview).
What is claimed is:
1. A matrix processing circuit, comprising: a programmable matrix routine memory comprising circuitry to store a plurality of programmable matrix routines, wherein each of the plurality of programmable matrix routines comprises instructions for performing a corresponding matrix operation of a plurality of matrix operations, and wherein the programmable matrix routine memory is to be programmed with the plurality of programmable matrix routines that are to be executed to perform the plurality of matrix operations; one or more matrix processing units comprising circuitry to perform matrix computations; and a controller comprising circuitry to: receive a command to perform a particular matrix operation of the plurality of matrix operations; identify a programmable matrix routine corresponding to the particular matrix operation, wherein the programmable matrix routine is identified from the plurality of programmable matrix routines; receive the programmable matrix routine from the programmable matrix routine memory; execute the programmable matrix routine, wherein one or more matrix computations associated with execution of the programmable matrix routine are performed using the one or more matrix processing units; and determine a result of the particular matrix operation, wherein the result is determined based on execution of the programmable matrix routine.
2. The matrix processing circuit of claim 1, wherein the circuitry to execute the programmable matrix routine is to: receive one or more matrix operands associated with the particular matrix operation; and perform, using the one or more matrix processing units, the one or more matrix computations on the one or more matrix operands.
3. The matrix processing circuit of claim 2, wherein: the matrix processing circuit further comprises a plurality of memory resource blocks; and the circuitry to receive the one or more matrix operands associated with the particular matrix operation is further to: receive matrix data from a memory, wherein the matrix data is associated with the one or more matrix operands; extract the one or more matrix operands from the matrix data; and store the one or more matrix operands in one or more of the plurality of memory resource blocks.
4. The matrix processing circuit of claim 3, wherein: the programmable matrix routine comprises a set of instructions for performing the particular matrix operation; and the set of instructions comprises: a first subset of instructions to receive the matrix data from the memory and extract the one or more matrix operands from the matrix data; and a second subset of instructions to perform the one or more matrix computations on the one or more matrix operands.
5. The matrix processing circuit of claim 3, wherein: the one or more matrix processing units comprise a plurality of matrix processing units; and the matrix processing circuit further comprises a matrix processing cluster, wherein the matrix processing cluster comprises the plurality of matrix processing units and the plurality of memory resource blocks.
6. The matrix processing circuit of claim 1, wherein the controller further comprises circuitry to: receive the plurality of programmable matrix routines from a host computing system; and store the plurality of programmable matrix routines in the programmable matrix routine memory.
7. The matrix processing circuit of claim 1, wherein the one or more matrix computations comprise one or more matrix multiplication computations.
8. The matrix processing circuit of claim 1, wherein the one or more matrix computations comprise one or more convolution computations.
9. The matrix processing circuit of claim 1, wherein the particular matrix operation is associated with an operation for an artificial neural network.
10. A system, comprising: a processor to execute an application, wherein execution of the application comprises a plurality of matrix operations; and matrix processing circuitry to perform the plurality of matrix operations, wherein the matrix processing circuitry comprises: a programmable matrix routine memory comprising circuitry to store a plurality of programmable matrix routines, wherein each of the plurality of programmable matrix routines comprises instructions for performing a corresponding matrix operation of the plurality of matrix operations, and wherein the programmable matrix routine memory is to be programmed with the plurality of programmable matrix routines that are to be executed to perform the plurality of matrix operations; one or more matrix processing clusters, wherein each of the one or more matrix processing clusters comprises a plurality of matrix processing units, wherein the plurality of matrix processing units comprise circuitry to perform matrix computations; and a controller comprising circuitry to: receive a command to perform a particular matrix operation of the plurality of matrix operations; identify a programmable matrix routine corresponding to the particular matrix operation, wherein the programmable matrix routine is identified from the plurality of programmable matrix routines; receive the programmable matrix routine from the programmable matrix routine memory; distribute execution of the programmable matrix routine across the one or more matrix processing clusters; and determine a result of the particular matrix operation, wherein the result is determined based on execution of the programmable matrix routine.
11. The system of claim 10, wherein the one or more matrix processing clusters comprise: a plurality of matrix processing clusters; and a multi-dimensional mesh interconnect to communicatively couple the plurality of matrix processing clusters.
12. The system of claim 10, wherein each of the one or more matrix processing clusters further comprises circuitry to: receive one or more matrix operands associated with the particular matrix operation; and perform, using the plurality of matrix processing units, a plurality of matrix computations on the one or more matrix operands.
13. The system of claim 12, wherein: the matrix processing circuitry further comprises one or more memory modules; each of the one or more matrix processing clusters further comprises a plurality of memory resource blocks; and the circuitry to receive the one or more matrix operands associated with the particular matrix operation is further to: receive matrix data from the one or more memory modules, wherein the matrix data is associated with the one or more matrix operands; extract the one or more matrix operands from the matrix data; and store the one or more matrix operands in one or more of the plurality of memory resource blocks.
14. The system of claim 13, wherein: the programmable matrix routine comprises a set of instructions for performing the particular matrix operation; and the set of instructions comprises: a first subset of instructions to receive the matrix data from the one or more memory modules and extract the one or more matrix operands from the matrix data; and a second subset of instructions to perform the plurality of matrix computations on the one or more matrix operands.
15. The system of claim 10, wherein the controller further comprises circuitry to: receive the plurality of programmable matrix routines from the processor; and store the plurality of programmable matrix routines in the programmable matrix routine memory.
16. At least one non-transitory machine accessible storage medium having instructions stored thereon, the inst
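The arrangement recited in claims 1, 6, and 10 (a routine memory programmed by a host, a controller that looks up a routine for a command, and execution distributed across clusters) can be modeled loosely in software. The sketch below is a toy analogy only: the class and function names (`RoutineMemory`, `Controller`, `cluster_matmul`, `matmul_routine`) and the row-wise split across clusters are assumptions made for illustration and are not taken from the patent.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]


class RoutineMemory:
    """Hypothetical stand-in for the programmable matrix routine memory."""

    def __init__(self) -> None:
        self._routines: Dict[str, Callable[..., Matrix]] = {}

    def program(self, name: str, routine: Callable[..., Matrix]) -> None:
        # Models the host storing a routine in the memory (compare claim 6).
        self._routines[name] = routine

    def fetch(self, name: str) -> Callable[..., Matrix]:
        return self._routines[name]


def cluster_matmul(a_rows: Matrix, b: Matrix) -> Matrix:
    """Work done by one simulated cluster: its slice of rows times B."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a_rows]


def matmul_routine(a: Matrix, b: Matrix, num_clusters: int) -> Matrix:
    """Assumed distribution scheme: split the rows of A evenly across clusters."""
    step = max(1, -(-len(a) // num_clusters))  # ceiling division, at least 1
    result: Matrix = []
    for start in range(0, len(a), step):
        result.extend(cluster_matmul(a[start:start + step], b))
    return result


class Controller:
    """Receives a command, identifies the routine, and runs it."""

    def __init__(self, memory: RoutineMemory, num_clusters: int) -> None:
        self.memory = memory
        self.num_clusters = num_clusters

    def execute(self, command: str, a: Matrix, b: Matrix) -> Matrix:
        routine = self.memory.fetch(command)       # identify and receive the routine
        return routine(a, b, self.num_clusters)    # execute and return the result


memory = RoutineMemory()
memory.program("matmul", matmul_routine)
controller = Controller(memory, num_clusters=2)
out = controller.execute("matmul", [[1, 2], [3, 4], [5, 6]], [[1, 0], [0, 1]])
print(out)
```

Because routines live in a table that can be reprogrammed, a new operation could be supported by storing another routine rather than changing the controller, which is the flexibility the claims attribute to a programmable routine memory.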
using electronic means · CPC title
LOAD or STORE instructions; Clear instruction · CPC title
Arithmetic instructions · CPC title
Backpropagation, e.g. using gradient descent · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title