Dimension shuffling using matrix processors

US10949496B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10949496-B2
Application number  US-201615395906-A
Country             US
Kind code           B2
Filing date         Dec 30, 2016
Priority date       Dec 30, 2016
Publication date    Mar 16, 2021
Grant date          Mar 16, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one embodiment, a matrix operation may be performed to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory. Data associated with the input matrix may be accessed using one or more strided memory operations, wherein the one or more strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a particular interval. The data accessed using the one or more strided memory operations may be stored in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.
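As a rough illustration of the abstract, the sketch below models two-dimensional memory as a list of rows and a strided memory operation as a read of rows that sit a fixed interval apart. Everything named here (`strided_read`, `shuffle_ab_c_to_ba_c`, `shuffle_ab_c_to_bc_a`, and the sizes `A`, `B`, `C`) is a hypothetical choice made for this example and does not come from the patent text; it is a software analogy, not the hardware method the patent describes.

```python
def strided_read(memory, offset, stride, count, transpose=False):
    """Read `count` rows of `memory`, starting at row `offset` and
    stepping `stride` rows each time (hypothetical helper).

    The block is returned as read (non-transpose form) or with rows
    and columns swapped (transpose form).
    """
    block = [list(memory[offset + i * stride]) for i in range(count)]
    if transpose:
        block = [list(column) for column in zip(*block)]
    return block


def shuffle_ab_c_to_ba_c(memory, A, B):
    """Reorder rows indexed (a, b) into rows indexed (b, a).

    Assumes the input holds A*B rows with row index a*B + b.
    """
    result = []
    for b in range(B):
        # One strided read per offset b, stored in the order accessed.
        result.extend(strided_read(memory, offset=b, stride=B, count=A))
    return result


def shuffle_ab_c_to_bc_a(memory, A, B):
    """Same reads as above, but each block is stored in transpose form,
    giving rows indexed (b, c) and columns indexed a."""
    result = []
    for b in range(B):
        result.extend(
            strided_read(memory, offset=b, stride=B, count=A, transpose=True)
        )
    return result


# Each element records its own logical coordinates so the reordering is visible.
A, B, C = 2, 3, 2
memory = [[(a, b, c) for c in range(C)] for a in range(A) for b in range(B)]

ba_c = shuffle_ab_c_to_ba_c(memory, A, B)
bc_a = shuffle_ab_c_to_bc_a(memory, A, B)
```

In this sketch the element with logical coordinates `(a, b, c)` ends up at `ba_c[b * A + a][c]` in the first result and at `bc_a[b * C + c][a]` in the second, so the same set of strided reads yields two different dimension orders depending only on whether each block is stored in non-transpose or transpose form.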

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: a memory element comprising two-dimensional memory; and a processor to perform a matrix operation to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory, wherein the processor is configured to: access data associated with the input matrix using a plurality of strided memory operations, wherein the plurality of strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a particular interval, and wherein the plurality of strided memory operations comprise: at least one non-transpose convolutional read, wherein data accessed using the at least one non-transpose convolutional read is to be stored in non-transpose form; and at least one transpose convolutional read, wherein data accessed using the at least one transpose convolutional read is to be stored in transpose form; and store the data accessed using the plurality of strided memory operations in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.

2. The apparatus of claim 1, wherein the matrix operation comprises a dimension shuffle operation to reorder the plurality of dimensions of the input matrix.

3. The apparatus of claim 1, wherein the plurality of strided memory operations are configured to access the two-dimensional memory at a plurality of rows that are separated by the particular interval, wherein the particular interval comprises a particular number of rows.

4. The apparatus of claim 1, wherein the plurality of strided memory operations are configured to access the two-dimensional memory at a plurality of columns that are separated by the particular interval, wherein the particular interval comprises a particular number of columns.

5. The apparatus of claim 1: wherein the plurality of strided memory operations are each configured to begin accessing the two-dimensional memory at a particular offset; and wherein the processor is further configured to store the data in the result matrix based on an order in which the data is accessed using the plurality of strided memory operations.

6. The apparatus of claim 1: wherein the non-transpose form comprises a same form in which the data is accessed in the two-dimensional memory; and wherein the transpose form comprises a form in which the data accessed in the two-dimensional memory is transposed.

7. The apparatus of claim 1, wherein the matrix operation is associated with a convolution operation in a neural network.

8. The apparatus of claim 1, wherein the matrix operation is associated with a backward propagation operation in a neural network.

9. A method performed by a matrix processor, comprising: performing a matrix operation to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory circuitry, wherein performing the matrix operation comprises: accessing data associated with the input matrix using a plurality of strided memory operations, wherein the plurality of strided memory operations are configured to access the two-dimensional memory circuitry at a plurality of locations that are separated by a particular interval, and wherein the plurality of strided memory operations comprise: at least one non-transpose convolutional read, wherein data accessed using the at least one non-transpose convolutional read is to be stored in non-transpose form; and at least one transpose convolutional read, wherein data accessed using the at least one transpose convolutional read is to be stored in transpose form; and storing the data accessed using the plurality of strided memory operations in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.

10. The method of claim 9, wherein the matrix operation comprises a dimension shuffle operation to reorder the plurality of dimensions of the input matrix.

11. The method of claim 9, wherein the plurality of strided memory operations are configured to access the two-dimensional memory circuitry at a plurality of rows that are separated by the particular interval, wherein the particular interval comprises a particular number of rows.

12. The method of claim 9, wherein the plurality of strided memory operations are configured to access the two-dimensional memory circuitry at a plurality of columns that are separated by the particular interval, wherein the particular interval comprises a particular number of columns.

13. The method of claim 9: wherein the plurality of strided memory operations each begin accessing the two-dimensional memory circuitry at a particular offset; and wherein the data is stored in the result matrix based on an order in which the data is accessed using the plurality of strided memory operations.

14. A system, comprising: a memory element comprising two-dimensional memory; a plurality of processing elements, comprising: a host processor; one or more matrix processing chips; a plurality of matrix processors associated with the one or more matrix processing chips; wherein a matrix processor of the plurality of matrix processors is to perform a matrix operation to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory, wherein the matrix processor is configured to: access data associated with the input matrix using a plurality of strided memory operations, wherein the plurality of strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a particular interval, and wherein the plurality of strided memory operations comprise: at least one non-transpose convolutional read, wherein data accessed using the at least one non-transpose convolutional read is to be stored in non-transpose form; and at least one transpose convolutional read, wherein data accessed using the at least one transpose convolutional read is to be stored in transpose form; and store the data accessed using the plurality of strided memory operations in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.

15. At least one non-transitory machine accessible storage medium having instructions stored thereon, the instructions, when executed on a machine, cause the machine to: perform a matrix operation to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory, wherein the instructions that cause the machine to perform the matrix operation further cause the machine to: access data associated with the input matrix using a plurality of strided memory operations, wherein the plurality of strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a particular interval, and wherein the plurality of strided memory operations comprise: at least one non-transpose convolutional read, wherein data accessed using the at least one non-transpose convolutional read is to be stored in non-transpose form; and at least one transpose convolutional read, wherein data accessed using the at least one transpose convolutional read is to be stored in transpose form; and store the data accessed using the plurality of strided memory operations in a result matrix, wherein the data accessed using each strided memory operation is stored in the result matrix in non-transpose form or transpose form.
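Claims 4 and 12 describe strided access over columns rather than rows, and claim 6 distinguishes the two storage forms. A minimal sketch of that variant follows, again modelling two-dimensional memory as a list of rows. The names `strided_column_read`, `shuffle_a_bc_to_a_cb`, and `shuffle_a_bc_to_cb_a` and the sizes `A`, `B`, `C` are assumptions made for illustration only; the claims do not specify any software interface.

```python
def strided_column_read(memory, offset, stride, count, transpose=False):
    """Read `count` columns of `memory`, starting at column `offset` and
    stepping `stride` columns each time (hypothetical helper).

    Non-transpose form keeps the layout as accessed (one output row per
    memory row); transpose form swaps rows and columns.
    """
    columns = [offset + i * stride for i in range(count)]
    block = [[row[j] for j in columns] for row in memory]
    if transpose:
        block = [list(column) for column in zip(*block)]
    return block


def shuffle_a_bc_to_a_cb(memory, B, C):
    """Reorder columns indexed (b, c) into columns indexed (c, b).

    Assumes each row holds B*C entries with column index b*C + c.
    Blocks are stored side by side in the order they are accessed.
    """
    result = [[] for _ in memory]
    for c in range(C):
        block = strided_column_read(memory, offset=c, stride=C, count=B)
        for out_row, block_row in zip(result, block):
            out_row.extend(block_row)
    return result


def shuffle_a_bc_to_cb_a(memory, B, C):
    """Same column reads, each stored in transpose form, giving rows
    indexed (c, b) and columns indexed a."""
    result = []
    for c in range(C):
        result.extend(
            strided_column_read(memory, offset=c, stride=C, count=B, transpose=True)
        )
    return result


# Each element records its own logical coordinates.
A, B, C = 2, 3, 2
memory = [[(a, b, c) for b in range(B) for c in range(C)] for a in range(A)]

a_cb = shuffle_a_bc_to_a_cb(memory, B, C)
cb_a = shuffle_a_bc_to_cb_a(memory, B, C)
```

Here the offset selects which `c` a read starts from and the interval `C` is a number of columns, so element `(a, b, c)` lands at `a_cb[a][c * B + b]` in non-transpose form and at `cb_a[c * B + b][a]` in transpose form.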

Assignees

  • Intel Corp

Inventors

Classifications

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor · CPC title

  • LOAD or STORE instructions; Clear instruction · CPC title

  • G06F17/16 · Primary

    Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10949496B2 cover?
In one embodiment, a matrix operation may be performed to reorder a plurality of dimensions of an input matrix stored in two-dimensional memory. Data associated with the input matrix may be accessed using one or more strided memory operations, wherein the one or more strided memory operations are configured to access the two-dimensional memory at a plurality of locations that are separated by a…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30032. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 16, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).