Matrix normal/transpose read and a reconfigurable data processor including same

US10768899B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10768899-B2
Application number: US-201916260548-A
Country: US
Kind code: B2
Filing date: Jan 29, 2019
Priority date: Jan 29, 2019
Publication date: Sep 8, 2020
Grant date: Sep 8, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A configurable circuit configurable according to the data width of elements of a matrix is described that includes a memory array, logic to write a matrix to the memory array having elements with a data width which can be specified using configuration data, logic for a transpose read of the matrix as-written and logic for normal read of the matrix as-written. The memory array includes first and second read ports operable in parallel. Transpose read logic and normal read logic can be coupled to the first and second read ports, respectively, allowing transpose and normal read of a matrix simultaneously.
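To make the two read orders concrete, here is a minimal software sketch, not the patented circuit. It assumes the memory array can be modeled as a list of rows. The names `DualReadMemory`, `normal_read`, `transpose_read`, and `read_both` are hypothetical, and stepping two iterators in lockstep only stands in for the two read ports operating in parallel.

```python
from typing import Iterator, List, Tuple

Matrix = List[List[int]]


class DualReadMemory:
    """Toy model of a memory array with a normal and a transpose read path."""

    def __init__(self, data_width_bits: int) -> None:
        # Stand-in for the configuration data that sets the data width D.
        self.data_width_bits = data_width_bits
        self.rows: Matrix = []

    def write(self, matrix: Matrix) -> None:
        """Store the matrix after checking each element fits in D bits."""
        limit = 1 << self.data_width_bits
        for row in matrix:
            if any(not 0 <= value < limit for value in row):
                raise ValueError("element does not fit in the configured data width")
        self.rows = [list(row) for row in matrix]

    def normal_read(self) -> Iterator[List[int]]:
        """Yield the stored matrix one row vector at a time."""
        for row in self.rows:
            yield list(row)

    def transpose_read(self) -> Iterator[List[int]]:
        """Yield the stored matrix one column vector at a time."""
        for column in zip(*self.rows):
            yield list(column)


def read_both(mem: DualReadMemory) -> List[Tuple[List[int], List[int]]]:
    """Step both read paths together, one vector from each per step.

    For a non-square matrix, zip stops at the shorter of the two sequences.
    """
    return list(zip(mem.normal_read(), mem.transpose_read()))
```

For a square matrix, each step of `read_both` pairs row i of the matrix with column i, which mirrors the abstract's description of normal and transpose vectors being produced at the same time.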

First claim

Opening claim text (preview).

The invention claimed is:

1. A configurable circuit, comprising: a memory array; logic to write a matrix to the memory array, the matrix having elements with a data width having a number D of bits of data; transpose read logic configurable according to the data width, to output vectors of a transpose read of the matrix; and normal read logic to output vectors of a normal read of the matrix; wherein the transpose read logic and the normal read logic are operable on the memory array to output in parallel respective vectors in transposed and normal orders.

2. The circuit of claim 1, wherein the memory array includes a first read port and a second read port, and the normal read logic is operably coupled to the first read port and the transpose read logic is operably coupled to the second read port.

3. A configurable circuit, comprising: a memory array; logic to write a matrix to the memory array, the matrix having elements with a data width having a number D of bits of data; transpose read logic configurable according to the data width, to output vectors of a transpose read of the matrix; and normal read logic to output vectors of a normal read of the matrix; wherein the memory array includes a plurality of slots, where slots in the plurality of slots include have a slot width (number of columns) equal to a multiple M greater than 1 of the data width.

4. The circuit of claim 3, wherein the logic to write comprises logic to organize sets of M rows of the matrix in the memory into a plurality of rows of atoms of M by M elements, atoms in a row of atoms being stored in respective slots in the plurality of slots, and rotated in position in the row of atoms relative to an input matrix as a function of a row number of the row of atoms in the plurality of rows of atoms.

5. The circuit of claim 4, wherein the transpose read logic includes logic to select atoms in the slots, and store the selected atoms in a reshape circuit, and logic to transpose the selected atoms to form the output vectors of the transpose read of the matrix.

6. The circuit of claim 5, wherein the reshape circuit includes a FIFO buffer having a depth at least as high as a maximum of M according to the data width of the elements of the matrix, and the logic to transpose the atoms includes a multiplexer tree configurable according to the data width.

7. The circuit of claim 5, including logic to operate the reshape circuit using double buffering.

8. The circuit of claim 5, wherein the transpose read logic is configurable for a selected one of a plurality of data types, and in which the data width differs for different data types in the plurality of data types.

9. The circuit of claim 3, wherein the transpose read logic is configurable for a selected one of a plurality of data types, and the data widths differ for different data types in the plurality of data types, and the slot width is at least two times a maximum of the data width of data types in the plurality of data types.

10. A reconfigurable data processor, comprising: an array of configurable units; and a bus system connected to the array of configurable units which communicates data at a bus clock rate; wherein a configurable unit in the array of configurable units includes: a memory array, logic to write a matrix to the memory array at the bus clock rate, the matrix having elements with a data width having a number D of bits of data; and transpose read logic configurable according to the data width, to output vectors of a transpose read of the matrix at the bus clock rate, wherein the transpose read logic and the normal read logic are operable on the memory array to output in parallel respective vectors in transposed and normal orders.

11. The reconfigurable data processor of claim 10, including normal read logic configurable according to the data width, to output vectors of a normal read of the matrix at the bus clock rate, and wherein the memory array includes a first read port and a second read port, and the normal read port is operably coupled to the first read port, and the transpose read logic is operably coupled to the second read port.

12. A reconfigurable data processor, comprising: an array of configurable units; and a bus system connected to the array of configurable units which communicates data at a bus clock rate; wherein a configurable unit in the array of configurable units includes: a memory array, logic to write a matrix to the memory array at the bus clock rate, the matrix having elements with a data width having a number D of bits of data; and transpose read logic configurable according to the data width, to output vectors of a transpose read of the matrix at the bus clock rate; wherein the memory array includes a plurality of slots, where slots in the plurality of slots include having a slot width (number of columns) equal to a multiple M greater than 1 of the data width.

13. The reconfigurable data processor of claim 12, wherein the transpose read logic includes write logic to organize sets of M rows of the matrix in the memory into a plurality of rows of atoms of M by M elements, atoms in a row of atoms being stored in respective slots in the plurality of slots, and rotated in position in the row of atoms relative to an input matrix as a function of a row number of the row of atoms in the plurality of rows of atoms.

14. The reconfigurable data processor of claim 13, wherein the transpose read logic includes logic to select atoms in the slots, and store the selected atoms in a reshape circuit, the reshape circuit including circuits to transpose the selected atoms to form the output vectors of the transpose read of the matrix.

15. The reconfigurable data processor of claim 14, wherein the reshape circuit includes a FIFO buffer having a depth at least as high as a maximum of M according to the data type of the elements of the matrix, and the circuit to transpose the atoms comprises a multiplexer tree configurable according to the data width.

16. The reconfigurable data processor of claim 14, including logic to operate the reshape circuit using double buffering.

17. The reconfigurable data processor of claim 14, wherein the transpose read logic is configurable for a selected one of a plurality of data types, and in which the data width differs for different data types in the plurality of data types.

18. The reconfigurable data processor of claim 12, wherein the transpose read logic is configurable for a selected one of a plurality of data types, and in which the data width differs for different data types in the plurality of data types, and the slot width is at least two times a maximum of the data width of data types in the plurality of data types.

19. A memory circuit, comprising: a memory array; write logic to write a matrix to the memory array, the matrix having elements with a data width having a number D of bits of data, wherein the memory array includes a plurality of slots readable in parallel on different rows, where slots in the plurality of slots have a slot width equal to a multiple M of the data width, with logic to organize, when M is greater than 1, sets of M rows of the matrix in the memory array into a plurality of rows of atoms of M by M elements, so that atoms in a row of atoms are stored in respective slots in the plurality of slots, and rotated in position in the row of atoms relative to an input matrix as a function of a row number of the row of atoms in the plurality of rows of atoms; and transpose read logic to output vectors of a transpose read
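The slot and atom language in claims 3 to 5 and 19 can be illustrated with a small software sketch. This is one possible reading under stated assumptions, not the patented implementation: the rotation rule "atom from column c of atom row r goes to slot (c + r) mod S" is an assumption (the claims only say the rotation is a function of the atom row number), the matrix is assumed square with a side that is a multiple of M, and all function names are hypothetical.

```python
from typing import List

Matrix = List[List[int]]
Atom = List[List[int]]  # an M-by-M block of elements


def elements_per_slot(slot_width_bits: int, data_width_bits: int) -> int:
    """Return M, the number of D-bit elements that fit across one slot."""
    if slot_width_bits % data_width_bits:
        raise ValueError("slot width must be a multiple of the data width")
    return slot_width_bits // data_width_bits


def write_rotated(matrix: Matrix, m: int) -> List[List[Atom]]:
    """Split the matrix into M-by-M atoms and rotate each atom row.

    Assumed rotation: the atom from atom-column c of atom row r is stored
    in slot (c + r) mod S, where S is the number of slots.
    """
    n = len(matrix)
    if n % m or any(len(row) != n for row in matrix):
        raise ValueError("sketch expects a square matrix with side a multiple of M")
    s = n // m
    stored = []
    for r in range(s):
        atoms = [[matrix[r * m + i][c * m:(c + 1) * m] for i in range(m)]
                 for c in range(s)]
        # Slot k of atom row r holds the atom from atom-column (k - r) mod S.
        stored.append([atoms[(k - r) % s] for k in range(s)])
    return stored


def normal_read(stored: List[List[Atom]], m: int) -> Matrix:
    """Undo the rotation and return the rows of the original matrix."""
    s = len(stored)
    return [[v for c in range(s) for v in stored[r][(c + r) % s][i]]
            for r in range(s) for i in range(m)]


def transpose_read(stored: List[List[Atom]], m: int) -> Matrix:
    """Return the columns of the original matrix as output vectors."""
    s = len(stored)
    out = []
    for c in range(s):
        # One atom per atom row; the rotation puts each in a different slot,
        # so a memory with per-slot row addressing could fetch them together.
        picked = [stored[r][(c + r) % s] for r in range(s)]
        # Transpose each selected atom and concatenate matching atom columns.
        for j in range(m):
            out.append([atom[i][j] for atom in picked for i in range(m)])
    return out
```

As an example of the width relationship, a hypothetical 64-bit slot holding 16-bit elements gives `elements_per_slot(64, 16) == 4`, so atoms would be 4 by 4. The point of the rotation is that the atoms needed for one block of output columns never share a slot, which is what lets them be selected from different rows in a single access.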

Assignees

Sambanova Systems Inc

Classifications

  • Read-write [R-W] circuits · CPC title

  • Neural networks · CPC title

  • having a sequence of storage locations, the intermediate ones not being accessible for either enqueue or dequeue operations, e.g. using a shift register {(G06F5/065 takes precedence; shift registers per se G11C19/00)} · CPC title

  • G06F7/78 · Primary

    for changing the order of data flow, e.g. matrix transposition or LIFO buffers; Overflow or underflow handling therefor · CPC title

  • with multidimensional access, e.g. row/column, matrix · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10768899B2 cover?
A configurable circuit configurable according to the data width of elements of a matrix is described that includes a memory array, logic to write a matrix to the memory array having elements with a data width which can be specified using configuration data, logic for a transpose read of the matrix as-written and logic for normal read of the matrix as-written. The memory array includes first and…
Who is the assignee on this patent?
Sambanova Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F7/78. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 8, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).