US11182170B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11182170-B2 |
| Application number | US-201915734729-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 6, 2019 |
| Priority date | Jun 8, 2018 |
| Publication date | Nov 23, 2021 |
| Grant date | Nov 23, 2021 |
Official abstract text for this publication.
A processor having a SIMD architecture, including an array of elementary processors, each elementary processor being associated with an elementary memory cell, a central controller connected to the elementary processors by an instruction bus and a status bus. The central controller transmits a sequence of instructions in a loop, each instruction including a calculation flow indicator. Each elementary processor has an instruction filter that makes it possible to reject or take into account an instruction depending on the identifier it contains. This operating mode makes it possible to emulate a MIMD processor on a SIMD architecture.
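The filtering behaviour described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names `Instruction`, `ProcessingElement`, `receive`, and `broadcast` are hypothetical, the opcodes are placeholders, and the identifier table is modelled as a plain Python set.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Instruction:
    flow_id: int   # computational flow identifier carried by the instruction
    opcode: str    # placeholder payload (hypothetical)


@dataclass
class ProcessingElement:
    flow_ids: set                               # identifier table of this element
    fifo: deque = field(default_factory=deque)  # accepted instructions wait here

    def receive(self, instr: Instruction) -> bool:
        """Instruction filter: keep the instruction only if its flow is in the table."""
        if instr.flow_id in self.flow_ids:
            self.fifo.append(instr)
            return True
        return False  # rejected: the flow does not concern this element


def broadcast(elements, program):
    """Model of the instruction bus: every element sees every instruction."""
    for instr in program:
        for pe in elements:
            pe.receive(instr)


program = [
    Instruction(0, "add"),
    Instruction(1, "mul"),
    Instruction(0, "store"),
    Instruction(1, "load"),
]
elements = [ProcessingElement({0}), ProcessingElement({1}), ProcessingElement({0, 1})]
broadcast(elements, program)

# Each element ends up with a different instruction stream from one shared broadcast.
queued = [[i.opcode for i in pe.fifo] for pe in elements]
print(queued)
```

Although all three elements receive the same four instructions, each queues only those belonging to its own flows, which is the sense in which a single broadcast stream can behave like several independent ones.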
Opening claim text (preview).
The invention claimed is:

1. A processor with SIMD architecture comprising a matrix of processing elements, each processing element being associated with a memory cell for storing data to be processed by said processing element, the processor further comprising a central controller, the processing elements being connected to the central controller by a first bus, called an instruction bus, enabling the central controller to transmit instructions in parallel to the processing elements, and by a second bus, called a status bus, enabling the central controller to receive statuses of the various processing elements, wherein: the central controller comprises a memory wherein the tasks to be performed by the various processing elements are stored in the form of a sequence of instructions, the central controller transmitting the sequence of instructions in a loop on the instruction bus, each instruction comprising a computational flow identifier, a computational flow being defined as an ordered list of tasks, each computational flow relating to one or more processing element(s); each processing element comprises an instruction filter and an identifier table, the instruction filter being adapted to extract the computational flow identifier from each instruction received by the processing element and to determine whether the identifier is present in said table, the instruction being stored in a FIFO buffer to be executed by the processing element if yes, and rejected by the processing element if no.

2. The processor with SIMD architecture according to claim 1, wherein the FIFO buffer is unstacked at each instruction executed by said processing element.

3. The processor with SIMD architecture according to claim 2, wherein each instruction of a task has an order number indicating its order of execution in the task, the instruction filter of the processing element comprising a counter which is incremented each time the FIFO buffer is unstacked, an instruction being stored in the FIFO buffer only if its flow identifier is present in the table of the processing element and if its order number is equal to the output value of said counter.

4. The processor with SIMD architecture according to claim 1, wherein the instruction transmission frequency on the instruction bus is substantially higher than the execution frequency of these instructions by the processing elements.

5. The processor with SIMD architecture according to claim 1, wherein each instruction comprises an instruction pointer and that the processing element comprises a microsequencer connected to a storage memory of a microcode library, the microsequencer sequencing the microinstructions of the microcode pointed to by said instruction pointer.

6. The processor with SIMD architecture according to claim 5, wherein each processing element is connected to its neighbours with communication links, a communication link between a first processing element and a second processing element connecting a first transmit register of the first processing element to a second receive register of the second processing element and a second transmit register of the second processing element to a receive register of the first processing element.

7. The processor with SIMD architecture according to claim 6, wherein executing micro-instructions by the first processing element is stopped as long as the first transmit register is not empty.

8. The processor with SIMD architecture according to claim 6, wherein executing micro-instructions by the second processing element is stopped as long as the second receive register is not full.

9. The processor with SIMD architecture according to claim 6, wherein the first processing element having completed the execution of a task informs the central controller of it by notification of its status and the second processing element is informed of this status by the central controller.

10. A smart optical sensor comprising a matrix of elementary sensors and a processor with SIMD architecture according to claim 1, each processing element being associated with a plurality of sensors of said matrix and being adapted to process signals coming from these sensors.

11. The smart optical sensor according to claim 10, wherein each processing element itself has a SIMD architecture.
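The interplay of claims 1 to 4 (looped transmission, order numbers, and a counter incremented whenever the FIFO is unstacked) can be sketched as a short simulation. Several details here are assumptions made for illustration and are not taken from the claims: the FIFO depth of one, the rejection of instructions while the FIFO is full, the single flow per element, the execution period of three bus slots, and all names and opcodes.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import cycle, islice


@dataclass(frozen=True)
class Instruction:
    flow_id: int   # computational flow identifier
    order: int     # order number of the instruction within its task
    opcode: str    # placeholder payload (hypothetical)


@dataclass
class ProcessingElement:
    flow_ids: set
    fifo_depth: int = 1   # assumption: the claims do not fix a depth
    counter: int = 0      # incremented each time the FIFO is unstacked
    fifo: deque = field(default_factory=deque)
    executed: list = field(default_factory=list)

    def receive(self, instr: Instruction) -> bool:
        """Accept only the next expected instruction of a flow this element handles."""
        accept = (
            instr.flow_id in self.flow_ids
            and instr.order == self.counter
            and len(self.fifo) < self.fifo_depth  # assumption: a full FIFO rejects
        )
        if accept:
            self.fifo.append(instr)
        return accept

    def step(self) -> None:
        """Execute one queued instruction, unstacking the FIFO."""
        if self.fifo:
            self.executed.append(self.fifo.popleft().opcode)
            self.counter += 1


loop = [
    Instruction(0, 0, "load"),
    Instruction(1, 0, "mul"),    # belongs to another flow
    Instruction(0, 1, "add"),
    Instruction(0, 2, "store"),
]
pe = ProcessingElement({0})
EXEC_PERIOD = 3  # assumption: one execution per three bus slots (bus faster than PE)

# The controller replays the sequence in a loop; the element picks up each
# instruction on whichever pass it becomes the expected one.
for slot, instr in enumerate(islice(cycle(loop), 16)):
    pe.receive(instr)
    if (slot + 1) % EXEC_PERIOD == 0:
        pe.step()

print(pe.executed)
```

In this model the element misses `add` on the first pass because it is still busy with `load`, then catches it on the next pass, so the task is executed in order even though the element runs slower than the bus. Restarting a task (resetting the counter) is left out of the sketch.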
controlled by a single instruction for multiple data lanes [SIMD] · CPC title
for changing the speed of data flow, i.e. speed regularising {or timing, e.g. delay lines, FIFO buffers; over- or underrun control therefor (G06F7/78 takes precedence)} · CPC title
comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15}) · CPC title
Program or instruction counter, e.g. incrementing · CPC title
single instruction multiple data [SIMD] multiprocessors · CPC title