Reconfigurable interconnect

US11227086B2 · US · B2

Patent metadata

Field               Value
Publication number  US-11227086-B2
Application number  US-202015931445-A
Country             US
Kind code           B2
Filing date         May 13, 2020
Priority date       Jan 4, 2017
Publication date    Jan 18, 2022
Grant date          Jan 18, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system on a chip (SoC) includes a plurality of processing cores and a stream switch coupled to two or more of the plurality of processing cores. The stream switch includes a plurality of N multibit input ports, wherein N is a first integer, a plurality of M multibit output ports, wherein M is a second integer, and a plurality of M multibit stream links dedicated to respective output ports of the plurality of M multibit output ports. The M multibit stream links are reconfigurably coupleable at run time to a selectable number of the N multibit input ports, wherein the selectable number is an integer between zero and N.
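As a rough illustration of the run-time reconfigurability the abstract describes, the toy model below keeps one selection register per output port. It is a sketch under stated assumptions, not the patented circuit: the class and method names (`StreamSwitch`, `couple`, `route`) are hypothetical, and it follows the narrower reading in claim 15, where each output is coupled to at most one input at any given time, even though the abstract allows a stream link to be coupleable to anywhere from zero to N input ports.

```python
from typing import List, Optional


class StreamSwitch:
    """Toy model of an N-input, M-output stream switch (hypothetical names)."""

    def __init__(self, n_inputs: int, m_outputs: int) -> None:
        self.n_inputs = n_inputs
        # One selection register per output port; None means "not coupled".
        self.select: List[Optional[int]] = [None] * m_outputs

    def couple(self, output_port: int, input_port: Optional[int]) -> None:
        """Reconfigure the stream link dedicated to one output port."""
        if input_port is not None and not 0 <= input_port < self.n_inputs:
            raise ValueError("no such input port")
        self.select[output_port] = input_port

    def route(self, inputs: List[int]) -> List[Optional[int]]:
        """Forward one word per input port to every output coupled to it."""
        return [None if src is None else inputs[src] for src in self.select]


switch = StreamSwitch(n_inputs=3, m_outputs=4)
switch.couple(0, 2)  # output 0 listens to input 2
switch.couple(1, 2)  # fan-out: output 1 also listens to input 2
switch.couple(3, 0)  # output 3 listens to input 0; output 2 stays idle
print(switch.route([10, 11, 12]))  # [12, 12, None, 10]
```

Keeping the selection state on the output side mirrors the abstract's statement that each stream link is dedicated to one output port: an input can fan out to several outputs simply because several registers name it.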

First claim

Opening claim text (preview).

The invention claimed is:

1. A system on a chip (SoC), comprising: a stream switch coupled to two or more of the plurality of processing cores, the stream switch including: a plurality of multibit input ports; a plurality of multibit output ports; and a plurality of multibit stream links dedicated to respective output ports of the plurality of multibit output ports, the multibit stream links being reconfigurably coupleable at run time to a selectable number of the multibit input ports, wherein a multibit stream link of the plurality of multibit stream links includes: stream switch configuration logic arranged to direct a reconfigurable coupling of the dedicated output port according to control register information stored in at least one control register associated with the stream switch; and a plurality of convolution accelerators, each one of the plurality of convolution accelerators being configurable at run time to unidirectionally receive input data via at least two of the plurality of multibit output ports and to unidirectionally communicate output data via one of the plurality of multibit input ports of the stream switch.

2. The system on a chip of claim 1, wherein a multibit stream link of the plurality of multibit stream links includes: a plurality of data lines arranged to pass streaming data only in a first direction, wherein the first direction is from an input port coupled to the multibit stream link towards its dedicated output port; and a plurality of control lines arranged to pass control data only in the first direction.

3. The system on a chip of claim 1, wherein each one of the plurality of convolution accelerators includes: a kernel buffer; a feature line buffer; and a multiply-accumulate (MAC) unit module having a plurality of MAC units arranged to multiply data passed from the kernel buffer with data passed from the feature line buffer, the plurality of MAC units further arranged to accumulate products of the multiplication.

4. The system on a chip of claim 1, further comprising: a first input bus coupling the kernel buffer to a first one of the at least two of the plurality of multibit output ports; and a second input bus coupling the feature line buffer to a second one of the at least two of the plurality of multibit output ports.

5. The system on a chip of claim 4, wherein each one of the plurality of convolution accelerators includes: an adder tree module arranged to receive and sum data received from the MAC unit module.

6. The system on a chip of claim 5, further comprising a third input bus coupling the adder tree module to a third one of the at least two of the plurality of multibit output ports, wherein a first convolutional accelerator of the plurality of convolution accelerators is configured to produce intermediate data and the third input bus is configured to pass the intermediate data into the adder tree module of a second convolution accelerator of the plurality of convolution accelerators.

7. The system on a chip of claim 1, further comprising: a plurality of direct memory access (DMA) engines, each of the DMA engines being configurable at run time to autonomously communicate data into the stream switch or out from the stream switch.

8. The system on a chip of claim 7, further comprising: a memory device arranged to store kernel data and feature data, wherein selected ones of the plurality of DMA engines are configured to communicate the kernel data and the feature data between the memory and at least one of the plurality of convolution accelerators.

9. A method, the method comprising: configuring at run time a stream switch having a plurality of input ports, a plurality of output ports, and a plurality of stream links available to couple each of the plurality of input ports to any selected one or more of the plurality of output ports, the configuring at run time including: selecting a first input port of the stream switch from the plurality of input ports; selecting a first output port of the stream switch from the plurality of output ports; communicatively coupling the first input port of the stream switch to the selected first output port of the stream switch via a first stream link of the stream switch; passing streaming feature data from a streaming data source device through the first input port, the first stream link, and the first output port to a hardware-based first convolution accelerator; performing convolution operations using at least a portion of the streaming feature data in the first convolution accelerator.

10. The method of claim 9, further comprising: further configuring the stream switch at run time, the further configuring including: selecting second, third, and fourth input ports of the stream switch; selecting second, third, and fourth output ports of the stream switch; and communicatively coupling, respectively, the second, third, and fourth input ports of the stream switch to the second, third, and fourth output ports of the stream switch via second, third, and fourth stream links of the stream switch; communicatively coupling a kernel data source to the second input port of the stream switch; and communicatively coupling an intermediate data source to the third input port of the stream switch; and communicatively coupling an output of the convolution accelerator to the fourth input port of the stream switch; and unidirectionally passing convolution output data through the fourth input port of the stream switch.

11. The method of claim 10, wherein the intermediate data source is an output of a hardware-based second convolution accelerator.

12. The method of claim 9, comprising: reconfiguring the stream switch according to a control message passed into the stream switch via a second input port.

13. The method of claim 9, comprising: monitoring a first back pressure signal passed back through the first output port and a second back pressure signal passed back through a second output port; in response to the monitoring, reducing the rate of streaming data flow passing through the first input port when either the first back pressure signal or the second back pressure signal is asserted.

14. The method of claim 9, comprising: merging data streams by automatically switching a reconfigurably coupled output port between two different stream switch input ports according to a fixed pattern.

15. A system on a chip (SoC), comprising: a stream switch including: a plurality of input ports; a plurality of output ports; and a plurality of selection circuits, each selection circuit being coupled to a corresponding one of the plurality of output ports, each selection circuit further coupled to all of the plurality of input ports such that each selection circuit is arranged to reconfigurably couple its corresponding output port to no more than one input port at any given time a plurality of convolution accelerators, wherein a first one of the convolution accelerators is operable, during run time and based on information received from one or more of a kernel buffer and a feature line buffer during run time and processed by the first convolution accelerator, to selectively reconfigure the stream switch from a first configuration, in which a first selection circuit of the plurality of selection circuits electrically connects a first input port of the plurality of input ports to a first output port of the plurality of output ports, to a second configuration in which the first selection circuit electrically connects a second input port of the plurality of input ports to the first output port.
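Two of the method claims describe behaviour that is easy to picture in software. The sketch below is illustrative only and uses hypothetical helper names (`source_may_send`, `merge_fixed_pattern`) that do not appear in the patent. For claim 13 it assumes the simplest form of rate reduction, a full stall while any coupled output asserts back pressure. For claim 14 it assumes the fixed pattern is a repeating list of input-port indices and that merging stops once the selected stream has no more data.

```python
from itertools import cycle
from typing import Iterable, Iterator, List, Sequence


def source_may_send(back_pressure: Sequence[bool]) -> bool:
    """Claim 13 sketch: hold the source while any output asserts back pressure."""
    return not any(back_pressure)


def merge_fixed_pattern(
    streams: List[Iterable[int]], pattern: Sequence[int]
) -> Iterator[int]:
    """Claim 14 sketch: one output switches between inputs in a repeating pattern."""
    iterators = [iter(s) for s in streams]
    for src in cycle(pattern):
        try:
            yield next(iterators[src])
        except StopIteration:
            return  # assumption: stop when the selected stream runs dry


# One input fans out to two outputs; either one can stall the source.
print(source_may_send([False, False]))  # True
print(source_may_send([False, True]))   # False

# Take two words from input 0, then one word from input 1, and repeat.
merged = list(merge_fixed_pattern([[1, 2, 3, 4], [10, 20]], pattern=[0, 0, 1]))
print(merged)  # [1, 2, 10, 3, 4, 20]
```

Stalling on the logical OR of the back pressure signals matches the claim's "either ... is asserted" wording: when one input feeds two outputs, the slower consumer sets the pace for both.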

Assignees

St Microelectronics Srl, St Microelectronics Int Nv

Inventors

Classifications

  • Probabilistic or stochastic networks · CPC title

  • Combinations of networks · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11227086B2 cover?
A system on a chip (SoC) includes a plurality of processing cores and a stream switch coupled to two or more of the plurality of processing cores. The stream switch includes a plurality of N multibit input ports, wherein N is a first integer, a plurality of M multibit output ports, wherein M is a second integer, and a plurality of M multibit stream links dedicated to respective output ports of …
Who is the assignee on this patent?
St Microelectronics Srl, St Microelectronics Int Nv
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 18, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).