US12411762B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12411762-B2 |
| Application number | US-202318394442-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2023 |
| Priority date | Sep 15, 2017 |
| Publication date | Sep 9, 2025 |
| Grant date | Sep 9, 2025 |
Official abstract text for this publication.
A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows is configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
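The timing idea in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented design: it assumes an instruction advances one tile per cycle along a slice, and the names `instruction_arrival`, `run_slice`, `program`, and `data_schedule` are hypothetical. Operand values carry no tag; a tile selects the operation purely from the cycle at which the value arrives.

```python
from typing import Callable, Dict, List, Tuple

Op = Callable[[int], int]


def instruction_arrival(issue_cycle: int, tile_index: int) -> int:
    """Cycle at which an instruction reaches a tile.

    Assumption for illustration: one tile hop per cycle.
    """
    return issue_cycle + tile_index


def run_slice(
    num_tiles: int,
    program: List[Tuple[int, Op]],
    data_schedule: Dict[Tuple[int, int], int],
) -> Dict[Tuple[int, int], int]:
    """Apply timed instructions to untagged operand data.

    program: (issue_cycle, operation) pairs issued at tile 0.
    data_schedule: maps (cycle, tile) to a raw operand value with no metadata.
    """
    # Each tile knows only which operation it holds at which cycle.
    timetable: Dict[Tuple[int, int], Op] = {}
    for issue_cycle, op in program:
        for tile in range(num_tiles):
            timetable[(instruction_arrival(issue_cycle, tile), tile)] = op

    results: Dict[Tuple[int, int], int] = {}
    for (cycle, tile), value in sorted(data_schedule.items()):
        op = timetable.get((cycle, tile))
        if op is None:
            # Data that misses its time slot cannot be interpreted.
            raise RuntimeError(f"no instruction at tile {tile}, cycle {cycle}")
        results[(cycle, tile)] = op(value)
    return results


def add_one(x: int) -> int:
    return x + 1


def double(x: int) -> int:
    return x * 2


NUM_TILES = 3
program = [(0, add_one), (1, double)]
# Operands are staggered so each one meets its instruction at the right tile.
data_schedule = {
    (0, 0): 10, (1, 1): 20, (2, 2): 30,  # meet add_one
    (1, 0): 1, (2, 1): 2, (3, 2): 3,     # meet double
}
results = run_slice(NUM_TILES, program, data_schedule)
```

In this sketch the scheduler (in practice, presumably a compiler) is responsible for placing each operand at the correct cycle; a value that arrives at a cycle with no matching instruction raises an error rather than being silently misinterpreted.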
Opening claim text (preview).
What is claimed is:
1. A memory system comprising: a set of memory slices, wherein memory slices of the respective set of memory slices comprise respective memory tiles for data storage; and a set of instruction control circuits configured to provide respective read instructions and respective write instructions for respective threads of a plurality of threads to the respective memory tiles, where respective instruction control circuits of the set of instruction control circuits are located at one end of the respective memory slice, the set of instruction control circuits provide instructions to memory tiles of the respective memory slices, the instructions comprise multiple instruction sets; wherein the processor is a tensor streaming processor.
2. The memory system of claim 1, wherein the respective read instructions and the respective write instructions comprise instructions for configuring an address generation table.
3. The memory system of claim 1, wherein the respective read instructions and the respective write instructions comprise instructions for direct references and indirect references.
4. The memory system of claim 1, wherein the respective read instructions and the respective write instructions comprise power management instructions.
5. The memory system of claim 1, wherein the respective memory tiles comprise respective memory chips that are organized into a plurality of words that perform a unit of transfer in the memory system.
6. The memory system of claim 5, wherein words of the plurality of words are configured to store values corresponding to a plurality of lanes of the memory system, and wherein the values are utilized by the memory system to provide data-parallelism to the respective memory tiles.
7. The memory system of claim 5, wherein the respective memory chips are respective static random-access memory (SRAM) chips, and wherein the plurality of words are a plurality of SRAM words.
8. The memory system of claim 5, wherein the unit of transfer is an atomic unit of transfer.
9. The memory system of claim 1, wherein the memory slices of the set of memory slices are divided into the respective threads, and wherein instructions are processed on a per-thread basis.
10. The memory system of claim 9, wherein the memory slices are divided into respective first threads and respective second threads.
11. The memory system of claim 9, wherein a first memory slice occupies a first quantity of first tiles on the memory slice, and a second memory slice occupies a second quantity of second tiles on the memory slice.
12. The memory system of claim 11, wherein the first quantity and the second quantity are different quantities.
13. The memory system of claim 11, wherein the first quantity and the second quantity are a same quantity.
14. The memory system of claim 11, wherein the first tiles are contiguous first tiles, and wherein the second tiles are contiguous second tiles.
15. A processor comprising: a set of memory slices, wherein memory slices of the set of memory slices comprise respective memory tiles for data storage; and a set of instruction control circuits configured to provide respective read instructions and respective write instructions for respective threads of a plurality of threads to the respective memory tiles, respective instruction control circuits of the set of instruction control circuits are located at one end of the respective memory slice; wherein the processor is a tensor streaming processor.
16. The processor of claim 15, wherein the respective read instructions and the respective write instructions comprise instructions selected from a group of instructions consisting of an instruction for configuring an address generation table, an instruction for direct references and indirect references, and power management instructions.
17. The processor of claim 15, wherein the respective memory tiles comprise respective static random-access memory (SRAM) chips that are organized into a plurality of SRAM words that perform an atomic unit of transfer in the processor.
18. The processor of claim 15, wherein the memory slices of the set of memory slices are divided into respective first threads and respective second threads, and wherein instructions are processed on a per-thread basis.
19. The processor of claim 18, wherein a first memory slice occupies a first quantity of first tiles on the memory slice, and a second memory slice occupies a second quantity of second tiles on the memory slice.
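The per-thread arrangement described in claims 9 to 14 can be pictured with a toy model. The sketch below is illustrative only and rests on assumptions the claims do not state: each thread is given a contiguous run of tiles, a written vector is split evenly across that thread's tiles, and `LANES_PER_TILE`, `MemoryTile`, and `MemorySlice` are hypothetical names. It does not model the instruction control circuit's position or any timing.

```python
from dataclasses import dataclass, field
from typing import Dict, List

LANES_PER_TILE = 4  # hypothetical lane count, chosen only for the example


@dataclass
class MemoryTile:
    # address -> one word holding LANES_PER_TILE lane values
    words: Dict[int, List[int]] = field(default_factory=dict)


class MemorySlice:
    """Toy memory slice whose tiles are partitioned among threads."""

    def __init__(self, thread_tile_counts: Dict[str, int]) -> None:
        # Assign each thread a contiguous range of tiles (compare claim 14).
        self.thread_tiles: Dict[str, range] = {}
        start = 0
        for thread, count in thread_tile_counts.items():
            self.thread_tiles[thread] = range(start, start + count)
            start += count
        self.tiles = [MemoryTile() for _ in range(start)]

    def write(self, thread: str, address: int, vector: List[int]) -> None:
        """Per-thread write: each tile stores its own lanes of the vector."""
        tiles = self.thread_tiles[thread]
        if len(vector) != len(tiles) * LANES_PER_TILE:
            raise ValueError("vector length must match the thread's lane count")
        for k, t in enumerate(tiles):
            lo = k * LANES_PER_TILE
            self.tiles[t].words[address] = list(vector[lo:lo + LANES_PER_TILE])

    def read(self, thread: str, address: int) -> List[int]:
        """Per-thread read: gather the word at `address` from each tile."""
        out: List[int] = []
        for t in self.thread_tiles[thread]:
            out.extend(self.tiles[t].words[address])
        return out


# Two threads with different tile quantities (compare claim 12).
mem = MemorySlice({"thread0": 2, "thread1": 1})
mem.write("thread0", 0, list(range(8)))
mem.write("thread1", 0, [100, 101, 102, 103])
```

Because the two threads own disjoint tiles, both can use address 0 without interfering with each other, which is one plausible reading of why instructions are processed on a per-thread basis.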
from multiple instruction streams, e.g. multistreaming · CPC title
controlled by a single instruction for multiple data lanes [SIMD] · CPC title
Synchronisation and timing concerns (synchronisation on a memory bus G06F13/4234) · CPC title
Implementation provisions of instruction buffers, e.g. prefetch buffer; banks · CPC title
Instruction analysis, e.g. decoding, instruction word fields · CPC title