Memory design for a processor

US12411762B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12411762-B2
Application number  US-202318394442-A
Country             US
Kind code           B2
Filing date         Dec 22, 2023
Priority date       Sep 15, 2017
Publication date    Sep 9, 2025
Grant date          Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor are configured to stream operand data across a first dimension, and receive instructions across a second dimension orthogonal to the first dimension. The timing of data and instruction flows are configured such that corresponding data and instructions are received at each tile with a predetermined temporal relationship, allowing operand data to be transmitted between the slices of the processor without any accompanying metadata. Instead, each slice is able to determine what operations to perform on received data based upon the timing at which the data is received.
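The timing-based dispatch described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented design: the `Slice` class, the `simulate` function, the one-position-per-cycle propagation, and the per-slice `schedule` table are all hypothetical names and simplifications chosen for illustration. Operands travel as bare values with no tag, and a slice picks its operation from the cycle number alone.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Slice:
    """One functional slice at a fixed position along the data stream (hypothetical model)."""
    position: int
    # Assumed static schedule: cycle number -> operation to apply on that cycle.
    schedule: Dict[int, Callable[[int], int]] = field(default_factory=dict)
    results: List[int] = field(default_factory=list)

    def tick(self, cycle: int, operand: Optional[int]) -> None:
        # The operand is a bare value; only the cycle number selects the operation.
        op = self.schedule.get(cycle)
        if op is not None and operand is not None:
            self.results.append(op(operand))


def simulate(slices: List[Slice], injections: Dict[int, int], cycles: int) -> None:
    """Shift operands one position per cycle and let each slice act on its schedule."""
    width = max(s.position for s in slices) + 1
    stream: List[Optional[int]] = [None] * width
    for cycle in range(cycles):
        # A value injected at cycle c sits at position p on cycle c + p.
        stream = [injections.get(cycle)] + stream[:-1]
        for s in slices:
            s.tick(cycle, stream[s.position])


def run_demo() -> List[int]:
    # The consumer sits two positions downstream, so its schedule is offset by two cycles.
    consumer = Slice(position=2, schedule={2: lambda x: x * 2, 3: lambda x: x + 1})
    simulate([consumer], injections={0: 5, 1: 7}, cycles=4)
    return consumer.results


print(run_demo())  # [10, 8]
```

The value injected at cycle 0 reaches the consumer at cycle 2 and is doubled; the value injected at cycle 1 arrives at cycle 3 and is incremented. Because the arrival cycle is known in advance, the stream itself needs no accompanying metadata in this model.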

First claim

Opening claim text (preview).

What is claimed is:

1. A memory system comprising: a set of memory slices, wherein memory slices of the respective set of memory slices comprise respective memory tiles for data storage; and a set of instruction control circuits configured to provide respective read instructions and respective write instructions for respective threads of a plurality of threads to the respective memory tiles, where respective instruction control circuits of the set of instruction control circuits are located at one end of the respective memory slice, the set of instruction control circuits provide instructions to memory tiles of the respective memory slices, the instructions comprise multiple instruction sets; wherein the processor is a tensor streaming processor.
2. The memory system of claim 1, wherein the respective read instructions and the respective write instructions comprise instructions for configuring an address generation table.
3. The memory system of claim 1, wherein the respective read instructions and the respective write instructions comprise instructions for direct references and indirect references.
4. The memory system of claim 1, wherein the respective read instructions and the respective write instructions comprise power management instructions.
5. The memory system of claim 1, wherein the respective memory tiles comprise respective memory chips that are organized into a plurality of words that perform a unit of transfer in the memory system.
6. The memory system of claim 5, wherein words of the plurality of words are configured to store values corresponding to a plurality of lanes of the memory system, and wherein the values are utilized by the memory system to provide data-parallelism to the respective memory tiles.
7. The memory system of claim 5, wherein the respective memory chips are respective static random-access memory (SRAM) chips, and wherein the plurality of words are a plurality of SRAM words.
8. The memory system of claim 5, wherein the unit of transfer is an atomic unit of transfer.
9. The memory system of claim 1, wherein the memory slices of the set of memory slices are divided into the respective threads, and wherein instructions are processed on a per-thread basis.
10. The memory system of claim 9, wherein the memory slices are divided into respective first threads and respective second threads.
11. The memory system of claim 9, wherein a first memory slice occupies a first quantity of first tiles on the memory slice, and a second memory slice occupies a second quantity of second tiles on the memory slice.
12. The memory system of claim 11, wherein the first quantity and the second quantity are different quantities.
13. The memory system of claim 11, wherein the first quantity and the second quantity are a same quantity.
14. The memory system of claim 11, wherein the first tiles are contiguous first tiles, and wherein the second tiles are contiguous second tiles.
15. A processor comprising: a set of memory slices, wherein memory slices of the set of memory slices comprise respective memory tiles for data storage; and a set of instruction control circuits configured to provide respective read instructions and respective write instructions for respective threads of a plurality of threads to the respective memory tiles, respective instruction control circuits of the set of instruction control circuits are located at one end of the respective memory slice; wherein the processor is a tensor streaming processor.
16. The processor of claim 15, wherein the respective read instructions and the respective write instructions comprise instructions selected from a group of instructions consisting of an instruction for configuring an address generation table, an instruction for direct references and indirect references, and power management instructions.
17. The processor of claim 15, wherein the respective memory tiles comprise respective static random-access memory (SRAM) chips that are organized into a plurality of SRAM words that perform an atomic unit of transfer in the processor.
18. The processor of claim 15, wherein the memory slices of the set of memory slices are divided into respective first threads and respective second threads, and wherein instructions are processed on a per-thread basis.
19. The processor of claim 18, wherein a first memory slice occupies a first quantity of first tiles on the memory slice, and a second memory slice occupies a second quantity of second tiles on the memory slice.
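
The structure recited in the claims (memory tiles grouped into a slice, an instruction control circuit at one end of the slice, per-thread read and write instructions, and words holding one value per lane) can be sketched as a small data model. This is an illustrative sketch only, not the claimed implementation: `MemoryTile`, `MemorySlice`, `InstructionControl`, the `LANES` value of 4, and the thread names are hypothetical, and the sketch assumes each thread occupies a contiguous run of tiles, as in claim 14.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LANES = 4  # assumed lane count per word; the claims do not give a number

Word = Tuple[int, ...]


@dataclass
class MemoryTile:
    """One tile of storage; each address holds a whole word (one value per lane)."""
    words: Dict[int, Word] = field(default_factory=dict)

    def write(self, address: int, word: Word) -> None:
        if len(word) != LANES:
            raise ValueError("a word carries exactly one value per lane")
        # The whole word is replaced at once, modelling the word as the unit of transfer.
        self.words[address] = tuple(word)

    def read(self, address: int) -> Word:
        return self.words[address]


@dataclass
class MemorySlice:
    tiles: List[MemoryTile]
    # Hypothetical thread map: thread name -> contiguous range of tile indices.
    threads: Dict[str, range]


class InstructionControl:
    """Stands in for the control circuit at one end of a slice (hypothetical model)."""

    def __init__(self, memory_slice: MemorySlice) -> None:
        self.memory_slice = memory_slice

    def write(self, thread: str, address: int, words: List[Word]) -> None:
        tile_ids = self.memory_slice.threads[thread]
        if len(words) != len(tile_ids):
            raise ValueError("expected one word per tile in the thread")
        # The instruction only reaches tiles that belong to the named thread.
        for tile_id, word in zip(tile_ids, words):
            self.memory_slice.tiles[tile_id].write(address, word)

    def read(self, thread: str, address: int) -> List[Word]:
        tile_ids = self.memory_slice.threads[thread]
        return [self.memory_slice.tiles[i].read(address) for i in tile_ids]


def build_demo() -> InstructionControl:
    # Two threads of different sizes on one six-tile slice (compare claims 11 and 12).
    tiles = [MemoryTile() for _ in range(6)]
    threads = {"t0": range(0, 2), "t1": range(2, 6)}
    return InstructionControl(MemorySlice(tiles, threads))
```

For example, `build_demo().write("t0", 0, [(1, 2, 3, 4), (5, 6, 7, 8)])` stores one word in each of the two tiles owned by thread `t0` and leaves the four tiles of `t1` untouched, which is the per-thread processing the claims describe.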

Assignees

Groq Inc

Inventors

Classifications

  • from multiple instruction streams, e.g. multistreaming

  • controlled by a single instruction for multiple data lanes [SIMD]

  • Synchronisation and timing concerns (synchronisation on a memory bus G06F13/4234)

  • Implementation provisions of instruction buffers, e.g. prefetch buffer; banks

  • Instruction analysis, e.g. decoding, instruction word fields

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12411762B2 cover?
A processor having a functional slice architecture is divided into a plurality of functional units (“tiles”) organized into a plurality of slices. Each slice is configured to perform specific functions within the processor, which may include memory slices (MEM) for storing operand data, and arithmetic logic slices for performing operations on received operand data. The tiles of the processor ar…
Who is the assignee on this patent?
Groq Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/0292. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 9, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).