Processor architecture
US11416165B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11416165-B2 |
| Application number | US-201816160482-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 15, 2018 |
| Priority date | Oct 15, 2018 |
| Publication date | Aug 16, 2022 |
| Grant date | Aug 16, 2022 |
Official abstract text for this publication.
The present disclosure is directed to systems and methods of implementing a neural network using in-memory, bit-serial, mathematical operations performed by a pipelined SRAM architecture (bit-serial PISA) circuitry disposed in on-chip processor memory circuitry. The on-chip processor memory circuitry may include processor last level cache (LLC) circuitry. The bit-serial PISA circuitry is coupled to PISA memory circuitry via a relatively high-bandwidth connection to beneficially facilitate the storage and retrieval of layer weights by the bit-serial PISA circuitry during execution. Direct memory access (DMA) circuitry transfers the neural network model and input data from system memory to the bit-serial PISA memory and also transfers output data from the PISA memory circuitry to system memory circuitry. Thus, the systems and methods described herein beneficially leverage the on-chip processor memory circuitry to perform a relatively large number of vector/tensor calculations without burdening the processor circuitry.
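To make "bit-serial" concrete, the following is a minimal software sketch of a dot product evaluated one activation bit-plane at a time. It is an illustration only, not the circuitry the patent describes: the function name `bit_serial_dot`, the unsigned 8-bit activation width, and the use of plain Python integers for weights are all assumptions made for this example.

```python
def bit_serial_dot(activations, weights, bits=8):
    """Dot product computed one activation bit position per step.

    Hypothetical helper for illustration. Activations are assumed to be
    unsigned integers that fit in `bits` bits; weights may be any integers.
    """
    if len(activations) != len(weights):
        raise ValueError("activations and weights must have the same length")
    for a in activations:
        if not 0 <= a < (1 << bits):
            raise ValueError(f"activation {a} does not fit in {bits} unsigned bits")

    acc = 0
    for b in range(bits):
        # Sum the weights whose activation has bit b set (one "bit-plane").
        plane = sum(w for a, w in zip(activations, weights) if (a >> b) & 1)
        # Weight the partial sum by the significance of bit b.
        acc += plane << b
    return acc


if __name__ == "__main__":
    x = [3, 5, 2]
    w = [4, -1, 7]
    print(bit_serial_dot(x, w))                 # bit-serial result
    print(sum(a * b for a, b in zip(x, w)))     # conventional result for comparison
```

Each loop iteration touches only one bit of every activation, which is the property that lets this style of arithmetic trade latency (one step per bit) for very wide parallelism across array columns.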
Opening claim text (preview).
What is claimed:
1. A system comprising: processor circuitry; on-chip processor memory circuitry that includes a plurality of static random access memory (SRAM) arrays, each of the SRAM arrays including microcontroller circuitry; and neural network control circuitry to: receive instructions that include data representative of a multi-layer neural network model and neural network input data; form serially coupled, bit-serial, pipelined static random access memory architecture (bit-serial PISA) circuitry using at least a portion of the plurality of SRAM arrays, each of the SRAM arrays included in the portion of the plurality of SRAM arrays to determine a single layer of the multi-layer neural network model; cause a transfer of one or more subsets of the instructions, each of the one or more subsets representative of a layer of the multi-layer neural network model, to the microcontroller circuitry in a respective one of the portion of the plurality of SRAM arrays; cause, via one or more high-bandwidth connections, a bidirectional transfer of neural network layer weights between PISA memory circuitry and the portion of the plurality of SRAM arrays included in the serially coupled bit-serial PISA circuitry; cause a transfer of the neural network input data from the PISA memory circuitry to the serially coupled bit-serial PISA circuitry; and cause a transfer of output data from the serially coupled bit-serial PISA circuitry to the PISA memory circuitry.
2. The system of claim 1 wherein each of the plurality of SRAM arrays comprises a SRAM array having integer compute capability (C-SRAM) using bit-serial, in-memory, processing.
3. The system of claim 1 wherein the on-chip processor memory circuitry comprises last level cache (LLC) memory.
4. The system of claim 1 wherein the system comprises a multi-chip module that includes the processor circuitry, the on-chip processor memory circuitry, and the neural network control circuitry.
5. The system of claim 1 wherein the system comprises a central processing unit that includes the processor circuitry and the on-chip processor memory circuitry.
6. A non-transitory machine-readable storage medium having instructions that, when executed by neural network control circuitry, cause the neural network control circuitry to: receive, from communicably coupled processor circuitry, an instruction set architecture (ISA) that includes a multi-layer neural network model and neural network input data; serially couple a plurality of static random access memory (SRAM) arrays included in on-chip processor memory circuitry to provide bit-serial pipelined SRAM architecture (bit-serial PISA) circuitry, each of the plurality of SRAM arrays to determine a single layer of the multi-layer neural network model and including respective microcontroller circuitry; cause a transfer of the ISA representative of each layer of the multi-layer neural network model to the microcontroller circuitry in a respective one of the plurality of SRAM arrays; cause a bidirectional transfer of neural network layer weights between each of the serially connected SRAM arrays forming the bit-serial PISA circuitry and PISA memory circuitry coupled to the bit-serial PISA circuitry via one or more high-bandwidth connections; cause a transfer of the ISA representative of the neural network input data from the PISA memory circuitry to the bit-serial PISA circuitry; cause the bit-serial PISA circuitry to perform bit-serial, in-memory, neural network processing using the plurality of SRAM arrays; and cause a transfer of neural network output data from the bit-serial PISA circuitry to the PISA memory circuitry.
7. The non-transitory machine-readable storage medium of claim 6 wherein the instructions further cause the neural network control circuitry to: cause direct memory access (DMA) control circuitry to transfer the neural network output data from the PISA memory circuitry to system memory circuitry.
8. The non-transitory machine-readable storage medium of claim 6 wherein the instructions that cause the neural network control circuitry to serially couple a plurality of static random access memory (SRAM) arrays included in on-chip processor memory circuitry to provide pipelined SRAM architecture (bit-serial PISA) circuitry further cause the neural network control circuitry to: serially couple a plurality of static random access memory (SRAM) arrays included in last level cache (LLC) circuitry coupled to the processor circuitry to provide the bit-serial PISA circuitry.
9. An in-memory neural network processing system, comprising: means for receiving an instruction set architecture (ISA) from processor circuitry, the ISA including a multi-layer neural network model and neural network input data; means for serially coupling a plurality of static random access memory (SRAM) arrays included in on-chip processor memory circuitry to provide bit-serial pipelined SRAM architecture (bit-serial PISA) circuitry, each of the plurality of SRAM arrays representing a single layer of the multi-layer neural network model and including respective microcontroller circuitry; means for causing a transfer of the ISA representative of each layer of the multi-layer neural network model to the microcontroller circuitry in a respective one of the plurality of SRAM arrays; means for causing a bidirectional transfer of neural network layer weights between each of the serially coupled SRAM arrays forming the bit-serial PISA circuitry and PISA memory circuitry coupled to the bit-serial PISA circuitry via one or more high-bandwidth connections; means for causing a transfer of the ISA representative of the neural network input data from the PISA memory circuitry to the bit-serial PISA circuitry; means for causing the bit-serial PISA circuitry to perform bit-serial, in-memory, neural network processing using the plurality of SRAM arrays; and means for causing a transfer of neural network output data from the bit-serial PISA circuitry to the PISA memory circuitry; wherein the means for receiving, the means for serially coupling, the means for causing a transfer of the ISA representative, the means for causing a bidirectional transfer, the means for causing the bit-serial PISA circuitry to perform and the means for causing a transfer of neural network output data comprise hardware circuitry.
10. The system of claim 9, further comprising: means for causing a direct memory access (DMA) transfer of the neural network output data from the PISA memory circuitry to system memory circuitry.
11. The system of claim 10, further comprising: means for receiving the data representative of the multi-layer neural network model and the neural network input values in a high-level language; and means for compiling the received data representative of the multi-layer neural network model and the neural network input values from the high-level language to the ISA.
12. The system of claim 11 wherein the means for compiling the received data representative of the multi-layer neural network model and the neural network input values from the high-level language to the ISA comprises: means for compiling the received data representative of the multi-layer neural network model and the neural network input values from the high-level language to an intermediate domain specific language (DSL); and means for compiling the received data representative of the multi-layer neural network model and the neural network input values from the DSL to the ISA.
13. The system of claim 9 wherein the means for serially coupling a plurality of static random access memory (SRAM) arrays included in on-chip pro
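The claim preview above stops mid-claim. As a rough mental model of the "one SRAM array per layer, serially coupled" arrangement recited in the independent claims, the sketch below simulates a pipeline in software. Everything in it is a hypothetical stand-in: the `Stage` class, the `run_pipeline` function, the integer matrix-vector layers, and the ReLU activation are assumptions chosen for illustration and are not taken from the patent.

```python
def relu(vec):
    """Element-wise ReLU (an assumed activation, not specified by the source)."""
    return [max(0, x) for x in vec]


def matvec(weights, vec):
    """Integer matrix-vector product; `weights` is a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


class Stage:
    """Hypothetical stand-in for one compute-capable array holding one layer."""

    def __init__(self, weights):
        self.weights = weights
        self.latch = None  # result waiting to be handed to the next stage

    def step(self, incoming):
        """Emit the previous result, then compute on the new input (if any)."""
        outgoing = self.latch
        if incoming is None:
            self.latch = None
        else:
            self.latch = relu(matvec(self.weights, incoming))
        return outgoing


def run_pipeline(layers, inputs):
    """Stream inputs through serially coupled stages, one layer per stage."""
    stages = [Stage(w) for w in layers]
    outputs = []
    # Extra empty ticks at the end drain results still held in the stages.
    feed = list(inputs) + [None] * len(stages)
    for item in feed:
        carry = item
        for stage in stages:
            carry = stage.step(carry)
        if carry is not None:
            outputs.append(carry)
    return outputs


if __name__ == "__main__":
    layers = [[[1, -1], [2, 0]], [[1, 1]]]   # two layers with made-up weights
    print(run_pipeline(layers, [[3, 1], [0, 2]]))
```

Once the pipeline is full, every stage works on a different input during the same tick, so a new result emerges each tick even though a single input needs one tick per layer to pass through.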
Recurrent networks, e.g. Hopfield networks · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Plurality of storage devices · CPC title
using electronic means · CPC title
Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices · CPC title