Multi-agent instruction execution engine for neural inference processing

US12406174B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12406174-B2
Application number: US-201816161867-A
Country: US
Kind code: B2
Filing date: Oct 16, 2018
Priority date: Oct 16, 2018
Publication date: Sep 2, 2025
Grant date: Sep 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Multi-agent instruction execution engines for neural inference processing are provided. In various embodiments, a neural core is provided. The neural core includes an instruction memory. The instruction memory comprises a plurality of instruction streams, each instruction stream associated with one of a plurality of agents. The instruction memory further comprises a plurality of shared functional units. The neural core is adapted to concurrently execute the plurality of instruction streams on the plurality of associated agents. The execution includes maintaining a separate program counter for each of the plurality of agents, determining a plurality of operations from the instructions of each instruction stream, and directing the operations to the shared functional units. The instructions of each instruction stream are statically scheduled prior to runtime to ensure their execution is conflict free.
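
The execution model in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the `Agent` class, the `run` function, the unit names, and the use of `None` as a no-operation slot are all hypothetical names chosen for illustration. Each agent keeps its own program counter, one instruction is read per agent per cycle, and the resulting operations are directed to shared functional units. Because the streams are assumed to be scheduled before runtime, a conflict is treated as an error rather than arbitrated.

```python
from dataclasses import dataclass

NOP = None  # hypothetical encoding of a no-operation slot


@dataclass
class Agent:
    name: str
    stream: list  # instruction stream: one unit name (or NOP) per cycle
    pc: int = 0   # each agent maintains its own program counter


def run(agents, units):
    """Step all agents in lockstep and return, per cycle, a dict unit -> agent name."""
    trace = []
    while any(a.pc < len(a.stream) for a in agents):
        busy = {}  # shared functional units claimed in this cycle
        for a in agents:
            if a.pc >= len(a.stream):
                continue  # this agent's stream has finished
            op = a.stream[a.pc]  # "decode": here an instruction simply names a unit
            a.pc += 1            # the agent advances its own program counter
            if op is NOP:
                continue         # a NOP occupies the cycle without touching a unit
            if op not in units:
                raise ValueError(f"unknown functional unit: {op}")
            if op in busy:
                # A statically scheduled program should never reach this branch.
                raise RuntimeError(f"conflict on {op}: {busy[op]} and {a.name}")
            busy[op] = a.name
        trace.append(busy)
    return trace


# Two streams that were arranged so they never need the same unit in the same cycle.
agents = [
    Agent("A", ["alu", "comm", NOP]),
    Agent("B", ["addr", "alu", "comm"]),
]
trace = run(agents, {"alu", "comm", "addr"})
for cycle, busy in enumerate(trace):
    print(cycle, busy)
```

In this sketch the runtime loop contains no arbitration logic at all; correctness depends entirely on the streams having been made conflict free beforehand, which is the property the abstract attributes to static scheduling.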

First claim

Opening claim text (preview).

What is claimed is:
1. A neural core comprising: an instruction memory, the instruction memory comprising a plurality of instruction streams, each instruction stream associated with one of a plurality of agents, each agent of the plurality of agents comprising a control engine and a program counter such that a first agent comprises a first control engine and a first program counter, each control engine of the plurality of control engines performing independent control operations, each program counter comprising a register; and a plurality of shared functional units, wherein the neural core is adapted to concurrently execute the plurality of instruction streams on the plurality of associated agents, wherein the execution comprises: modifying, by the first control engine, the first program counter, determining a plurality of operations from the instructions of each instruction stream, and directing the operations and at least one no-operation instruction to the shared functional units according to a prior modeling of access to the shared functional units and an offline state of each of the separate program counters, the operations being directed from the instruction memory and the at least one no-operation instruction delaying one or more of the plurality of agents to avoid simultaneous agent access to the shared functional units.
2. The neural core of claim 1, wherein the shared functional units comprise arithmetic, communication, address, and/or computation units.
3. The neural core of claim 1, wherein the each of the plurality of operations control one of the shared functional units.
4. The neural core of claim 1, wherein each of the plurality of instruction streams is statically scheduled.
5. The neural core of claim 4, wherein the static schedule is conflict free.
6. The neural core of claim 5, wherein the static schedule requires that no two operations access the same shared functional unit simultaneously.
7. The neural core of claim 1, wherein the plurality of operations are directed to the shared functional units at runtime.
8. The neural core of claim 7, wherein the plurality of operations are directed to the shared functional units within a sequence of time windows.
9. The neural core of claim 7, wherein directing the plurality of operations to the shared functional units comprises merging operations from each of the plurality of instruction streams.
10. The neural core of claim 9, wherein merging operations comprises detecting conflicts between operations directed to the same shared functional unit.
11. The neural core of claim 1, wherein determining the plurality of operations comprises decoding instructions of each instruction stream.
12. The neural core of claim 1, adapted to map the plurality of operations to any of the shared functional units.
13. The neural core of claim 1, wherein the instruction memory is logically segmented.
14. The neural core of claim 1, wherein the execution is divided into a plurality of cycles.
15. The neural core of claim 1, further comprising a plurality of parallel data paths, each comprising a subset of the plurality of shared functional units.
16. The neural core of claim 1, wherein the plurality of agents execute synchronously.
17. The neural core of claim 16, wherein synchronous execution is provided via a synchronization signal.
18. The neural core of claim 1, wherein the independent control operations comprise updating one or more loop counter and/or sequence counter.
19. A method comprising: reading a plurality of instruction streams from an instruction memory of a neural core, each instruction stream associated with one of a plurality of agents, each agent of the plurality of agents comprising a control engine and a program counter such that a first agent comprises a first control engine and a first program counter, each control engine of the plurality of control engines performing independent control operations, each program counter comprising a register; concurrently executing the plurality of agents by the neural core; modifying, by the first control engine, the first program counter; determining a plurality of operations from the instructions of each instruction stream; and directing the operations and at least one no-operation instruction to shared functional units of the neural core according to a prior modeling of access to the shared functional units and to an offline state of each of the separate program counters, the operations being directed from the instruction memory and the at least one no-operation instruction delaying one or more of the plurality of agents to avoid simultaneous agent access to the shared functional units.
20. The method of claim 19, further comprising: computing by the neural core a portion of a neural network layer.
21. A method comprising: executing a plurality of instruction streams, each by one of a plurality of agents, each agent of the plurality of agents comprising a control engine and a program counter such that a first agent comprises a first control engine and a first program counter, each control engine of the plurality of control engines performing independent control operations, each program counter comprising a register; and modifying, by the first control engine, the first program counter, wherein a plurality of shared functional units is controlled by the plurality of instruction streams, the plurality of the shared functional units performing an inference operation, and wherein a prior modeling of access to the shared functional units and offline states of the plurality of program counters are used to avoid simultaneous agent access to the shared functional units by delaying one or more of the plurality of agents using at least one no-operation instruction.
22. The method of claim 21, wherein the inference operations comprise computation, communication, or memory addressing operations.
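
The independent claims describe delaying agents with no-operation instructions, based on a prior (offline) model of access to the shared functional units. One way to picture that step is the sketch below. It is an assumption made for illustration, not the procedure disclosed in the patent: the function name `schedule`, the greedy fixed-priority policy, and the use of `None` as the no-operation marker are all hypothetical. The scheduler walks modelled program counters cycle by cycle and pads a stream with a NOP whenever the unit it needs has already been claimed in that cycle.

```python
def schedule(streams):
    """Pad instruction streams with NOPs (None) so no unit is used twice in a cycle.

    streams maps an agent name to its list of unit names in program order.
    Agents earlier in the dict win ties (a simplifying, assumed priority rule).
    """
    nop = None
    padded = {name: [] for name in streams}
    pos = {name: 0 for name in streams}  # modelled (offline) program counters
    while any(pos[n] < len(streams[n]) for n in streams):
        claimed = set()  # units already assigned in the cycle being modelled
        for name, ops in streams.items():
            if pos[name] >= len(ops):
                continue  # agent has finished; nothing more to emit
            unit = ops[pos[name]]
            if unit in claimed:
                padded[name].append(nop)  # delay this agent by one cycle
            else:
                claimed.add(unit)
                padded[name].append(unit)
                pos[name] += 1
    return padded


# Both agents want the "alu" first, so agent B is delayed until it is free.
result = schedule({"A": ["alu", "alu", "comm"], "B": ["alu", "comm"]})
print(result)
```

The padded streams produced this way can then be executed in lockstep without runtime arbitration, since every potential collision was resolved before execution. A production scheduler would likely weigh priorities and latencies differently; the greedy rule here is only the simplest policy that yields a conflict-free result.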

Assignees

  • IBM

Inventors

Classifications

  • Feedforward networks · CPC title

  • G06N3/063 · Primary

    using electronic means · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • Architecture, e.g. interconnection topology · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12406174B2 cover?
Multi-agent instruction execution engines for neural inference processing are provided. In various embodiments, a neural core is provided. The neural core includes an instruction memory. The instruction memory comprises a plurality of instruction streams, each instruction stream associated with one of a plurality of agents. The instruction memory further comprises a plurality of shared function…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 2, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).