Neural network optimization mechanism

US12412086B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12412086-B2
Application number  US-202117177632-A
Country             US
Kind code           B2
Filing date         Feb 17, 2021
Priority date       Apr 24, 2017
Publication date    Sep 9, 2025
Grant date          Sep 9, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus to facilitate optimization of a neural network (NN) is disclosed. The apparatus includes optimization logic to define a NN topology having one or more macro layers, adjust the one or more macro layers to adapt to input and output components of the NN and train the NN based on the one or more macro layers.
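To make the "macro layer" idea more concrete, here is a minimal Python sketch. It is an illustration only, not the patented implementation: the `MacroLayer` class, the `adapt` function, and the layer names are all hypothetical. The three fields follow the first claim, which says each macro layer comprises a topology definition, a scale of input features, and a scale of output features. The assumption here is that "adjusting" a macro layer means resizing its input and output to match the neighbouring components.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class MacroLayer:
    """Hypothetical container that treats a group of NN layers as one unit."""
    topology: Tuple[str, ...]  # ordered names of the wrapped layers (illustrative)
    in_features: int           # scale of input features
    out_features: int          # scale of output features


def adapt(layer: MacroLayer, upstream_out: int, downstream_in: int) -> MacroLayer:
    """Return a copy whose input/output scales match the surrounding components.

    Assumption: adaptation only changes the two scales and leaves the
    wrapped topology definition untouched.
    """
    return replace(layer, in_features=upstream_out, out_features=downstream_in)


# A macro layer wrapping three layers, then fitted between a 128-wide
# producer and a 256-wide consumer.
block = MacroLayer(topology=("conv3x3", "relu", "conv3x3"),
                   in_features=64, out_features=64)
adapted = adapt(block, upstream_out=128, downstream_in=256)
```

The dataclass is frozen so that adapting a macro layer yields a new object and the original definition stays available for reuse in other candidate topologies.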

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a graphics processing unit (GPU) comprising hardware circuitry to:

    define a neural network (NN) topology of an NN, the NN topology implemented by the hardware circuitry as having one or more macro layers, wherein the one or more macro layers comprise a first sub-topology of the NN topology having multiple NN layers, wherein each macro layer comprises a topology definition, a scale of input features, and a scale of output features, and wherein each macro layer comprises a set of user interface components;

    adjust the one or more macro layers to adapt to input and output components of the NN;

    wrap the one or more macro layers within a first macro stub layer corresponding to a first topology of the NN, wherein the first macro stub layer comprises additional layers comprising at least one of concatenation layers, elementwise operation layers, or merge layers;

    perform, using processing resources of the hardware circuitry and a first training data set, a first adjustment of weights of the one or more macro layers of the NN corresponding to the first topology of the NN;

    identify one or more other sub-topologies from the first topology of the NN, wherein the one or more other sub-topologies comprises additional macro layers having different combinations of the multiple NN layers, wherein the one or more other sub-topologies are trained concurrently during a first training;

    wrap the additional macro layers of each of the one or more other sub-topologies within additional macro stub layers, wherein the additional macro stub layers comprise the additional layers comprising at least one of the concatenation layers, the elementwise operation layers, or the merge layers;

    perform, a second adjustment of the weights of the additional macro layers wrapped within the additional macro stub layers of the one or more other sub-topologies, wherein the one or more other sub-topologies are trained concurrently during a second training;

    determine error values incurred during the second training of each sub-topology of the one or more other sub-topologies;

    identify a target sub-topology having a lowest error value of the error values; and

    retrain the NN utilizing the target sub-topology as a second topology of the NN to generate an updated NN for use in an inference phase of the NN.

2. The apparatus of claim 1, wherein the one or more macro layers each comprise the first sub-topology including a plurality of the multiple NN layers.

3. The apparatus of claim 2, wherein the GPU is to replace the first topology of the NN with the one or more macro layers, and provide an input features node to record output from the first topology.

4. The apparatus of claim 2, wherein the one or more macro layers comprise a standard set of components to facilitate training.

5. The apparatus of claim 1, wherein the GPU is to optimize the NN by automatically tuning one or more layers in the NN.

6. The apparatus of claim 5, wherein automatically tuning the one or more layers comprises automatically constructing the NN based on received performance and accuracy constraints.

7. The apparatus of claim 5, wherein the GPU is to provide auto-tuning based on one or more statistical algorithms.

8. The apparatus of claim 1, wherein the GPU is to optimize the NN based on the NN topology of the NN.

9. The apparatus of claim 1, wherein the GPU is to perform clustering of processing units to process information relating to modalities.

10. The apparatus of claim 9, further comprising: a first cluster of two or more processing units; a second cluster of two or more processing units; and one or more routers coupled between the first cluster and the second cluster.

11. A method comprising:

    defining a neural network (NN) topology of an NN, the NN topology implemented by hardware circuitry of a graphics processing unit (GPU) as having one or more macro layers, wherein the one or more macro layers comprise a first sub-topology of the NN topology having multiple NN layers, wherein each macro layer comprises a topology definition, a scale of input features, and a scale of output features, and wherein each macro layer comprises a set of user interface components;

    adjusting the one or more macro layers to adapt to input and output components of the NN;

    wrapping the one or more macro layers within a first macro stub layer corresponding to a first topology of the NN, wherein the first macro stub layer comprises additional layers comprising at least one of concatenation layers, elementwise operation layers, or merge layers;

    performing, using processing resources of the hardware circuitry and a first training data set, a first adjustment of weights of the one or more macro layers of the NN corresponding to the first topology of the NN;

    identifying one or more other sub-topologies from the first topology of the NN, wherein the one or more other sub-topologies comprises additional macro layers having different combinations of the multiple NN layers, wherein the one or more other sub-topologies are trained concurrently during a first training;

    wrapping the additional macro layers of each of the one or more other sub-topologies within additional macro stub layers, wherein the additional macro stub layers comprise additional layers comprising at least one of the concatenation layers, the elementwise operation layers, or the merge layers;

    performing a second adjustment of the weights of the additional macro layers wrapped within the additional macro stub layers of the one or more other sub-topologies, wherein the one or more other sub-topologies are trained concurrently during a second training;

    determining error values incurred during the second training of each sub-topology of the one or more other sub-topologies;

    identifying a target sub-topology having a lowest error value of the error values; and

    retraining the NN utilizing the target sub-topology as a second topology of the NN to generate an updated NN for use in an inference phase of the NN.

12. The method of claim 11, wherein the one or more macro layers each comprise the first sub-topology including a plurality of the multiple NN layers.

13. The method of claim 12, further comprising: replacing the first topology of the NN with the one or more macro layers; and providing an input features node to record output from the first topology.

14. The method of claim 13, wherein the one or more macro layers comprise a standard set of components to facilitate training.

15. The method of claim 11, further comprising optimizing the NN by automatically tuning one or more layers in the NN.

16. At least one non-transitory computer readable medium having instructions, which when executed by one or more processors, cause the one or more processors to:

    define a neural network (NN) topology of an NN, the NN topology implemented by hardware circuitry of a graphics processing unit (GPU) comprising the one or more processors as having one or more macro layers, wherein the one or more macro layers comprise a first sub-topology of the NN topology having multiple NN layers, wherein each macro layer comprises a topology definition, a scale of input features, and a scale of output features, and wherein each macro layer comprises a set of user interface components;

    adjust the one or more macro layers to adapt to input and output components of the NN;

    wrap the one or more macro layers within a first macro stub layer corresponding to a first topology of the NN, wherein the first macro stub layer comprises additional layers comprising at least one of concatenatio
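The selection steps recited in claims 1 and 11 (form candidate sub-topologies from different combinations of the layers, train them concurrently, record each error value, and keep the candidate with the lowest error) can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the claimed hardware procedure: `candidate_sub_topologies`, `select_target`, and `toy_error` are hypothetical names, a thread pool stands in for concurrent training on GPU resources, and `toy_error` is a deterministic placeholder for the error a real training run would report.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

SubTopology = Tuple[str, ...]


def candidate_sub_topologies(layers: Sequence[str], size: int) -> List[SubTopology]:
    """Enumerate candidates as different combinations of the given layers."""
    return list(combinations(layers, size))


def select_target(
    candidates: Sequence[SubTopology],
    train_and_score: Callable[[SubTopology], float],
) -> Tuple[SubTopology, Dict[SubTopology, float]]:
    """Score all candidates concurrently and return the lowest-error one."""
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so errors line up with candidates.
        errors = list(pool.map(train_and_score, candidates))
    scored = dict(zip(candidates, errors))
    target = min(scored, key=scored.get)
    return target, scored


def toy_error(sub: SubTopology) -> float:
    """Placeholder for real training: sums made-up per-layer error costs."""
    cost = {"conv": 0.10, "pool": 0.30, "dense": 0.20}
    return round(sum(cost[name] for name in sub), 2)


candidates = candidate_sub_topologies(("conv", "pool", "dense"), size=2)
target, errors = select_target(candidates, toy_error)
# The target would then be used as the new topology for retraining.
```

In this toy setup the three candidates score 0.4, 0.3, and 0.5, so `("conv", "dense")` is selected. A real system would replace `toy_error` with an actual training-and-evaluation run and then retrain the full network on the selected topology.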

Assignees

Intel Corp

Inventors

Classifications

  • G06N3/082 · Primary

    modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • Distributed learning, e.g. federated learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12412086B2 cover?
An apparatus to facilitate optimization of a neural network (NN) is disclosed. The apparatus includes optimization logic to define a NN topology having one or more macro layers, adjust the one or more macro layers to adapt to input and output components of the NN and train the NN based on the one or more macro layers.
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06N3/082. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 9, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).