What technology area does this patent fall under?

Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?

Publication date April 4, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Specializing neural networks for heterogeneous systems

US11620516B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11620516-B2
Application number	US-201916724849-A
Country	US
Kind code	B2
Filing date	Dec 23, 2019
Priority date	Dec 23, 2019
Publication date	Apr 4, 2023
Grant date	Apr 4, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure advantageously provides a heterogenous system, and a method for generating an artificial neural network (ANN) for a heterogenous system. The heterogenous system includes a plurality of processing units coupled to a memory configured to store an input volume. The plurality of processing units includes first and second processing units. The first processing unit includes a first processor and is configured to execute a first ANN, and the second processing unit includes a second processor and is configured to execute a second ANN. The first and second ANNs respectively include an input layer, at least one processor-optimized hidden layer and an output layer. The second ANN hidden layers are different than the first ANN hidden layers.
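The "processor-optimized hidden layers" in the abstract are made more concrete in claims 1 to 3, which describe each CNN by its mix of small (3×3 or smaller) and large (5×5 or larger) convolution kernels. The following is a minimal sketch of that idea, not the patent's implementation. The `AnnSpec` type, the helper names, and all kernel counts are hypothetical and chosen only for illustration; the ordering check is a stricter reading than the claim's "first CNN or the second CNN" wording.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AnnSpec:
    """Hypothetical summary of one processor-specialized CNN."""
    processor: str      # label only; claim 2 names CPU, GPU and NPU
    small_kernels: int  # number of kernels of size 3x3 or smaller
    large_kernels: int  # number of kernels of size 5x5 or larger


def kernel_class(size: int) -> Optional[str]:
    """Classify a square kernel by side length, following claim 3.

    Sizes between the two thresholds (i.e. 4x4) are not named in the
    claim, so this sketch leaves them unclassified.
    """
    if size <= 3:
        return "small"
    if size >= 5:
        return "large"
    return None


def follows_claimed_ordering(specs: List[AnnSpec]) -> bool:
    """True if each later CNN has fewer small and more large kernels
    than the CNN listed immediately before it."""
    return all(
        b.small_kernels < a.small_kernels and b.large_kernels > a.large_kernels
        for a, b in zip(specs, specs[1:])
    )


# Illustrative counts only; the patent text on this page gives no numbers.
specs = [
    AnnSpec("CPU", small_kernels=12, large_kernels=2),
    AnnSpec("GPU", small_kernels=8, large_kernels=6),
    AnnSpec("NPU", small_kernels=4, large_kernels=10),
]

print(follows_claimed_ordering(specs))
```

Listing the specs in first, second, third order keeps the check a simple pairwise comparison; a real design flow would derive the counts from the trained networks rather than hard-code them.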

First claim

Opening claim text (preview).

What is claimed is:

1. A heterogenous system, comprising: a memory configured to store an input volume having an input width, an input height, an input depth and a plurality of input values, the input depth being determined by a number of input channels; and a plurality of processing units, coupled to the memory, including: a first processing unit, including at least one first processor, configured to execute a first artificial neural network (ANN) including an input layer configured to receive at least a first portion of the input volume, one or more first ANN hidden layers optimized for the first processor, and an output layer; and a second processing unit, including at least one second processor that is different than the first processor, configured to execute a second ANN including an input layer configured to receive at least a second portion of the input volume, one or more second ANN hidden layers optimized for the second processor, and an output layer, the second ANN hidden layers being different than the first ANN hidden layers, where the first ANN output layer generates a first set of normalized probability values or a first set of values, the second ANN output layer generates a second set of normalized probability values or a second set of values, and where the first processing unit is configured to ensemble average the first and second sets of normalized probability values, using respective first and second weights, into a final set of normalized probability values or the first processing unit is configured to concatenate the first and second sets of values into a set of probability values and convert the set of probability values into a set of normalized probability values; and a third processing unit, having at least one third processor that is different than the first processor and the second processor, configured to execute a third ANN including an input layer to receive at least a third portion of the input volume, one or more third ANN hidden layers optimized for the third processor, and an output layer, the third ANN hidden layers being different than the first ANN hidden layers and the second ANN hidden layers, where the first ANN is a first convolutional neural network (CNN) that includes convolutional layers having small and large kernels, activation layers, pooling layers, and fully-connected layers; the second ANN is a second CNN that includes convolutional layers having small and large kernels, activation layers, pooling layers, and fully connected layers, the second CNN convolutional layers having fewer small kernels and more large kernels than the first CNN; and the third ANN is a third CNN that includes convolutional layers having small and large kernels, activation layers, pooling layers, and fully connected layers, the third CNN convolutional layers having fewer small kernels and more large kernels than the first CNN or the second CNN.

2. The heterogenous system of claim 1, where: the first processing unit is a central processing unit (CPU), the second processing unit is a graphics processing unit (GPU), and the third processing unit is a neural processing unit (NPU).

3. The heterogenous system of claim 1, where: the small kernel is convolution filter having a size of 3×3 or smaller; and the large kernel is convolution filter having a size of 5×5 or larger.

4. The heterogenous system of claim 1, where the first processing unit is configured to execute a facial recognition application, the input volume is an image of a face, and the first, second and third ANNs extract facial features from the image.

5. A heterogenous system, comprising: a memory configured to store an input volume having an input width, an input height, an input depth and a plurality of input values, the input depth being determined by a number of input channels; and a plurality of processing units, coupled to the memory, including: a first processing unit, including at least one first processor, configured to execute a first artificial neural network (ANN) including an input layer configured to receive at least a first portion of the input volume, one or more first ANN hidden layers optimized for the first processor, and an output layer; and a second processing unit, including at least one second processor that is different than the first processor, configured to execute a second ANN including an input layer configured to receive at least a second portion of the input volume, one or more second ANN hidden layers optimized for the second processor, and an output layer, the second ANN hidden layers being different than the first ANN hidden layers, the first ANN output layer generates a first set of normalized probability values or a first set of values, the second ANN output layer generates a second set of normalized probability values or a second set of values, and where the first processing unit is configured to ensemble average the first and second sets of normalized probability values, using respective first and second weights, into a final set of normalized probability values or the first processing unit is configured to concatenate the first and second sets of values into a set of probability values and convert the set of probability values into a set of normalized probability values; a third processing unit, having at least one third processor that is different than the first processor and the second processor, configured to execute a third ANN including an input layer to receive at least a third portion of the input volume, one or more third ANN hidden layers optimized for the third processor, and an output layer, the third ANN hidden layers being different than the first ANN hidden layers and the second ANN hidden layers, where: the first ANN output layer generates the first set of normalized probability values, the second ANN output layer generates the second set of normalized probability values, and the third ANN output layer generates a third set of normalized probability values; and the first processing unit is configured to ensemble average the first, second and third sets of normalized probability values, using respective first, second and third weights, into a final set of normalized probability values.

6. The heterogenous system of claim 5, where the first, second and third weights are 1.

7. The heterogenous system of claim 5, where the first weight is based on a number of floating point operations per second (FLOPS) for the first processor, the second weight is based on a number of FLOPS for the second processor, and the third weight is based on a number of FLOPS for the third processor.

8. A heterogenous system, comprising: a memory configured to store an input volume having an input width, an input height, an input depth and a plurality of input values, the input depth being determined by a number of input channels; and a plurality of processing units, coupled to the memory, including: a first processing unit, including at least one first processor, configured to execute a first artificial neural network (ANN) including an input layer configured to receive at least a first portion of the input volume, one or more first ANN hidden layers optimized for the first processor, and an output layer; and a second processing unit, including at least one second processor that is different than the first processor, configured to execute a second ANN including an input layer configured to receive at least a second portion of the input volume, one or more second ANN hidden layers optimized for the second processor, and an output layer, the second ANN hidden layers being different than the first ANN hidden layers, the first ANN output layer generates a first set of normalized probability values or a first set of values, the second ANN …
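The claims describe two ways the first processing unit can combine the per-processor outputs: a weighted ensemble average of normalized probability sets (claims 1 and 5, with weights of 1 in claim 6 or FLOPS-based weights in claim 7), or concatenation of raw value sets followed by conversion to normalized probabilities. The sketch below illustrates both under stated assumptions and is not the patented method. Dividing by the sum of the weights, making FLOPS-based weights proportional to each processor's FLOPS, and using softmax as the conversion step are all assumptions; the claim text shown here does not fix those details, and every function name is hypothetical.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Convert raw values into probabilities that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_average(prob_sets: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Weighted average of several normalized probability sets.

    Assumption: the result is divided by the sum of the weights so that
    the final set is still normalized.
    """
    if len(prob_sets) != len(weights):
        raise ValueError("need exactly one weight per probability set")
    total_weight = sum(weights)
    n_classes = len(prob_sets[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_sets)) / total_weight
        for i in range(n_classes)
    ]


def flops_weights(flops: Sequence[float]) -> List[float]:
    """Assumed weighting scheme: each weight is proportional to FLOPS."""
    total = sum(flops)
    return [f / total for f in flops]


def concatenate_and_normalize(first: Sequence[float],
                              second: Sequence[float]) -> List[float]:
    """Concatenate two raw value sets, then normalize (softmax is assumed)."""
    return softmax(list(first) + list(second))


# Equal weights of 1, as in claim 6 (probability values are made up).
cpu_probs = [0.7, 0.3]
gpu_probs = [0.5, 0.5]
print(ensemble_average([cpu_probs, gpu_probs], [1, 1]))

# FLOPS-based weights, as in claim 7 (FLOPS figures are made up).
print(ensemble_average([cpu_probs, gpu_probs], flops_weights([1e9, 3e9])))
```

With equal weights the result is a plain mean of the probability sets; with FLOPS-based weights the output of the faster processor contributes more to the final set.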

Assignees

Advanced Risc Mach Ltd

Inventors

Classifications

G06N3/0985
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
G06N3/082
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/09
Supervised learning · CPC title
G06N3/092
Reinforcement learning · CPC title

Patent family

Related publications grouped by family.

View patent family 76438196

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11620516B2 cover?: The present disclosure advantageously provides a heterogenous system, and a method for generating an artificial neural network (ANN) for a heterogenous system. The heterogenous system includes a plurality of processing units coupled to a memory configured to store an input volume. The plurality of processing units includes first and second processing units. The first processing unit includes a …
Who is the assignee on this patent?: Advanced Risc Mach Ltd

Related patents

Speech detection and speech recognition

Pipelining to improve neural network inference accuracy

Optimizing inference for deep-learning neural networks in a heterogeneous system

Concurrent training of functional subnetworks of a neural network

Real-time resource usage reduction in artificial neural networks

Data processing performance enhancement for neural networks using a virtualized data iterator

Unit sourcing
