What technology area does this patent fall under?

Primary CPC classification G06F9/5044. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 24, 2020 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Maximizing resource utilization of neural network computing system

US2020301739A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2020301739-A1
Application number	US-201916358547-A
Country	US
Kind code	A1
Filing date	Mar 19, 2019
Priority date	Mar 19, 2019
Publication date	Sep 24, 2020
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to a method for allocating resources of an accelerator to two or more neural networks for execution. The two or more neural networks may include a first neural network and a second neural network. The method comprises analyzing workloads of the first neural network and the second neural network, wherein the first neural network and second neural network each includes multiple computational layers, evaluating computational resources of the accelerator for executing each computational layer of the first and second neural networks, and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
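
The analyzing and evaluating steps in the abstract can be pictured with a small sketch. It is illustrative only and not taken from the patent: it assumes each layer's workload reduces to a single core count, and the names `Layer`, `total_demand`, and `needs_sharing` are hypothetical. It sums the per-layer demand of both networks and compares it with what the accelerator offers, which is the comparison claims 4 and 5 refer to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    cores: int  # hypothetical per-layer demand, in accelerator cores


def total_demand(network):
    """Sum the core demand over every layer of one network."""
    return sum(layer.cores for layer in network)


def needs_sharing(first, second, available_cores):
    """True when both networks cannot hold dedicated cores at the same time."""
    return total_demand(first) + total_demand(second) > available_cores


# Made-up workloads for two small networks.
first = [Layer("a1", 8), Layer("a2", 4), Layer("a3", 6)]
second = [Layer("b1", 3), Layer("b2", 5)]

print(needs_sharing(first, second, available_cores=32))  # False: 26 <= 32
print(needs_sharing(first, second, available_cores=24))  # True: 26 > 24
```

When the check returns True, dedicated allocation is not possible and the scheduling step has to let layers of the two networks share resources.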

First claim

Opening claim text (preview).

1. A method comprising: analyzing workloads of a first neural network and a second neural network, wherein the first neural network and second neural network each includes multiple computational layers; evaluating computational resources of an accelerator for executing each computational layer of the first and second neural networks; and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
2. The method of claim 1, wherein the first neural network has a first pipeline interval, wherein an execution time for the one computational layer of the first neural network is shorter than the first pipeline interval, and wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network during a time period corresponding to a difference between the first pipeline interval and the execution time.
3. The method of claim 1, wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network before executing the one computational layer of the first neural network.
4. The method of claim 1, wherein evaluating computation resources of the accelerator further comprises: comparing a total amount of computational resources for executing the first and second neural networks with a total amount of available computational resources of the accelerator.
5. The method of claim 4, wherein scheduling computational resources of the accelerator is performed when the total amount of computational resources for executing the first and second neural networks is bigger than the total amount of computational resources of the accelerator.
6. The method of claim 1, further comprising: determining a time period that the computational resources assigned for executing the first neural network are not used during execution of the first neural network, wherein the one or more computational layers of the second neural network are executed within the time period.
7. The method of claim 1, wherein the first neural network has a longer pipeline interval than the second neural network.
8. The method of claim 1, wherein the computational resources of the accelerator are scheduled to execute the one computational layer of the first neural network and the one or more computational layers of the second neural network before executing another computational layer subsequent to the one computation layer of the first neural network.
9. An apparatus comprising: a memory storing a set of instructions; and one or more processors configured to execute the set of instructions to cause the apparatus to perform: analyzing workloads of a first neural network and a second neural network, wherein the first neural network and second neural network each includes multiple computational layers; evaluating computational resources of an accelerator for executing each computational layer of the first and second neural networks; and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
10. The apparatus of claim 9, wherein the first neural network has a first pipeline interval, wherein an execution time for the one computational layer of the first neural network is shorter than the first pipeline interval, and wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network during a time period corresponding to a difference between the first pipeline interval and the execution time.
11. The apparatus of claim 9, wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network before executing the one computational layer of the first neural network.
12. The apparatus of claim 9, wherein evaluating computation resources of the accelerator further comprises: comparing a total amount of computational resources for executing the first and second neural networks with a total amount of available computational resources of the accelerator.
13. The apparatus of claim 9, wherein scheduling computational resources of the accelerator is performed when the total amount of computational resources for executing the first and second neural networks is bigger than the total amount of computational resources of the accelerator.
14. The apparatus of claim 9, wherein the one or more processors are configured to execute the set of instructions to cause the apparatus to further perform: determining a time period that the computational resources assigned for executing the first neural network are not used during execution of the first neural network, wherein the one or more computational layers of the second neural network are executed within the time period.
15. The apparatus of claim 9, wherein the first neural network has a longer pipeline interval than the second neural network.
16. A non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method comprising: analyzing workloads of a first neural network and a second neural network, wherein the first neural network and second neural network each includes multiple computational layers; evaluating computational resources of an accelerator for executing each computational layer of the first and second neural networks; and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
17. The computer readable medium of claim 16, wherein the first neural network has a first pipeline interval, wherein an execution time for the one computational layer of the first neural network is shorter than the first pipeline interval, and wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network during a time period corresponding to a difference between the first pipeline interval and the execution time.
18. The computer readable medium of claim 16, wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network before executing the one computational layer of the first neural network.
19. The computer readable medium of claim 16, wherein evaluating computation resources of the accelerator further comprises: comparing a total amount of computational resources for executing the first and second neural networks with a total amount of available computational resources of the accelerator.
20. The computer readable medium of
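
Claim 2 turns on a simple piece of arithmetic: the idle window is the first network's pipeline interval minus the execution time of its current layer, and layers of the second network are placed inside that window. The sketch below is a hedged illustration of that idea, not the patent's implementation. The function name `fill_slack`, the integer microsecond units, and the in-order greedy policy are all assumptions made for the example.

```python
def fill_slack(pipeline_interval, first_layer_time, second_layer_times):
    """Count how many leading layers of the second network fit in the idle window.

    All times are integers in the same unit (assumed microseconds here).
    Returns (number_of_layers_placed, unused_time_left_in_window).
    """
    slack = pipeline_interval - first_layer_time
    if slack <= 0:
        return 0, 0  # no idle window, so nothing from the second network fits

    used = 0
    placed = 0
    # Second-network layers keep their order; stop at the first that overflows.
    for layer_time in second_layer_times:
        if used + layer_time > slack:
            break
        used += layer_time
        placed += 1
    return placed, slack - used


# Interval 100, first-network layer takes 60, so the idle window is 40.
print(fill_slack(100, 60, [15, 20, 10]))  # (2, 5): 15 + 20 fit, 10 would overflow
```

A greedy in-order fill is the simplest policy that respects layer ordering in the second network; a real scheduler would also have to account for the resource comparison in claims 4 and 5.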

Assignees

Alibaba Group Holding Ltd

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/0499
Feedforward networks · CPC title
G06F9/5083
Techniques for rebalancing the load in a distributed system · CPC title
G06N3/063
using electronic means · CPC title
G06F9/5044 · Primary
considering hardware capabilities · CPC title

Patent family

Related publications grouped by family.

View patent family 72514326

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020301739A1 cover?: The present disclosure relates to a method for allocating resources of an accelerator to two or more neural networks for execution. The two or more neural networks may include a first neural network and a second neural network. The method comprises analyzing workloads of the first neural network and the second neural network, wherein the first neural network and second neural network each inclu…
Who is the assignee on this patent?: Alibaba Group Holding Ltd