What technology area does this patent fall under?

Primary CPC classification G06N3/084. Mapped technology areas include Physics.

When was this patent published?

Publication date: Sep 15, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Method and System for Training a Neural Network

US2016267380A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016267380-A1
Application number	US-201514657414-A
Country	US
Kind code	A1
Filing date	Mar 13, 2015
Priority date	Mar 13, 2015
Publication date	Sep 15, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Training a neural network is a time consuming and computationally expensive task. Embodiments provide efficient methods and systems for neural network training. One example embodiment is implemented by a plurality of agents, where each agent performs a pipelined gradient analysis to update respective local models of the neural network using respective subsets of data from a common pool of training data. In turn, a common global model of the neural network is updated based upon the local models.
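The scheme in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the patented method: it assumes a one-dimensional linear model, plain stochastic gradient descent in place of the "pipelined gradient analysis", and simple averaging as the rule for updating the global model from the local models. None of these choices come from the publication, and the names `local_update` and `train` are hypothetical.

```python
import random

def local_update(weights, subset, lr=0.1):
    """Run plain SGD over one agent's subset for the model y = w*x + b.

    Stands in for the per-agent gradient analysis; no pipelining is modeled.
    """
    w, b = weights
    for x, y in subset:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return (w, b)

def train(pool, num_agents=4, rounds=200, seed=0):
    rng = random.Random(seed)
    pool = list(pool)
    global_model = (0.0, 0.0)
    for _ in range(rounds):
        rng.shuffle(pool)
        # Each agent draws its own subset from the common pool of training data.
        subsets = [pool[i::num_agents] for i in range(num_agents)]
        # Each agent starts from the global model and updates a local copy.
        local_models = [local_update(global_model, s) for s in subsets]
        # Assumed merge rule: the global model is the average of the local models.
        global_model = tuple(
            sum(m[k] for m in local_models) / num_agents for k in range(2)
        )
    return global_model

# Noise-free synthetic data generated from y = 2x + 1.
pool = [(x / 10, 2.0 * (x / 10) + 1.0) for x in range(-10, 11)]
w, b = train(pool)
print(f"w={w:.3f}, b={b:.3f}")
```

The agents run sequentially here for simplicity; in a real system each agent would presumably run concurrently on its own hardware.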

First claim

Opening claim text (preview).

What is claimed is:
1. A method of training a neural network, the method comprising: by each agent of a plurality of agents, performing a pipelined gradient analysis to update respective local models of a neural network using respective subsets of data from a common pool of training data; and updating a common global model of the neural network based upon the local models.
2. The method of claim 1 wherein performing the pipelined gradient analysis comprises: splitting the respective local models of the neural network into consecutive chunks; and assigning each chunk to a stage of a pipeline.
3. The method of claim 1 wherein each stage of the pipeline is associated with a graphics processing unit (GPU).
4. The method of claim 1 wherein performing the pipelined gradient analysis further comprises: selecting the subsets of data from the common pool of training data according to a focused-attention back-propagation (FABP) strategy.
5. The method of claim 1 further including an initialization procedure comprising: by a single agent of the plurality of agents: performing the pipelined gradient analysis to update its respective local model of the neural network using a respective subset of data from the common pool of training data; and updating the common global model of the neural network based upon its local model.
6. The method of claim 1 wherein the common global model is owned by a single agent of the plurality of agents at any one time according to a locking mechanism.
7. The method of claim 6 wherein the common global model is updated by the single agent during a period in which the single agent owns the common global model.
8. The method of claim 1 wherein a critical section is reached when an agent of the plurality is ready to update the global model and the agent of the plurality that is ready to update the global model does not own the global model.
9. The method of claim 8 wherein the agent that is ready to update the global model requests the global model.
10. A computer system for training a neural network, the computer system comprising: a processor; and a memory with computer code instructions stored thereon, the processor and the memory, with the computer code instructions being configured to cause the system to: by each agent of a plurality of agents, perform a pipelined gradient analysis to update respective local models of a neural network using respective subsets of data from a common pool of training data; and update a common global model of the neural network based upon the local models.
11. The computer system of claim 10, wherein, in performing the pipelined gradient analysis, the processor and the memory, with the computer code instructions, are further configured to cause the system to: split the respective local models of the neural network into consecutive chunks; and assign each chunk to a stage of a pipeline.
12. The computer system of claim 10 wherein each stage of the pipeline is associated with a graphics processing unit (GPU).
13. The computer system of claim 10, wherein, in performing the pipelined gradient analysis, the processor and the memory, with the computer code instructions, are further configured to cause the system to: select the subsets of data from the common pool of training data according to a focused-attention back-propagation (FABP) strategy.
14. The computer system of claim 10, wherein the processor and the memory, with the computer code instructions, are further configured to implement an initialization procedure that causes the system to: by a single agent of the plurality of agents: perform the pipelined gradient analysis to update its respective local model of the neural network using a respective subset of data from the common pool of training data; and update the common global model of the neural network based upon its local model.
15. The computer system of claim 10 wherein the common global model is owned by a single agent of the plurality of agents at any one time according to a locking mechanism.
16. The computer system of claim 15 wherein the common global model is updated by the single agent during a period in which the single agent owns the common global model.
17. The computer system of claim 10 wherein a critical section is reached when an agent of the plurality is ready to update the global model and the agent of the plurality that is ready to update the global model does not own the global model.
18. The computer system of claim 17 wherein the agent that is ready to update the global model requests the global model.
19. A computer program product for training a neural network, the computer program product comprising: one or more computer-readable tangible storage devices and program instructions stored on at least one of the one or more storage devices, the program instructions, when loaded and executed by a processor, cause an apparatus associated with the processor to: cause each agent of a plurality of agents to perform a pipelined gradient analysis to update respective local models of a neural network using respective subsets of data from a common pool of training data; and update a common global model of the neural network based upon the local models.
20. The computer program product of claim 19 wherein the program instruction further cause the apparatus to cause each agent to perform the pipelined gradient analysis by: splitting the respective local models of the neural network into consecutive chunks; and assigning each chunk to a stage of a pipeline.
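Two mechanisms recited in the claims lend themselves to a short illustration: splitting a model into consecutive chunks assigned to pipeline stages (claims 2, 11, 20) and single-agent ownership of the global model under a locking mechanism (claims 6 to 9). The following is a minimal sketch under stated assumptions: layers are represented by placeholder names, chunks are sized as evenly as possible, ownership is modeled with a `threading.Lock`, and the merge rule (moving the global weights partway toward the local weights) is invented for the example. The names `split_into_chunks`, `GlobalModel`, and `merge` are hypothetical and do not appear in the publication.

```python
import threading

def split_into_chunks(layers, num_stages):
    """Split an ordered list of layers into consecutive, near-equal chunks.

    Chunk i is meant to run on pipeline stage i (for example, one GPU per stage).
    """
    base, extra = divmod(len(layers), num_stages)
    chunks, start = [], 0
    for stage in range(num_stages):
        size = base + (1 if stage < extra else 0)
        chunks.append(layers[start:start + size])
        start += size
    return chunks

class GlobalModel:
    """Global model that at most one agent owns at any one time."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.updates = 0
        self._lock = threading.Lock()

    def merge(self, local_weights, mix=0.5):
        # Acquiring the lock stands in for an agent requesting the global
        # model; the agent owns it until the `with` block exits.
        with self._lock:
            # Assumed merge rule: move global weights partway toward local ones.
            self.weights = [
                (1 - mix) * g + mix * l
                for g, l in zip(self.weights, local_weights)
            ]
            self.updates += 1

# Five placeholder layers spread over three pipeline stages.
stages = split_into_chunks(["in", "h1", "h2", "h3", "out"], num_stages=3)

# Four agents, each ready to update the global model from its local model.
model = GlobalModel([0.0, 0.0])
agents = [
    threading.Thread(target=model.merge, args=([1.0, 1.0],)) for _ in range(4)
]
for t in agents:
    t.start()
for t in agents:
    t.join()
```

Because every update happens while the lock is held, concurrent agents cannot interleave partial writes to the global weights, whichever order they acquire ownership in.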

Assignees

Nuance Communications Inc

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/09
Supervised learning · CPC title
G06N3/0499
Feedforward networks · CPC title
G06N3/098
Distributed learning, e.g. federated learning · CPC title
G06N3/084 (Primary)
Backpropagation, e.g. using gradient descent · CPC title

Patent family

Related publications grouped by family.

View patent family 56888030

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016267380A1 cover?: Training a neural network is a time consuming and computationally expensive task. Embodiments provide efficient methods and systems for neural network training. One example embodiment is implemented by a plurality of agents, where each agent performs a pipelined gradient analysis to update respective local models of the neural network using respective subsets of data from a common pool of traini…
Who is the assignee on this patent?: Nuance Communications Inc