Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 6, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Neural network training performance optimization framework

US2017193361A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2017193361-A1
Application number	US-201514986186-A
Country	US
Kind code	A1
Filing date	Dec 31, 2015
Priority date	Dec 31, 2015
Publication date	Jul 6, 2017
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network training tool selects from a plurality of parallelizing techniques and selects from a plurality of forward-propagation computation techniques. The neural network training tool performs a forward-propagation phase to train a neural network using the selected parallelizing technique and the selected forward-propagation computation technique based on one or more inputs. Additionally, the neural network training tool selects from a plurality of computation techniques and from a plurality of parallelizing techniques for a backward-propagation phase. The neural network training tool performs a backward-propagation phase of training the neural network using the selected backward-propagation parallelizing technique and the selected backward-propagation computation technique to generate error gradients and weight deltas and to update weights associated with one or more layers of the neural network.
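The per-phase selection described in the abstract can be pictured with a small sketch. This is an illustration only, not the patented method: the `LayerProperties` fields, the two cut-off constants, and the rules that map properties to techniques are all hypothetical. Only the technique names come from the claims. The sketch also omits the separate backward-propagation parallelizing choice that the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class LayerProperties:
    """Hypothetical per-layer properties used to drive the selection."""
    feature_maps: int      # number of feature maps in the layer
    error_sparsity: float  # fraction of zero entries in the layer's output errors


# Hypothetical cut-offs, chosen only for illustration (not from the patent).
SMALL_FEATURE_MAPS = 64
HIGH_SPARSITY = 0.75


def select_forward(props, batch_size, num_cores):
    """Pick one parallelizing and one computation technique for the forward phase."""
    # Assumed rule: with at least one input per core, spread inputs across cores.
    parallelizing = (
        "parallel processing" if batch_size >= num_cores else "processing in parallel"
    )
    # Assumed rule: small layers use the stencil form, large ones use matrix multiply.
    computation = (
        "stencil-based computation"
        if props.feature_maps < SMALL_FEATURE_MAPS
        else "matrix multiplication"
    )
    return parallelizing, computation


def select_backward(props):
    """Pick a computation technique for the backward phase."""
    # Assumed rule: exploit sparsity only when most error entries are zero.
    if props.error_sparsity >= HIGH_SPARSITY:
        return "sparse-dense matrix computation"
    return "matrix multiplication"


def plan_training(layers, batch_size, num_cores):
    """Build one selection per layer, so different layers may use different techniques."""
    return [
        {
            "forward": select_forward(props, batch_size, num_cores),
            "backward": select_backward(props),
        }
        for props in layers
    ]


layers = [LayerProperties(32, 0.9), LayerProperties(256, 0.2)]
plan = plan_training(layers, batch_size=64, num_cores=16)
```

Selecting per layer rather than once per network mirrors claim 8, where a first and a second layer may each get their own parallelizing and computation technique.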

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: receiving one or more inputs for training a neural network; selecting a parallelizing technique from a plurality of parallelizing techniques; selecting a forward-propagation computation technique from a plurality of computation techniques; directing the neural network to process the one or more inputs using the selected parallelizing technique and the selected computation technique; and receiving from the neural network, one or more outputs resulting from the neural network processing the one or more inputs.
2. A method as recited in claim 1, wherein the plurality of parallelizing techniques include: parallel processing; and processing in parallel.
3. A method as recited in claim 1, wherein the plurality of computation techniques include: matrix multiplication; and stencil-based computation.
4. A method as recited in claim 1, wherein selecting a parallelizing technique from the plurality of parallelizing techniques is based, at least in part, on properties associated with the neural network.
5. A method as recited in claim 4, wherein the properties associated with the neural network comprise one or more of: a number of layers within the neural network; a number of feature maps associated with individual layers of the neural network; a data sparsity associated with individual layers of the neural network; a size associated with a convolution filter used to process the inputs; or a stride size.
6. A method as recited in claim 1, wherein selecting a computation technique from the plurality of computation techniques is based, at least in part, on properties associated with the neural network.
7. A method as recited in claim 6, wherein the properties associated with the neural network comprise one or more of: a size of the inputs; a number of inputs; a number of feature maps of the inputs; a stride size; or a size associated with a convolution filter that is used to process the inputs.
8. A method as recited in claim 1, wherein: the neural network includes at least a first layer and a second layer; selecting the parallelizing technique comprises: selecting a first parallelizing technique from the plurality of parallelizing techniques to use for the first layer; and selecting a second parallelizing technique from the plurality of parallelizing techniques to use for the second layer; and selecting the computation technique comprises: selecting a first computation technique from the plurality of computation techniques to use for the first layer; and selecting a second computation technique from the plurality of computation techniques to use for the second layer.
9. A method as recited in claim 1, further comprising: determining, based at least in part on the one or more inputs and the one or more outputs, one or more output activation errors; selecting a backward-propagation computation technique from a plurality of backward-propagation computation techniques; and processing the neural network based, at least in part, on the one or more output activation errors, using the selected backward-propagation technique.
10. A method as recited in claim 9, wherein the plurality of backward-propagation computation techniques include: matrix multiplication; and sparse-dense matrix computation.
11. A method as recited in claim 9, wherein processing the neural network based, at least in part, on the one or more output activation errors, includes updating weights associated with one or more layers of the neural network.
12. A method as recited in claim 9, further comprising: selecting a backward-propagation parallelization technique from a plurality of backward-propagation parallelization techniques, wherein processing the neural network based, at least in part, on the one or more output activation errors, using the selected backward-propagation technique, further includes processing the neural network based on the selected backward-propagation parallelization technique.
13. A device comprising: a processor; and a computer-readable medium communicatively coupled to the processor; a parallelizing decision module stored on the computer-readable medium and executable by the processor to select, based at least in part on properties of a neural network, a parallelizing technique from a plurality of parallelizing techniques; a forward propagation decision module stored on the computer-readable medium and executable by the processor to select, based at least in part on properties of the neural network, a computation technique from a plurality of computation techniques; and a forward-propagation processing module configured to: receive one or more inputs for training the neural network; cause the neural network to process, based at least in part on the selected parallelizing technique and the selected computation technique, the one or more inputs; and receive, from the neural network, one or more outputs resulting from the neural network processing the one or more inputs.
14. A device as recited in claim 13, wherein: the plurality of parallelizing techniques include: parallel processing; and processing in parallel; and the plurality of computation techniques include: matrix multiplication; and stencil-based computation.
15. A device as recited in claim 13, further comprising a backward-propagation decision module stored on the computer-readable media and executable by the processor to: determine, based at least in part on the one or more inputs and the one or more outputs, one or more output activation errors for the neural network; select, based at least in part on properties of the neural network, a backward-propagation technique from a plurality of backward-propagation techniques and a parallelizing technique from a plurality of parallelizing techniques; and process the neural network using the selected backward-propagation technique and the selected parallelizing technique to update weights associated with one or more layers of the neural network.
16. One or more computer-readable media storing computer-executable instructions that, when executed on one or more processors, configure a computer to train a neural network by performing acts comprising: causing the neural network to process one or more inputs; receiving from the neural network, one or more outputs resulting from the neural network processing the one or more inputs; determining, based at least in part on the one or more inputs and the one or more outputs, one or more output activation errors for the neural network; selecting, based at least in part on one or more properties associated with the neural network, a backward-propagation technique from a plurality of backward-propagation techniques; using the selected backward-propagation technique and the one or more output activation errors to calculate error gradients and weight deltas for the neural network; and updating weights associated with one or more layers of the neural network based, at least in part, on the error gradients or the weight deltas.
17. One or more computer-readable media as recited in claim 16, wherein: the selected backward-propagation technique is a sparse-dense matrix multiplication technique; and using the selected backward-propagation technique and the one or more output activation errors to generate input activation errors and weight deltas for the neural network includes: generating one or more sparse matrices using the one or more output activation errors
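Claims 3 and 14 name matrix multiplication and stencil-based computation as alternative computation techniques. As a rough illustration of why the two are interchangeable, the sketch below computes the same small single-channel 2D convolution both ways. The function names `conv_stencil` and `conv_matmul`, the single-channel square-filter setup, and the unfold-into-rows layout are assumptions made for this example. The claims preview does not describe how either technique is implemented.

```python
def conv_stencil(image, kernel, stride=1):
    """Stencil form: slide the filter over the image and accumulate directly."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            out[i][j] = acc
    return out


def conv_matmul(image, kernel, stride=1):
    """Matrix form: unfold each patch into a row, then take one matrix-vector product."""
    k = len(kernel)
    out_h = (len(image) - k) // stride + 1
    out_w = (len(image[0]) - k) // stride + 1
    flat_kernel = [w for row in kernel for w in row]
    # One row per output position; each row holds the k*k inputs under the filter.
    patches = [
        [image[i * stride + di][j * stride + dj] for di in range(k) for dj in range(k)]
        for i in range(out_h)
        for j in range(out_w)
    ]
    flat_out = [sum(p * w for p, w in zip(patch, flat_kernel)) for patch in patches]
    # Fold the flat result back into a 2D output map.
    return [flat_out[r * out_w:(r + 1) * out_w] for r in range(out_h)]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0], [0, -1]]
same = conv_stencil(image, kernel) == conv_matmul(image, kernel)
```

The matrix form pays for building the unfolded patch matrix but then reduces to a single dense product. The stencil form avoids that copy at the cost of nested loops. A trade-off of this kind is one plausible reason to choose between the two based on filter size, stride, and feature-map count, as claim 7 lists.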

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06N3/08 · Primary
Learning methods · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/0495
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/09
Supervised learning · CPC title
G06N3/063 · Primary
using electronic means · CPC title

Patent family

Related publications grouped by family.

View patent family 57758832
