What technology area does this patent fall under?

Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?

Publication date Apr 6, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Training neural networks represented as computational graphs

US10970628B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10970628-B2
Application number	US-201615347618-A
Country	US
Kind code	B2
Filing date	Nov 9, 2016
Priority date	Nov 9, 2015
Publication date	Apr 6, 2021
Grant date	Apr 6, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and Methods for training a neural network represented as a computational graph are disclosed. An example method begins with obtaining data representing a computational graph. The computational graph is then augmented to generate a training computational graph for training the neural network using a machine learning training algorithm that includes computing a gradient of an objective function with respect to each of the parameters of the neural network. Augmenting the computational graph includes inserting a plurality of gradient nodes and training edges into the computational graph to generate a backward path through the computational graph that represents operations for computing the gradients of the objective function with respect to the parameters of the neural network. The neural network is trained using the machine learning training algorithm by executing the training computational graph.
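The augmentation step described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the patent's implementation: the `Node` class, the `augment` and `execute` functions, and the op names are all hypothetical, and only scalar `add` and `mul` operations are supported. It takes a forward graph in topological order, appends gradient nodes in reverse order to form a backward path, and then runs the combined training graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical graph node: a name, an operation, and input node names."""
    name: str
    op: str            # "input", "param", "add", "mul", "sum", or "one"
    inputs: tuple = ()


def augment(forward, objective):
    """Append gradient nodes to a topologically ordered forward graph."""
    contribs = {n.name: [] for n in forward}   # gradient pieces per node
    training = list(forward)
    training.append(Node("seed", "one"))       # d(objective)/d(objective) = 1
    contribs[objective].append("seed")
    for node in reversed(forward):
        if not contribs[node.name]:
            continue                           # node does not affect the objective
        g = f"grad_{node.name}"
        training.append(Node(g, "sum", tuple(contribs[node.name])))
        if node.op == "add":
            for src in node.inputs:            # gradient passes through unchanged
                contribs[src].append(g)
        elif node.op == "mul":
            for i, src in enumerate(node.inputs):
                other = node.inputs[1 - i]     # d(a*b)/da = b and vice versa
                piece = f"{g}_in{i}"
                training.append(Node(piece, "mul", (g, other)))
                contribs[src].append(piece)
    return training


def execute(nodes, feeds):
    """Evaluate every node in order; inputs and parameters come from feeds."""
    values = dict(feeds)
    for n in nodes:
        if n.op == "one":
            values[n.name] = 1.0
        elif n.op in ("add", "sum"):
            values[n.name] = sum(values[i] for i in n.inputs)
        elif n.op == "mul":
            values[n.name] = values[n.inputs[0]] * values[n.inputs[1]]
    return values


# Forward graph for y = w * x + b, with y as the objective.
forward = [
    Node("x", "input"), Node("w", "param"), Node("b", "param"),
    Node("wx", "mul", ("w", "x")), Node("y", "add", ("wx", "b")),
]
values = execute(augment(forward, "y"), {"x": 3.0, "w": 2.0, "b": 1.0})
```

After execution, `values["grad_w"]` and `values["grad_b"]` hold the gradients of the objective with respect to the two parameters. A real system would also handle tensors, many more operations, and the control flow cases covered in the claims.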

First claim

Opening claim text (preview).

What is claimed is:

1. A method for training a neural network represented as a computational graph, wherein the computational graph comprises a plurality of nodes, a plurality of connector directed edges, and a plurality of parameter directed edges, wherein each node represents a respective operation performed by the neural network as part of determining a neural network output from a neural network input, wherein each connector directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, and wherein each parameter directed edge connects into a respective node and represents a flow of one or more parameters of the neural network as input to the operation represented by the respective node, the method comprising: obtaining data representing the computational graph; augmenting the computational graph to generate a training computational graph for training the neural network using a machine learning training algorithm that includes computing a gradient of an objective function with respect to each of the parameters of the neural network, comprising: identifying one or more control flow nodes in the computational graph, wherein each control flow node represents a control flow operation that causes operations represented by one or more other nodes in the computational graph to be conditionally or iteratively performed during execution of the computational graph, wherein the one or more control flow nodes comprise (i) an enter control flow node representing an enter operation and (ii) a switch control flow node representing a switch operation, inserting a plurality of gradient nodes and training edges into the computational graph to generate a backward path through which gradient data flows, the backward path representing operations for computing the gradients of the objective function with respect to the parameters of the neural network, wherein each gradient node represents a gradient function that computes a gradient of the objective function with respect to parameters flowing along a respective parameter directed edge in the computational graph, inserting, for each identified control flow node, a corresponding backward path control flow node along the backward path through the computational graph, the corresponding backward path control flow node representing an operation for adjusting how the gradient data flows through the gradient nodes along the backward path to account for a respective operation represented by the identified control flow node, wherein a corresponding backward path control flow node for the enter control flow node is an exit control flow node representing an exit operation, wherein a corresponding backward path control flow node for the switch control flow node is a merge backward control flow node representing a merge operation along the backward path; and training the neural network using the machine learning training algorithm by executing the training computational graph.

2. The method of claim 1, wherein the one or more control flow nodes include a merge control flow node representing a merge operation, and wherein a backward path control flow node corresponding to the merge control flow node is a switch backward control flow node representing a switch operation along the backward path.

3. The method of claim 1, wherein the one or more control flow nodes include an exit control flow node representing an exit operation, and wherein a backward path control flow node corresponding to the exit control flow node is an enter backward control flow node representing an enter operation along the backward path.

4. The method of claim 1, wherein the one or more control flow nodes include an iteration counter control flow node representing an iteration counter operation, and wherein a backward path control flow node corresponding to the iteration counter control flow node is an iteration counter backward control flow node representing an iteration counter operation along the backward path.

5. The method of claim 1, wherein augmenting the computational graph further comprises: determining that multiple iterations of one or more particular operations represented by one or more particular nodes in the computational graph are performed during execution of the computational graph; and inserting one or more monitoring nodes into the computational graph, the monitoring nodes representing operations that, during the execution of the training computational graph, monitor a number of iterations of the particular operations that are performed, and for each performed iteration of each of the particular operations, store the output of the particular operation represented by the node during the iteration.

6. The method of claim 5, wherein, during execution of the backward path in the training computational graph, the outputs stored by the one or more monitoring nodes are provided as input to the gradient functions represented by one or more of the gradient nodes.

7. The method of claim 5, wherein determining that multiple iterations of one or more particular operations represented by one or more particular nodes in the computational graph are performed during execution of the computational graph comprises: analyzing the computational graph to identify one or more control flow nodes in the computational graph that cause the particular operations represented by the one or more particular nodes in the computational graph to be performed multiple times.

8. The method of claim 5, wherein the neural network is a recurrent neural network that receives a respective neural network input at each of a plurality of time steps and generates a respective neural network at each of the plurality of time steps, wherein the operations represented by each of the particular nodes generate a respective node output for each of the plurality of time steps, and wherein the monitoring nodes store the respective node outputs for each of the plurality of time steps.

9. The method of claim 5, wherein storing the output of the particular operation represented by the node during the iteration includes: asynchronously sending the data from a device on which it was produced to a central processing unit for storage after the data was produced; and asynchronously retrieving the data from the central processing unit for use on the device in the backward path through the computational graph that represents operations for computing the gradients of the objective function with respect to the parameters of the neural network.

10. The method of claim 1, wherein training the neural network using the machine learning training algorithm by executing the training computational graph comprises: allocating the nodes in the training computational graph across a plurality of devices; and causing each of the devices to perform the operations represented by the nodes allocated to the device.

11. A system for training a neural network represented as a computational graph, wherein the computational graph comprises a plurality of nodes, a plurality of connector directed edges, and a plurality of parameter directed edges, wherein each node represents a respective operation performed by the neural network as part of determining a neural network output from a neural network input, wherein each connector directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, and wherein each parameter directed edge connects into a respective node and represents a flow of one or more paramete
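Two ideas from the claims above lend themselves to a short illustration: the pairing of each forward control flow node with a backward counterpart (claims 1 to 4), and monitoring nodes that count loop iterations and save per-iteration values for the backward path (claims 5 and 6). The sketch below is an assumption-laden toy, not the claimed method: `BACKWARD_CONTROL_FLOW`, `Monitor`, `forward_loop`, and `backward_loop` are hypothetical names, the loop body is a single scalar multiply, and the monitor saves the value flowing into the multiply at each iteration because that is what this particular gradient needs.

```python
# Forward control flow op -> backward path counterpart, as listed in claims 1-4.
BACKWARD_CONTROL_FLOW = {
    "enter": "exit",
    "switch": "merge",
    "merge": "switch",
    "exit": "enter",
    "iteration_counter": "iteration_counter",
}


class Monitor:
    """Hypothetical monitoring node: counts iterations and saves values."""

    def __init__(self):
        self.iterations = 0
        self.saved = []

    def record(self, value):
        self.iterations += 1
        self.saved.append(value)

    def pop(self):
        # The backward path consumes saved values in reverse iteration order.
        self.iterations -= 1
        return self.saved.pop()


def forward_loop(h0, w, steps, monitor):
    """Compute h = h0 * w**steps iteratively, recording each loop input."""
    h = h0
    for _ in range(steps):
        monitor.record(h)
        h = h * w
    return h


def backward_loop(w, monitor, grad_out=1.0):
    """Replay the loop backward; return (d out / d h0, d out / d w)."""
    grad_h, grad_w = grad_out, 0.0
    while monitor.iterations > 0:      # runs exactly as often as the forward loop
        h_in = monitor.pop()
        grad_w += grad_h * h_in        # d(h_in * w)/dw = h_in
        grad_h = grad_h * w            # d(h_in * w)/dh_in = w
    return grad_h, grad_w


monitor = Monitor()
out = forward_loop(1.0, 2.0, 3, monitor)
grad_h0, grad_w = backward_loop(2.0, monitor)
```

For `h0 = 1`, `w = 2`, and three iterations, the analytic gradients are `w**3 = 8` with respect to `h0` and `3 * w**2 * h0 = 12` with respect to `w`, which the backward loop reproduces. The iteration count kept by the monitor is what lets the backward loop run the same number of times as the forward loop.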

Assignees

Google LLC

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/044
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/08 · Primary
Learning methods · CPC title
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N3/09
Supervised learning · CPC title

Patent family

Related publications grouped by family.

View patent family 57910102

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10970628B2 cover?: Systems and Methods for training a neural network represented as a computational graph are disclosed. An example method begins with obtaining data representing a computational graph. The computational graph is then augmented to generate a training computational graph for training the neural network using a machine learning training algorithm that includes computing a gradient of an objective fu…
Who is the assignee on this patent?: Google LLC

Related patents

Log-based transaction constraint management

Recurrent Neural Networks for Malware Analysis

Accelerated TR-L-BFGS algorithm for neural network

Iteration support in a heterogeneous dataflow engine

Method of training a neural network

Iteration support in a heterogeneous dataflow engine
