Log-based transaction constraint management
US-10282228-B2 · May 7, 2019 · US
US10970628B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10970628-B2 |
| Application number | US-201615347618-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 9, 2016 |
| Priority date | Nov 9, 2015 |
| Publication date | Apr 6, 2021 |
| Grant date | Apr 6, 2021 |
Official abstract text for this publication.
Systems and methods for training a neural network represented as a computational graph are disclosed. An example method begins with obtaining data representing a computational graph. The computational graph is then augmented to generate a training computational graph for training the neural network using a machine learning training algorithm that includes computing a gradient of an objective function with respect to each of the parameters of the neural network. Augmenting the computational graph includes inserting a plurality of gradient nodes and training edges into the computational graph to generate a backward path through the computational graph that represents operations for computing the gradients of the objective function with respect to the parameters of the neural network. The neural network is trained using the machine learning training algorithm by executing the training computational graph.
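The augmentation step described in the abstract can be illustrated with a small sketch. It is not the patented implementation: the `Node` class, the `augment` and `execute` functions, and the two supported operations (`mul`, `add`) are hypothetical names chosen for this example. The sketch walks a forward graph in reverse topological order and appends gradient nodes, so that executing the resulting training graph yields both the output and the gradients with respect to the parameters.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical graph node; `inputs` plays the role of incoming directed edges."""
    name: str
    op: str                       # "input", "param", "const", "mul", or "add"
    inputs: list = field(default_factory=list)
    value: float = 0.0            # only used by "const" nodes


def augment(forward, output):
    """Append gradient nodes to a forward graph given in topological order.

    Returns (training_graph, grads); grads maps a forward node name to the
    node that computes d(output)/d(that node).
    """
    graph = list(forward)
    seed = Node("grad_" + output.name, "const", value=1.0)   # d(output)/d(output)
    graph.append(seed)
    grads = {output.name: seed}
    for node in reversed(forward):
        upstream = grads.get(node.name)
        if upstream is None or node.op not in ("mul", "add"):
            continue
        a, b = node.inputs
        for operand, other in ((a, b), (b, a)):
            if node.op == "mul":
                # d(a*b)/da = b, so the gradient node multiplies upstream by `other`.
                contrib = Node(f"grad_{operand.name}_{len(graph)}", "mul", [upstream, other])
                graph.append(contrib)
            else:
                contrib = upstream                       # d(a+b)/da = 1
            if operand.name in grads:
                # A node feeding several consumers accumulates their gradients.
                contrib = Node(f"acc_{operand.name}_{len(graph)}", "add",
                               [grads[operand.name], contrib])
                graph.append(contrib)
            grads[operand.name] = contrib
    return graph, grads


def execute(graph, feeds):
    """Evaluate every node in list order and return all node values by name."""
    values = {}
    for node in graph:
        if node.op in ("input", "param"):
            values[node.name] = feeds[node.name]
        elif node.op == "const":
            values[node.name] = node.value
        else:
            left, right = (values[i.name] for i in node.inputs)
            values[node.name] = left * right if node.op == "mul" else left + right
    return values


# Forward graph for y = x * w + b, with w and b as parameters.
x, w, b = Node("x", "input"), Node("w", "param"), Node("b", "param")
xw = Node("xw", "mul", [x, w])
y = Node("y", "add", [xw, b])

training_graph, grads = augment([x, w, b, xw, y], y)
values = execute(training_graph, {"x": 3.0, "w": 2.0, "b": 1.0})
print(values["y"], values[grads["w"].name], values[grads["b"].name])  # 7.0 3.0 1.0
```

Here the objective is simply the output `y`; a real training setup would add loss nodes and a parameter update step, both of which are outside this sketch.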
Opening claim text (preview).
What is claimed is:
1. A method for training a neural network represented as a computational graph, wherein the computational graph comprises a plurality of nodes, a plurality of connector directed edges, and a plurality of parameter directed edges, wherein each node represents a respective operation performed by the neural network as part of determining a neural network output from a neural network input, wherein each connector directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, and wherein each parameter directed edge connects into a respective node and represents a flow of one or more parameters of the neural network as input to the operation represented by the respective node, the method comprising:
obtaining data representing the computational graph;
augmenting the computational graph to generate a training computational graph for training the neural network using a machine learning training algorithm that includes computing a gradient of an objective function with respect to each of the parameters of the neural network, comprising:
identifying one or more control flow nodes in the computational graph, wherein each control flow node represents a control flow operation that causes operations represented by one or more other nodes in the computational graph to be conditionally or iteratively performed during execution of the computational graph, wherein the one or more control flow nodes comprise (i) an enter control flow node representing an enter operation and (ii) a switch control flow node representing a switch operation,
inserting a plurality of gradient nodes and training edges into the computational graph to generate a backward path through which gradient data flows, the backward path representing operations for computing the gradients of the objective function with respect to the parameters of the neural network, wherein each gradient node represents a gradient function that computes a gradient of the objective function with respect to parameters flowing along a respective parameter directed edge in the computational graph,
inserting, for each identified control flow node, a corresponding backward path control flow node along the backward path through the computational graph, the corresponding backward path control flow node representing an operation for adjusting how the gradient data flows through the gradient nodes along the backward path to account for a respective operation represented by the identified control flow node, wherein a corresponding backward path control flow node for the enter control flow node is an exit control flow node representing an exit operation, wherein a corresponding backward path control flow node for the switch control flow node is a merge backward control flow node representing a merge operation along the backward path; and
training the neural network using the machine learning training algorithm by executing the training computational graph.
2. The method of claim 1, wherein the one or more control flow nodes include a merge control flow node representing a merge operation, and wherein a backward path control flow node corresponding to the merge control flow node is a switch backward control flow node representing a switch operation along the backward path.
3. The method of claim 1, wherein the one or more control flow nodes include an exit control flow node representing an exit operation, and wherein a backward path control flow node corresponding to the exit control flow node is an enter backward control flow node representing an enter operation along the backward path.
4. The method of claim 1, wherein the one or more control flow nodes include an iteration counter control flow node representing an iteration counter operation, and wherein a backward path control flow node corresponding to the iteration counter control flow node is an iteration counter backward control flow node representing an iteration counter operation along the backward path.
5. The method of claim 1, wherein augmenting the computational graph further comprises: determining that multiple iterations of one or more particular operations represented by one or more particular nodes in the computational graph are performed during execution of the computational graph; and inserting one or more monitoring nodes into the computational graph, the monitoring nodes representing operations that, during the execution of the training computational graph, monitor a number of iterations of the particular operations that are performed, and for each performed iteration of each of the particular operations, store the output of the particular operation represented by the node during the iteration.
6. The method of claim 5, wherein, during execution of the backward path in the training computational graph, the outputs stored by the one or more monitoring nodes are provided as input to the gradient functions represented by one or more of the gradient nodes.
7. The method of claim 5, wherein determining that multiple iterations of one or more particular operations represented by one or more particular nodes in the computational graph are performed during execution of the computational graph comprises: analyzing the computational graph to identify one or more control flow nodes in the computational graph that cause the particular operations represented by the one or more particular nodes in the computational graph to be performed multiple times.
8. The method of claim 5, wherein the neural network is a recurrent neural network that receives a respective neural network input at each of a plurality of time steps and generates a respective neural network at each of the plurality of time steps, wherein the operations represented by each of the particular nodes generate a respective node output for each of the plurality of time steps, and wherein the monitoring nodes store the respective node outputs for each of the plurality of time steps.
9. The method of claim 5, wherein storing the output of the particular operation represented by the node during the iteration includes: asynchronously sending the data from a device on which it was produced to a central processing unit for storage after the data was produced; and asynchronously retrieving the data from the central processing unit for use on the device in the backward path through the computational graph that represents operations for computing the gradients of the objective function with respect to the parameters of the neural network.
10. The method of claim 1, wherein training the neural network using the machine learning training algorithm by executing the training computational graph comprises: allocating the nodes in the training computational graph across a plurality of devices; and causing each of the devices to perform the operations represented by the nodes allocated to the device.
11. A system for training a neural network represented as a computational graph, wherein the computational graph comprises a plurality of nodes, a plurality of connector directed edges, and a plurality of parameter directed edges, wherein each node represents a respective operation performed by the neural network as part of determining a neural network output from a neural network input, wherein each connector directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node, and wherein each parameter directed edge connects into a respective node and represents a flow of one or more paramete
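Claims 1 to 6 pair each forward control flow operation with a counterpart on the backward path and use monitoring nodes to count loop iterations and save per-iteration outputs for the gradient functions. The sketch below illustrates that idea in plain Python rather than as a graph. It is an assumption-laden illustration, not the claimed method itself: `BACKWARD_DUAL`, `Monitor`, `forward_loop`, and `backward_loop` are hypothetical names, and the loop body is reduced to a single multiplication `h = h * w`.

```python
# Forward control flow op -> op inserted on the backward path, as listed in claims 1-4.
BACKWARD_DUAL = {
    "enter": "exit",
    "exit": "enter",
    "switch": "merge",
    "merge": "switch",
    "iteration_counter": "iteration_counter",
}


class Monitor:
    """Hypothetical monitoring node: counts iterations and saves one value per iteration."""

    def __init__(self):
        self.iterations = 0
        self.saved = []

    def record(self, value):
        self.iterations += 1
        self.saved.append(value)
        return value


def forward_loop(h0, w, steps, monitor):
    """Apply h = h * w `steps` times, recording the loop state consumed by each iteration."""
    h = h0
    for _ in range(steps):
        monitor.record(h)
        h = h * w
    return h


def backward_loop(w, monitor, grad_out=1.0):
    """Run as many backward iterations as the monitor counted, newest saved value first."""
    grad_h, grad_w = grad_out, 0.0
    for _ in range(monitor.iterations):
        h_in = monitor.saved.pop()      # value the forward body saw in this iteration
        grad_w += grad_h * h_in         # d(h_in * w)/dw = h_in
        grad_h = grad_h * w             # d(h_in * w)/dh_in = w
    return grad_h, grad_w


monitor = Monitor()
out = forward_loop(2.0, 3.0, steps=2, monitor=monitor)    # 2 * 3 * 3 = 18.0
grad_h0, grad_w = backward_loop(3.0, monitor)
print(out, grad_h0, grad_w)  # 18.0 9.0 12.0
```

The saved values are consumed in reverse order because the backward path visits the iterations last to first. Claim 9 additionally describes moving such stored data asynchronously between a device and a central processing unit, which this sketch does not model.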
Combinations of networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Learning methods · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Supervised learning · CPC title