Differentiable neuromodulated plasticity for reinforcement learning and supervised learning tasks

US2020334530A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2020334530-A1
Application number  US-202016850011-A
Country             US
Kind code           A1
Filing date         Apr 16, 2020
Priority date       Apr 19, 2019
Publication date    Oct 22, 2020
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system uses neural networks for applications such as navigation of autonomous vehicles or mobile robots. The system uses a trained neural network model that comprises fixed parameters that remain unchanged during execution of the model, plastic parameters that are modified during execution of the model, and nodes that generate outputs based on the inputs, fixed parameters, and the plastic parameters. The system provides input data to the neural network model and executes the neural network model. The system updates the plastic parameters of the neural network model by adjusting the rate at which the plastic parameters update over time based on at least one output of a node.
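
The abstract's split between fixed and plastic parameters can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the patented implementation: the function name `plastic_step`, the parameter names `alpha`, `hebb`, and `mod_w`, the tanh nonlinearity, the Hebbian product of input and output, and the clipping range are all hypothetical choices. It shows one execution step in which the fixed parameters stay untouched, the plastic parameters change, and a value computed from the node outputs sets how fast they change.

```python
import math


def plastic_step(x, w, alpha, hebb, mod_w):
    """One execution step of a small plastic layer (illustrative only).

    x      : list of n input activations
    w      : n x m fixed weights (unchanged at execution time)
    alpha  : n x m fixed plasticity coefficients (unchanged at execution time)
    hebb   : n x m plastic parameters (a new copy is returned every step)
    mod_w  : list of m fixed weights that read a modulatory value off the outputs
    """
    n, m = len(w), len(w[0])

    # Node outputs depend on the inputs, the fixed parameters,
    # and the current plastic parameters.
    y = [
        math.tanh(sum(x[i] * (w[i][j] + alpha[i][j] * hebb[i][j]) for i in range(n)))
        for j in range(m)
    ]

    # A value derived from the node outputs sets the rate at which
    # the plastic parameters change on this step.
    rate = math.tanh(sum(mod_w[j] * y[j] for j in range(m)))

    # Hebbian-style update scaled by that rate, clipped to [-1, 1]
    # so the plastic parameters stay bounded.
    new_hebb = [
        [max(-1.0, min(1.0, hebb[i][j] + rate * x[i] * y[j])) for j in range(m)]
        for i in range(n)
    ]
    return y, rate, new_hebb
```

With `mod_w` set to zeros the rate is zero and the plastic parameters are returned unchanged, so under these assumptions the model's own outputs can switch its plasticity on or off.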

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving sensor data from one or more sensors mounted on a moveable apparatus, the sensor data describing the environment of the moveable apparatus; loading a trained neural network model, the neural network model comprising: a plurality of fixed parameters, wherein a fixed parameter remains unchanged during execution of the trained neural network, a plurality of plastic parameters, wherein a plastic parameter is modified during execution of the trained neural network model, a plurality of nodes, each node of the plurality of nodes generating an output based on one or more inputs to the neural network model, the plurality of fixed parameters, and the plurality of plastic parameters, wherein at least one node of the plurality of nodes generates an output further based on at least one weighted output generated by one or more other nodes of the plurality of nodes, encoding sensor data to generate input data for the neural network model; providing the input data comprising the encoded sensor data to the neural network model; executing the trained neural network model to generate output results, based on the input data comprising the encoded sensor data, the output results describing the environment of the moveable apparatus; updating the plastic parameters of the neural network model, the updating comprising: adjusting a rate at which the plastic parameters update over time based on at least one output of a node of the plurality of nodes generated by the executing the trained neural network model; and generating signals for controlling the moveable apparatus based on the output results.

2. The computer-implemented method of claim 1, wherein the moveable apparatus is an autonomous vehicle, and wherein the generated signals include navigation instructions for the autonomous vehicle.

3. The computer-implemented method of claim 1, wherein the moveable apparatus is a robot configured to navigate through an obstacle course, wherein the generated signals control the motion of the robot.

4. The method of claim 1, wherein the sensor data comprises images captured by a camera mounted on the moveable apparatus.

5. The computer-implemented method of claim 1, wherein the sensor data comprises lidar scans captured by a lidar mounted on the moveable apparatus.

6. The computer-implemented method of claim 1, wherein the updating the plastic parameters further comprises: adjusting the rate at which the plastic parameters update over time based on past executions of the trained neural network model.

7. The computer-implemented method of claim 6, wherein the past executions of the trained neural network model are weighted based on a trainable decay factor.

8. The computer-implemented method of claim 1, wherein the input data comprises a reward input, and the at least one of the generated output results from executing the trained neural network model comprises a reward signal generated in response to the reward input being above a threshold value.

9. The computer-implemented method of claim 1, wherein the trained neural network model is one selected from a group comprising: a long short-term memory (LSTM) model, a recurrent neural network (RNN), and a feedforward neural network.

10. The computer-implemented method of claim 1, wherein the plastic parameters are optimized using gradient descent at an execution time of the trained neural network model.

11. A computer-implemented method comprising: loading a trained neural network model, the neural network model comprising: a plurality of fixed parameters, wherein a fixed parameter remains unchanged during execution of the trained neural network, a plurality of plastic parameters, wherein a plastic parameter is modified during execution of the trained neural network model, a plurality of nodes, each of the plurality of nodes generating an output based on the one or more inputs, the plurality of fixed parameters, and the plurality of plastic parameters, wherein at least one node of the plurality of nodes generates an output further based on at least one weighted output generated by one or more other nodes of the plurality of nodes; providing an input data to the neural network model; executing the trained neural network to generate output results, the output results corresponding to at least one of: a recognized pattern in the input data, a decision based on the input data, or a prediction based on the input data; and updating the plastic parameters of the neural network model, the updating comprising: adjusting the rate at which the plastic parameters update over time based on at least one output of a node of the plurality of nodes, the output generated by executing the trained neural network.

12. The computer-implemented method of claim 11, wherein the updating the plastic parameters further comprises: adjusting the rate at which the plastic parameters update over time based on past executions of the trained neural network model.

13. The computer-implemented method of claim 12, wherein the past executions of the trained neural network model are weighted based on a trainable decay factor.

14. The computer-implemented method of claim 11, wherein the input data comprises a reward input, and the at least one of the generated output results from executing the trained neural network model comprises a reward signal generated in response to the reward input being above a threshold value.

15. The computer-implemented method of claim 11, wherein the plastic parameters are optimized using gradient descent at an execution time of the trained neural network model.

16. The computer-implemented method of claim 11, wherein the input data comprises an image, and wherein the generated output results comprise a recognized object in the image.

17. The computer-implemented method of claim 11, wherein the input data comprises a sentence in a language, and wherein the generated output results comprise a sentence in another language.

18. A non-transitory computer readable storage medium storing executable instructions that, when executed by one or more processors, cause the one or more processors to execute steps comprising: receiving sensor data from one or more sensors mounted on a moveable apparatus, the sensor data describing the environment of the moveable apparatus; loading a trained neural network model, the neural network model comprising: a plurality of fixed parameters, wherein a fixed parameter remains unchanged during execution of the trained neural network, a plurality of plastic parameters, wherein a plastic parameter is modified during execution of the trained neural network model, a plurality of nodes, each node of the plurality of nodes generating an output based on one or more inputs to the neural network model, the plurality of fixed parameters, and the plurality of plastic parameters, wherein at least one node of the plurality of nodes generates an output further based on at least one weighted output generated by one or more other nodes of the plurality of nodes, encoding sensor data to generate input data for the neural network model; providing the input data comprising the encoded sensor data to the neural network model; executing the trained neural network model to generate output results, based on the input data comprising the encoded sensor data, the output results describing the environment of the moveable apparatus; updating the …
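
Claims 6, 7, 12, and 13 describe plastic updates that also depend on past executions, weighted by a trainable decay factor. One way to read this, offered only as a hedged sketch, is a decaying trace of recent activity that is committed to the plastic parameters at a rate set by the network. The function name `trace_update`, the `trace` buffer, the exponential-average form, and the clipping range are assumptions made for illustration and are not taken from the claim text.

```python
def trace_update(hebb, trace, pre, post, rate, decay):
    """Update plastic parameters from a decaying trace of past activity.

    hebb  : n x m plastic parameters
    trace : n x m running trace of past input/output products
    pre   : list of n input activations for the current execution
    post  : list of m node outputs for the current execution
    rate  : scalar update rate produced by the network on this step
    decay : hypothetical trainable decay factor in (0, 1); a larger value
            forgets past executions faster
    """
    n, m = len(hebb), len(hebb[0])

    # Commit the trace accumulated over past executions, scaled by the
    # current rate and clipped to [-1, 1].
    new_hebb = [
        [max(-1.0, min(1.0, hebb[i][j] + rate * trace[i][j])) for j in range(m)]
        for i in range(n)
    ]

    # Blend the current activity into the trace; older executions are
    # down-weighted by (1 - decay) on every step.
    new_trace = [
        [(1.0 - decay) * trace[i][j] + decay * pre[i] * post[j] for j in range(m)]
        for i in range(n)
    ]
    return new_hebb, new_trace
```

In this sketch the trace is applied before it is refreshed, so activity from the current execution only affects the plastic parameters on a later step. That ordering is a design choice of the example, not something the claims specify.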

Assignees

Uber Technologies Inc

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks

  • Probabilistic or stochastic networks

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Convolutional networks [CNN, ConvNet]

  • Reinforcement learning

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020334530A1 cover?
It covers a system that uses a trained neural network for applications such as navigation of autonomous vehicles or mobile robots. The model has fixed parameters that remain unchanged during execution and plastic parameters that are modified during execution, and the system adjusts the rate at which the plastic parameters update over time based on at least one output of a node.
Who is the assignee on this patent?
Uber Technologies Inc
What technology area does this patent fall under?
Primary CPC classification G10L17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 22, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).