Augmenting neural networks with external memory

US10650302B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10650302-B2
Application number  US-201514885086-A
Country             US
Kind code           B2
Filing date         Oct 16, 2015
Priority date       Oct 16, 2014
Publication date    May 12, 2020
Grant date          May 12, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from a first portion of a neural network output as a system output; determining one or more sets of writing weights for each of a plurality of locations in an external memory; writing data defined by a third portion of the neural network output to the external memory in accordance with the sets of writing weights; determining one or more sets of reading weights for each of the plurality of locations in the external memory from a fourth portion of the neural network output; reading data from the external memory in accordance with the sets of reading weights; and combining the data read from the external memory with a next system input to generate the next neural network input.
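The sequence of steps in the abstract can be sketched as one processing step per neural network output. This is a minimal illustration under stated assumptions, not the patented implementation: the layout of the output vector, the direct normalization of raw scores into weights, the additive write rule, and the concatenation used to combine read data with the next system input are all choices made for this example, and every function name is hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Memory = List[Vector]  # one value vector per memory location


def normalize(scores: Sequence[float]) -> Vector:
    """Scale non-negative scores so they sum to 1 (uniform if all are zero)."""
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def write(memory: Memory, weights: Vector, write_vector: Vector) -> Memory:
    """Assumed write rule: add the write vector to each location, scaled by its weight."""
    return [[m + w * v for m, v in zip(row, write_vector)]
            for row, w in zip(memory, weights)]


def read(memory: Memory, weights: Vector) -> Vector:
    """Weighted combination of the stored vectors; weights are assumed to sum to 1."""
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


def step(output: Vector, memory: Memory, n_out: int, width: int
         ) -> Tuple[Vector, Memory, Vector]:
    """Process one neural network output.

    Assumed layout of `output`:
    [system output | write scores | write vector | read scores]
    """
    n_loc = len(memory)
    i = 0
    system_output = output[i:i + n_out]              # first portion
    i += n_out
    write_weights = normalize(output[i:i + n_loc])   # portion used for writing weights
    i += n_loc
    write_vector = output[i:i + width]               # third portion
    i += width
    read_weights = normalize(output[i:i + n_loc])    # fourth portion
    memory = write(memory, write_weights, write_vector)
    read_vector = read(memory, read_weights)
    return system_output, memory, read_vector


def next_input(read_vector: Vector, next_system_input: Vector) -> Vector:
    """Assumed combination: concatenate the next system input with the read data."""
    return list(next_system_input) + list(read_vector)
```

For example, with a two-location memory of zeros, calling `step([0.5, 1.0, 0.0, 2.0, 4.0, 1.0, 1.0], memory, n_out=1, width=2)` writes `[2.0, 4.0]` into the first location only and then reads the average of both locations, which `next_input` appends to the next system input.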

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a sequence of system inputs to generate a sequence of system outputs using an augmented neural network system comprising a neural network and an external memory, wherein the neural network is configured to receive a sequence of neural network inputs and to process each neural network input to generate a neural network output from the neural network input, wherein the external memory is external to the neural network and is configured to store a respective value vector in each of a plurality of locations in the external memory, and wherein the method comprises, for each neural network output: providing an output derived from a first portion of the neural network output as a system output in the sequence of system outputs; determining one or more sets of writing weights for each of the plurality of locations in the external memory from a second portion of the neural network output, wherein the second portion includes a content-based subportion and a location-based subportion, wherein the content-based subportion is different from the location-based subportion, and wherein determining each of the one or more sets of writing weights comprises: for each of the plurality of locations in the external memory: computing a similarity measure between (i) a key vector derived from the content-based subportion and (ii) a respective value vector stored in the location, and determining a respective content-based writing weight based on the computed similarity measure, and adjusting the content-based writing weights using preceding writing weights assigned to the plurality of locations and a shift vector derived from the location-based subportion to generate the one or more sets of writing weights; writing data defined by a third portion of the neural network output to the external memory in accordance with the sets of writing weights; determining one or more sets of reading weights for each of the plurality of locations in the external memory from a fourth portion of the neural network output; reading data from the external memory in accordance with the sets of reading weights; and combining the data read from the external memory with a next system input in the sequence of system inputs to generate a next neural network input in the sequence of neural network inputs.

2. The method of claim 1, further comprising, for each of the neural network outputs: determining one or more sets of erasing weights for each of the plurality of locations in the external memory from a fifth portion of the neural network output; and erasing data from the external memory in accordance with the sets of erasing weights.

3. The method of claim 2, wherein the sets of erasing weights are the same as the sets of writing weights and the second portion is the same as the fifth portion.

4. The method of claim 1, wherein determining each of the one or more sets of writing weights further comprises: determining a set of location-based writing weights; and adjusting the content-based writing weights using the location-based writing weights to generate the one or more sets of writing weights.

5. The method of claim 1, wherein determining each of the one or more sets of reading weights comprises: determining a set of content-based reading weights from the fourth portion of the neural network output.

6. The method of claim 5, wherein determining each of the one or more sets of reading weights further comprises: determining a set of location-based reading weights; and adjusting the content-based reading weights using the location-based reading weights to generate the set of reading weights.

7. The method of claim 1, wherein reading data from the external memory in accordance with the sets of reading weights comprises, for each set of reading weights: determining a weighted average of values stored in the plurality of locations in the external memory in accordance with the reading weights in the set of reading weights.

8. The method of claim 1, wherein writing data defined by the third portion of the neural network output to the external memory in accordance with the sets of writing weights comprises, for each of the sets of writing weights: determining, from the third portion of the neural network output, a write vector for the set of writing weights; and writing the write vector to the plurality of locations in accordance with the set of writing weights.

9. One or more non-transitory computer storage media storing instructions that, when executed by one or more computers, cause the one or more computers to perform operations for processing a sequence of system inputs to generate a sequence of system outputs using an augmented neural network system comprising a neural network and an external memory, wherein the neural network is configured to receive a sequence of neural network inputs and to process each neural network input to generate a neural network output from the neural network input, wherein the external memory is external to the neural network and is configured to store a respective value vector in each of a plurality of locations in the external memory, and wherein the operations comprise, for each neural network output: providing an output derived from a first portion of the neural network output as a system output in the sequence of system outputs; determining one or more sets of writing weights for each of the plurality of locations in the external memory from a second portion of the neural network output, wherein the second portion includes a content-based subportion and a location-based subportion, wherein the content-based subportion is different from the location-based subportion, and wherein determining each of the one or more sets of writing weights comprises: for each of the plurality of locations in the external memory: computing a similarity measure between (i) a key vector derived from the content-based subportion and (ii) a respective value vector stored in the location, and determining a respective content-based writing weight based on the computed similarity measure, and adjusting the content-based writing weights using preceding writing weights assigned to the plurality of locations and a shift vector derived from the location-based subportion to generate the one or more sets of writing weights; writing data defined by a third portion of the neural network output to the external memory in accordance with the sets of writing weights; determining one or more sets of reading weights for each of the plurality of locations in the external memory from a fourth portion of the neural network output; reading data from the external memory in accordance with the sets of reading weights; and combining the data read from the external memory with a next system input in the sequence of system inputs to generate a next neural network input in the sequence of neural network inputs.

10. The one or more non-transitory computer storage media of claim 9, wherein the operations further comprises, for each of the neural network outputs: determining one or more sets of erasing weights for each of the plurality of locations in the external memory from a fifth portion of the neural network output; and erasing data from the external memory in accordance with the sets of erasing weights.

11. A system comprising one or more computers and one or more non-transitory computer storage media storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations for processing a sequence of system inputs to generate a sequence of system outputs using an augmented neural network system comprising a neural ne
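The writing-weight computation recited in claim 1 (a similarity measure against each stored vector, then an adjustment using the preceding weights and a shift vector) can be sketched as follows. The claim text does not fix the similarity measure, the normalization, or the adjustment rule, so this example assumes cosine similarity, a softmax over the similarities, a fixed interpolation gate, and a circular shift. The `sharpness` and `gate` parameters, the dictionary form of the shift vector, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def content_weights(key: Vector, memory: List[Vector], sharpness: float = 1.0) -> Vector:
    """One content-based weight per location: softmax over key/value similarities."""
    sims = [cosine_similarity(key, row) for row in memory]
    exps = [math.exp(sharpness * s) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def adjust_with_shift(content: Vector, preceding: Vector,
                      shift: Dict[int, float], gate: float = 0.5) -> Vector:
    """Blend content weights with the preceding weights, then shift circularly.

    `shift` maps a location offset to a probability, e.g. {-1: 0.1, 0: 0.8, 1: 0.1}.
    If the probabilities sum to 1, the total weight is preserved.
    """
    n = len(content)
    blended = [gate * c + (1.0 - gate) * p for c, p in zip(content, preceding)]
    shifted = [0.0] * n
    for i in range(n):
        for offset, prob in shift.items():
            # Weight at location i moves to location i + offset, wrapping around.
            shifted[(i + offset) % n] += blended[i] * prob
    return shifted
```

A caller would compute `content_weights(key, memory)` from the content-based subportion, then pass the result, the weights from the previous step, and the shift distribution to `adjust_with_shift` to obtain one set of writing weights.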

Assignees

Deepmind Tech Ltd

Inventors

Classifications

  • G06N3/04 · Primary

    Architecture, e.g. interconnection topology · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Feedforward networks · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10650302B2 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for augmenting neural networks with an external memory. One of the methods includes providing an output derived from a first portion of a neural network output as a system output; determining one or more sets of writing weights for each of a plurality of locations in an external memory; writing data …
Who is the assignee on this patent?
Deepmind Tech Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/04. Mapped technology areas include Physics.
When was this patent published?
Published May 12, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).