Interleaver design and pairwise codeword distance distribution enhancement for turbo autoencoder
US-12175353-B2 · Dec 24, 2024 · US
US2017200076A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2017200076-A1 |
| Application number | US-201715406557-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 13, 2017 |
| Priority date | Jan 13, 2016 |
| Publication date | Jul 13, 2017 |
| Grant date | — |
Official abstract text for this publication.
In one aspect, this specification describes a recurrent neural network system implemented by one or more computers that is configured to process input sets to generate neural network outputs for each input set. The input set can be a collection of multiple inputs for which the recurrent neural network should generate the same neural network output regardless of the order in which the inputs are arranged in the collection. The recurrent neural network system can include a read neural network, a process neural network, and a write neural network. In another aspect, this specification describes a system implemented as computer programs on one or more computers in one or more locations that is configured to train a recurrent neural network that receives a neural network input and sequentially emits outputs to generate an output sequence for the neural network input.
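The read, process, and write stages described in the abstract can be illustrated with a small sketch. This is not the implementation from the publication: the `read` feature map, the `tanh` update standing in for the LSTM step, the additive combination of state and read vector, and the summing `write` function are all assumptions chosen to keep the example short. Only the overall structure (one memory vector per input, dot-product attention over the memory vectors, final state used as the set embedding) follows the text.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def read(inputs):
    # Hypothetical read network: maps each scalar input, independently of the
    # others, to a 3-dimensional memory vector.
    return [[x, x * x, 1.0] for x in inputs]


def process(memories, steps=3):
    # Builds an embedding of the whole set by attending over the memory vectors.
    dim = len(memories[0])
    state = [0.0] * dim
    for _ in range(steps):
        # Assumption: a tanh squashing stands in for the LSTM state update.
        initial = [math.tanh(s) for s in state]
        # Similarity of the state to each memory vector (dot product).
        weights = softmax([dot(initial, m) for m in memories])
        # Read vector: attention-weighted combination of the memory vectors.
        read_vec = [sum(w * m[i] for w, m in zip(weights, memories))
                    for i in range(dim)]
        # Assumption: combine state and read vector by elementwise addition.
        state = [q + r for q, r in zip(initial, read_vec)]
    return state


def write(embedding):
    # Hypothetical write network: reduces the embedding to a single number.
    return sum(embedding)


embedding_a = process(read([0.5, -1.0, 2.0]))
embedding_b = process(read([2.0, 0.5, -1.0]))  # same set, different order
```

Because the attention weights and the read vector are sums over all memory vectors, reordering the inputs changes the embedding only by floating-point rounding, so `embedding_a` and `embedding_b` agree to within a small tolerance.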
Opening claim text (preview).
What is claimed is:
1. A neural network system implemented by one or more computers, the neural network system comprising: a read neural network configured to: receive an input set comprising a plurality of inputs, and process each input in the input set to generate a respective memory vector for each input; a process neural network configured to: process the respective memory vector for each of the inputs to generate an order-invariant numeric embedding for the input set, wherein the order-invariant numeric embedding is permutation invariant to the inputs in the input set; and a write neural network configured to: process the order-invariant numeric embedding to generate a neural network output for the input set.
2. The neural network system of claim 1, wherein the process neural network comprises: a long short-term memory (LSTM) neural network configured to, for each of a plurality of time steps, update a current modified internal state to generate an initial updated internal state; and a subsystem configured to, for each of the plurality of time steps: receive the initial updated internal state for the time step, and apply an attention mechanism over the memory vectors for the inputs to modify the initial updated internal state for the time step to generate a modified internal state for the time step.
3. The neural network system of claim 2, wherein the modified internal state for the last time step in the plurality of time steps is the order-invariant numeric embedding for the input set.
4. The neural network system of claim 2, wherein applying the attention mechanism comprises: determining a respective similarity value for each of the memory vectors, wherein the respective similarity value represents a similarity between the initial updated internal state and the memory vector; generating a respective attention weight for each of the memory vectors from the respective similarity values; generating a read vector by combining the memory vectors in accordance with the attention weights; and combining the initial updated internal state and the read vector to generate the modified internal state.
5. The neural network system of claim 4, wherein determining the respective similarity for each of the memory vectors comprises determining a dot product between the initial updated internal state and the memory vector.
6. The neural network system of claim 1, wherein the write neural network is a pointer recurrent neural network configured to process the order-invariant numeric embedding to generate a plurality of pointers to the inputs in the input set.
7. The neural network system of claim 1, wherein the write neural network is a recurrent neural network configured to process the order-invariant numeric embedding to generate a sequence of neural network outputs.
8. A method of training a recurrent neural network having a plurality of parameters that receives a neural network input and sequentially emits outputs to generate an output sequence for the neural network input, the method comprising: receiving first training data for training the recurrent neural network, the first training data comprising a plurality of training example pairs, each training example pair comprising a training input and a target output set for the training input, the training output set having a plurality of target outputs; and training the recurrent neural network on each of the training example pairs in the first training data, wherein training the recurrent neural network comprises, for each training example pair: selecting a particular order for the target outputs from the target output set in the training example pair; and training the recurrent neural network to generate an output sequence for the training input in the training example pair that matches a sequence having the target outputs from the target output set arranged according to the particular order.
9. The method of claim 8, further comprising: pre-training the recurrent neural network on second training data to determine pre-trained values of the parameters of the recurrent neural network from initial values of the parameters of the recurrent neural network, wherein training the recurrent neural network comprises determining trained values of the parameters of the recurrent neural network from the pre-trained values of the parameters.
10. The method of claim 9, wherein pre-training the recurrent neural network comprises, for each training example pair in the second training data: generating a plurality of candidate target sequences, each candidate target sequence having the target outputs from the target output set in the training example pair arranged according to a different order; and training the recurrent neural network to maximize an aggregate likelihood that one of the plurality of candidate target sequences is the correct target sequence for the training input in the training example pair as determined by the recurrent neural network.
11. The method of claim 10, wherein selecting the particular order comprises: generating a plurality of candidate target sequences, each candidate target sequence having the target outputs from the target output set arranged according to a different order; determining a respective likelihood for each of the candidate target sequences, the respective likelihood for each of the candidate target sequences being the likelihood that the candidate target sequence is the correct target sequence for the training input as determined by the recurrent neural network in accordance with current values of the parameters of the recurrent neural network; and selecting as the particular order the order according to which the target outputs in one of the candidate target sequences are arranged based on the respective likelihoods.
12. The method of claim 11, wherein selecting as the particular order the order according to which the target outputs in one of the candidate target sequences are arranged based on the respective likelihoods comprises: selecting the order according to which the target outputs in the candidate target sequence having the highest likelihood are arranged.
13. The method of claim 11, wherein selecting as the particular order the order according to which the target outputs in one of the candidate target sequences are arranged based on the respective likelihoods comprises: sampling a candidate target sequence from the candidate target sequences in accordance with the respective likelihoods; and selecting the order according to which the target outputs in the sampled candidate target sequence are arranged.
14. The method of claim 11, wherein the likelihood is a log likelihood.
15. The method of claim 11, wherein generating the plurality of candidate sequence comprises generating a respective candidate sequence for each possible ordering of the target outputs.
16. The method of claim 11, wherein generating the plurality of candidate sequence comprises performing an inexact search over possible orderings of the target outputs.
17. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving first training data for training the recurrent neural network, the first training data comprising a plurality of training example pairs, each training example pair comprising a training input and a target output set for the training input, t
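The order-selection step recited in claims 11 through 13 and claim 15 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed training procedure: `toy_log_likelihood` is a hypothetical stand-in for the score a recurrent neural network would assign under its current parameters, and the function names `candidate_orders` and `select_order` are invented for this example.

```python
import itertools
import math
import random


def candidate_orders(target_outputs):
    # One candidate sequence per possible ordering of the target outputs.
    return list(itertools.permutations(target_outputs))


def select_order(candidates, log_likelihood, mode="argmax", rng=None):
    # Score every candidate sequence with the supplied log-likelihood function.
    scores = [log_likelihood(c) for c in candidates]
    if mode == "argmax":
        # Pick the candidate with the highest likelihood.
        best = max(range(len(candidates)), key=scores.__getitem__)
        return candidates[best]
    # Otherwise sample a candidate with probability proportional to its
    # likelihood, i.e. to exp(log likelihood), shifted for numerical stability.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    rng = rng or random.Random()
    return rng.choices(candidates, weights=weights, k=1)[0]


def toy_log_likelihood(sequence):
    # Hypothetical model score: subtracts 1 for each adjacent pair that is out
    # of ascending order. A real system would query the network instead.
    return -sum(1.0 for a, b in zip(sequence, sequence[1:]) if a > b)


candidates = candidate_orders([3, 1, 2])
best_order = select_order(candidates, toy_log_likelihood)
sampled_order = select_order(candidates, toy_log_likelihood,
                             mode="sample", rng=random.Random(0))
```

Enumerating every permutation grows factorially with the size of the target set, which is presumably why claim 16 also covers an inexact search over orderings; this sketch only handles the exhaustive case.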
Combinations of networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Supervised learning · CPC title
Feedforward networks · CPC title