US9401148B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9401148-B2 |
| Application number | US-201414228469-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 28, 2014 |
| Priority date | Nov 4, 2013 |
| Publication date | Jul 26, 2016 |
| Grant date | Jul 26, 2016 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for inputting speech data that corresponds to a particular utterance to a neural network; determining an evaluation vector based on output at a hidden layer of the neural network; comparing the evaluation vector with a reference vector that corresponds to a past utterance of a particular speaker; and based on comparing the evaluation vector and the reference vector, determining whether the particular utterance was likely spoken by the particular speaker.
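The comparison step in the abstract can be illustrated with a minimal sketch. It assumes the evaluation and reference vectors are plain lists of floats and that the comparison is a cosine distance checked against a threshold, as claims 2 and 3 describe. The function names and the `max_distance` value of 0.2 are hypothetical choices for illustration and do not come from the patent.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def likely_same_speaker(evaluation_vector, reference_vector, max_distance=0.2):
    """Accept the utterance when the distance satisfies the threshold.

    max_distance is an illustrative assumption; a real system would tune it
    on held-out data.
    """
    return cosine_distance(evaluation_vector, reference_vector) <= max_distance
```

For example, `likely_same_speaker([1.0, 2.0], [2.0, 4.0])` accepts because the two vectors point in the same direction, while orthogonal vectors such as `[1.0, 0.0]` and `[0.0, 1.0]` have distance 1.0 and are rejected.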
Opening claim text (preview).
The invention claimed is:
1. A method comprising: inputting, by a computing device, speech data that corresponds to a particular utterance of a particular speaker to a neural network having parameters trained based on propagation between an input layer and an output layer through one or more hidden layers located between the input layer and the output layer, wherein the one or more hidden layers were trained using utterances of multiple speakers, and wherein the multiple speakers do not include the particular speaker; generating, by the computing device and in response to inputting the speech data that corresponds to the particular utterance to the neural network, a representation of activations occurring at a particular layer of the neural network that was trained as one of the hidden layers located between the input layer and the output layer; comparing, by the computing device, the generated representation of activations occurring at the particular layer of the neural network in response to the speech data that corresponds to the particular utterance with a reference representation of activations occurring at the particular layer of the neural network in response to speech data that corresponds to one or more past utterances of the particular speaker; based on comparing the generated representation and the reference representation, determining, by the computing device, that the particular utterance was likely spoken by the particular speaker; and providing, by the computing device, access to the computing device based on determining that the particular utterance was likely spoken by the particular speaker.
2. The method of claim 1, wherein comparing, by the computing device, the generated representation with the reference representation comprises determining, by the computing device, a distance between the generated representation and the reference representation, and wherein determining, by the computing device, that the particular utterance was spoken by the particular speaker comprises determining, by the computing device, that the distance between the generated representation and the reference representation satisfies a threshold.
3. The method of claim 2, wherein determining, by the computing device, a distance between the generated representation and the reference representation comprises computing, by the computing device, a cosine distance between the generated representation and the reference representation.
4. The method of claim 1, wherein generating, by the computing device and in response to inputting the speech data that corresponds to the particular utterance to the neural network, the representation of activations occurring at the particular layer of the neural network that was trained as one of the hidden layers located between the input layer and the output layer comprises generating, by the computing device and in response to inputting the speech data that corresponds to the particular utterance to the neural network, a representation of activations occurring at a particular layer of the neural network that was trained as one of the hidden layers located adjacent to the output layer.
5. The method of claim 1, wherein generating, by the computing device and in response to inputting the speech data that corresponds to the particular utterance to the neural network, the representation of activations occurring at the particular layer of the neural network that was trained as one of the hidden layers located between the input layer and the output layer comprises generating, by the computing device and in response to inputting the speech data that corresponds to the particular utterance to the neural network, the representation of activations occurring at a particular layer of the neural network that was trained as a predetermined one of the hidden layers located between the input layer and the output layer.
6. The method of claim 1, comprising: obtaining, by the computing device access to the neural network; for each of multiple utterances of the particular speaker: inputting, by the computing device, speech data corresponding to the respective utterance to the neural network; and generating, by the computing device, a representation of activations occurring at the particular layer of the neural network in response to the speech data corresponding to the respective utterance; combining, by the computing device, the generated representations of activations occurring at the particular layer of the neural network in response to speech data corresponding to each of the multiple utterances of the particular speaker; and using, by the computing device, the combination of generated representations of activations occurring at the particular layer of the neural network in response to speech data corresponding to each of the multiple utterances of the particular speaker as the reference representation.
7. The method of claim 1, further comprising dividing, by the computing device, the speech data corresponding to the particular utterance into frames; and wherein generating, by the computing device and in response to inputting the speech data that corresponds to the particular utterance to the neural network, the representation of activations occurring at the particular layer of the neural network comprises: determining, by the computing device and for each of multiple different frames of the speech data, a corresponding set of activations occurring at the particular layer of the neural network based on the frame; and generating, by the computing device, the representation of the activations occurring at the particular layer by averaging the sets of activations that respectively correspond to the multiple different frames.
8. The method of claim 1, wherein generating, by the computing device and in response to inputting the speech data that corresponds to the particular utterance to the neural network, the representation of activations occurring at the particular layer of the neural network comprises: generating, by the computing device, the representation of activations occurring at the particular layer of the neural network (i) in response to inputting the speech data that corresponds to the particular utterance of the neural network, and (ii) irrespective of any activations occurring downstream from the particular layer in response to inputting the speech data that corresponds to the particular utterance of the neural network.
9. The method of claim 8, wherein inputting, by the computing device, speech data that corresponds to the particular utterance to the neural network having parameters trained based on propagation between the input layer and the output layer through one or more hidden layers located between the input layer and the output layer comprises: inputting, by the computing device, speech data that corresponds to the particular utterance to a neural network whose layers have been trained based on activations occurring at the output layer.
10. The method of claim 1, wherein the representation of the activations at the particular layer is a vector that indicates the activations at the particular layer.
11. The method of claim 1, wherein the input layer, the output layer, and the one or more hidden layers are included in a trained neural network; wherein inputting the speech data comprises inputting the speech data to a neural network that includes a subset of the layers of the trained neural network and excludes the output layer of the trained neural network used during training of the trained neural network; and wherein generating the representation comprises generating the representation of activations of a particular layer of the neural network that
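
Claims 6 and 7 describe building a representation by averaging per-frame activations at a particular layer, and building the reference representation by combining the representations of several enrollment utterances. The sketch below is a toy illustration of that flow, not the patented implementation. It assumes a single fully connected layer with a ReLU nonlinearity stands in for the "particular layer", that frames are non-overlapping, and that "combining" means averaging. All function names, the weights, and the frame size are hypothetical.

```python
def hidden_activations(frame, weights, biases):
    """Activations of one fully connected layer with ReLU (an assumed stand-in)."""
    return [
        max(0.0, sum(w * x for w, x in zip(row, frame)) + b)
        for row, b in zip(weights, biases)
    ]


def split_into_frames(samples, frame_size):
    """Split samples into non-overlapping frames, dropping a trailing partial frame."""
    last_start = len(samples) - frame_size
    return [samples[i:i + frame_size] for i in range(0, last_start + 1, frame_size)]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector to average")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def utterance_representation(samples, weights, biases, frame_size):
    """Average the per-frame layer activations over one utterance (claim 7)."""
    frames = split_into_frames(samples, frame_size)
    return mean_vector([hidden_activations(f, weights, biases) for f in frames])


def enroll(utterances, weights, biases, frame_size):
    """Combine per-utterance representations into a reference (claim 6).

    Averaging is an assumed choice; the claim only requires a combination.
    """
    return mean_vector(
        [utterance_representation(u, weights, biases, frame_size) for u in utterances]
    )
```

The reference returned by `enroll` has the same length as a single utterance representation, so the two can be compared directly with a distance measure such as the cosine distance named in claim 3.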
Artificial neural networks; Connectionist approaches · CPC title