What technology area does this patent fall under?

Primary CPC classification G06N3/08. Mapped technology areas include Physics.

When was this patent published?

Publication date Jun 23, 2016 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Sharp discrepancy learning

US2016180214A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2016180214-A1
Application number	US-201414577301-A
Country	US
Kind code	A1
Filing date	Dec 19, 2014
Priority date	Dec 19, 2014
Publication date	Jun 23, 2016
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes training a neural network using sharp discrepancy learning by providing training data to the neural network, calculating a gradient using a sharp discrepancy output layer objective function to classify the neural network parameters for correct and incorrect network model states, and training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases.
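The loop the abstract describes (provide training data, calculate a gradient from an output-layer objective, update the parameters) can be sketched roughly as follows. This is an illustrative sketch only, not the patented method: it uses a plain cross-entropy objective as a stand-in for the sharp discrepancy objective, a single softmax layer with no hidden layers, and hypothetical names (`train`, `predict_proba`, `lr`, `max_steps`, `tol`) that do not appear in the publication.

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(weights, x):
    # One score per class: dot product of that class's weight row with x.
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])

def train(features, labels, num_classes, lr=0.5, max_steps=200, tol=1e-4):
    # Assumption: cross-entropy stands in for the sharp discrepancy objective.
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(max_steps):
        grad = [[0.0] * dim for _ in range(num_classes)]
        for x, y in zip(features, labels):
            probs = predict_proba(weights, x)
            for k in range(num_classes):
                # Cross-entropy gradient w.r.t. class k's score: p_k - 1[k == y].
                err = probs[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    grad[k][j] += err * x[j] / len(features)
        # Use the gradient to update the parameters.
        for k in range(num_classes):
            for j in range(dim):
                weights[k][j] -= lr * grad[k][j]
        # Hypothetical end criterion: stop once the gradient is small.
        if max(abs(g) for row in grad for g in row) < tol:
            break
    return weights

# Toy data: label 1 = "keyword", label 0 = "not a keyword".
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = [1, 1, 0, 0]
toy_weights = train(toy_features, toy_labels, num_classes=2)
```

After training, `predict_proba(toy_weights, x)[1]` gives the model's estimate that a feature vector `x` resembles the keyword class.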

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: providing training data to a neural network that includes an output layer and one or more hidden layers, each of the hidden layers comprising multiple nodes and corresponding parameters; calculating a gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer, wherein the sharp discrepancy output layer objective function is dependent on the training data and parameters; training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases, wherein training the neural network using the gradient comprises using the gradient to update the parameters.
2. The method of claim 1, comprising providing the trained neural network for use in a speech recognition system, wherein the speech recognition system uses sharp discrepancy learning on real data.
3. The method of claim 1, wherein calculating the gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer comprises calculating the gradient of a cross-entropy function.
4. The method of claim 1, wherein the sharp discrepancy output layer objective function comprises a class of sharp discrepancy objective functions with a fraction whose denominator is a product of shifted label scores over a set of labels that correspond to a set of states that are designated as incorrect states.
5. The method of claim 4, wherein the label scores each comprise an exponential of a product of a label, parameter matrix and training data point.
6. The method of claim 4, wherein the class of sharp discrepancy objective functions comprise functions with a fraction whose numerator is a non-negative label score associated with a state that is designated as a correct state.
7. The method of claim 1, wherein calculating the gradient comprises calculating each component of the gradient separately.
8. The method of claim 1, wherein calculating the gradient comprises calculating each component of the gradient in parallel.
9. The method of claim 1, wherein the neural network comprises a deep neural network.
10. The method of claim 1, wherein the neural network comprises a deep belief network.
11. The method of claim 1, wherein the training data comprises a plurality of feature vectors and a plurality of label vectors that each indicate whether the corresponding feature vector corresponds to i) one of the keywords or key phrases, or ii) not.
12. The method of claim 11, wherein each of the plurality of feature vectors represent a different portion of an audio waveform from a received digital representation of speech.
13. The method of claim 12, wherein the digital representation of speech comprises recorded speech data.
14. The method of claim 11, wherein each of the plurality of label vectors corresponds to one of the feature vectors, and specifies a probability distribution for whether the corresponding feature vector corresponds to i) one of the keywords or key phrases, or ii) not.
15. The method of claim 14, wherein the probability distribution comprises a multinomial distribution.
16. The method of claim 1, wherein training the neural network using the gradient comprises iterating the parameter updates until an end criteria is met.
17. The method of claim 1, comprising calculating, using the hidden layers, an exponential of a product of a value of one of the parameters and a point from the training data.
18. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: providing training data to a neural network that includes an output layer and one or more hidden layers, each of the hidden layers comprising multiple nodes and corresponding parameters; calculating a gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer, wherein the sharp discrepancy output layer objective function is dependent on the training data and parameters; training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases, wherein training the neural network using the gradient comprises using the gradient to update the parameters.
19. The system of claim 18, wherein the sharp discrepancy output layer objective function comprises a class of sharp discrepancy objective functions with a fraction whose denominator is a product of shifted label scores over a set of labels that correspond to a set of states that are designated as incorrect states.
20. A computer-readable medium storing software comprising instructions executable by one or more computers which, upon such execution, cause the one or more computers to perform operations comprising: providing training data to a neural network that includes an output layer and one or more hidden layers, each of the hidden layers comprising multiple nodes and corresponding parameters; calculating a gradient for the neural network by applying a sharp discrepancy output layer objective function to the output layer, wherein the sharp discrepancy output layer objective function is dependent on the training data and parameters; training the neural network using the gradient to determine a probability that data received by the neural network has features similar to key features of one or more keywords or key phrases, wherein training the neural network using the gradient comprises using the gradient to update the parameters.
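Claims 4 to 6 describe the objective only in words: a fraction whose numerator is the label score of the correct state and whose denominator is a product of shifted label scores over the incorrect states, where each label score is an exponential of a product of a label, a parameter matrix and a training data point. One possible reading is sketched below. The claims do not say how the scores are shifted or by how much, so the additive `shift` constant, the one-hot label encoding, and the names `label_score` and `sharp_ratio` are all assumptions made for illustration.

```python
import math

def one_hot(index, size):
    # Assumed label encoding: a one-hot vector selecting one state.
    return [1.0 if i == index else 0.0 for i in range(size)]

def label_score(label, weights, x):
    # exp(label^T W x): exponential of a product of a label,
    # a parameter matrix and a training data point (claim 5).
    wx = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return math.exp(sum(l * s for l, s in zip(label, wx)))

def sharp_ratio(weights, x, correct, shift=1.0):
    # Numerator: label score of the state designated as correct (claim 6).
    n = len(weights)
    numerator = label_score(one_hot(correct, n), weights, x)
    # Denominator: product of shifted label scores over the states
    # designated as incorrect (claim 4). The additive shift is an assumption.
    denominator = 1.0
    for k in range(n):
        if k != correct:
            denominator *= shift + label_score(one_hot(k, n), weights, x)
    return numerator / denominator

# With all-zero parameters every label score is exp(0) = 1, so for three
# states the ratio is 1 / ((1 + 1) * (1 + 1)).
zero_weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
baseline = sharp_ratio(zero_weights, [1.0, 2.0], correct=0)
```

Under this reading, raising the correct state's score increases the ratio, while raising any incorrect state's score decreases it.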

Assignees

Google Inc

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/08 · Primary
Learning methods · CPC title
G06N3/0499
Feedforward networks · CPC title
G06N3/09
Supervised learning · CPC title
G06N99/005
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 56129825

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016180214A1 cover?: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network. One of the methods includes training a neural network using sharp discrepancy learning by providing training data to the neural network, calculating a gradient using a sharp discrepancy output layer objective function to classify the neural network parameters for correc…
Who is the assignee on this patent?: Google Inc

Related patents

Incremental learner via an adaptive mixture of weak learners distributed on a non-rigid binary tree

Sectioned memory networks for online word-spotting in continuous speech

Learning front-end speech recognition parameters within neural network training