What technology area does this patent fall under?

Primary CPC classification G10L15/16. Mapped technology areas include Physics.

When was this patent published?

Publication date: Dec 20, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for providing unnormalized language models

US9524716B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9524716-B2
Application number	US-201514689564-A
Country	US
Kind code	B2
Filing date	Apr 17, 2015
Priority date	Apr 17, 2015
Publication date	Dec 20, 2016
Grant date	Dec 20, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Some embodiments relate to using an unnormalized neural network language model in connection with a speech processing application. The techniques include obtaining a language segment sequence comprising one or more language segments in a vocabulary; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary; and determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence.
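
The scoring step described in the abstract can be illustrated with a minimal Python sketch. It assumes log-domain scores, so that the likelihood of one language segment is its output score minus the normalizer node's score. The abstract does not say how the two scores are combined, and the class name, weights, and feature vector below are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


class UnnormalizedLM:
    """Toy stand-in for the model in the abstract (hypothetical, not the patented implementation)."""

    def __init__(self, vocab, out_weights, out_biases, norm_weights, norm_bias):
        self.index = {segment: i for i, segment in enumerate(vocab)}
        self.out_weights = out_weights    # one weight row per output node
        self.out_biases = out_biases
        self.norm_weights = norm_weights  # weights of the single normalizer node
        self.norm_bias = norm_bias

    def log_likelihood(self, features, segment):
        """Score one segment using only its own output node and the normalizer node."""
        i = self.index[segment]
        segment_score = dot(self.out_weights[i], features) + self.out_biases[i]
        norm_score = dot(self.norm_weights, features) + self.norm_bias
        # Assumption: both scores are in the log domain, so they combine by subtraction.
        # No other output node is evaluated, so the cost does not grow with vocabulary size.
        return segment_score - norm_score


# Made-up weights and features, for illustration only.
model = UnnormalizedLM(
    vocab=["a", "b", "c"],
    out_weights=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    out_biases=[0.0, 0.0, 0.0],
    norm_weights=[0.5, 0.5],
    norm_bias=0.1,
)
features = [2.0, 4.0]  # stands in for features derived from the language segment sequence
log_p = model.log_likelihood(features, "b")
likelihood = math.exp(log_p)
```

With these invented weights the result is not a calibrated probability. The point of the sketch is only that scoring "b" never touches the output nodes for "a" or "c".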

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving, at a server, a representation of a voice utterance received by an application program executing on a client device; recognizing, using an automated speech recognition (ASR) engine executing at the server, the voice utterance to obtain a recognition result, the recognizing comprising: obtaining, based on the voice utterance, a language segment sequence comprising one or more language segments in a vocabulary of language segments; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary, wherein the plurality of output nodes includes a first output node associated with the first language segment in the vocabulary; determining the recognition result at least in part by determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence, wherein determining the first likelihood comprises: determining, based at least in part on features derived from the language segment sequence, an output score for the first output node; determining, based at least in part on the features, an output score for the normalizer node; and determining the first likelihood based on the output score for the first output node and the output score for the normalizer node, wherein determining the first likelihood that the first language segment in the vocabulary follows the language segment sequence is performed independently of output scores of any output nodes, other than the first output node, in the plurality of output nodes; and providing, by the server, the recognition result to the application program executing on the client device.

2. The method of claim 1, wherein the output score for the normalizer node is an estimate of a sum of output scores of output nodes in the plurality of output nodes.

3. The method of claim 1, wherein the normalizer node is associated with at least one node in at least one hidden layer of the unnormalized neural network language model, and wherein the normalizer node is not in the plurality of output nodes.

4. The method of claim 1, further comprising using the first likelihood in performing a language processing task without normalizing the first likelihood relative to likelihoods that other language segments in the vocabulary follow the language segment sequence.

5. The method of claim 1, wherein the unnormalized neural network function is trained by using an objective function comprising an unnormalized likelihood term and a generalized minimum KL divergence penalty term or a variance regularization penalty term.

6. A system, comprising: at least one non-transitory computer-readable storage medium storing thereon an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in a vocabulary of language segments, wherein the plurality of output nodes includes a first output node associated with the first language segment in the vocabulary; and at least one server configured to perform a method comprising: receiving a representation of a voice utterance received by an application program executing on a client device; recognizing, using an automated speech recognition (ASR) engine, the voice utterance to obtain a recognition result, the recognizing comprising: obtaining, based on the voice utterance, a language segment sequence comprising one or more language segments in a vocabulary of language segments; accessing the unnormalized neural network language model stored on the at least one non-transitory computer-readable storage medium; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary; determining the recognition result at least in part by determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence, wherein determining the first likelihood comprises: determining, based at least in part on features derived from the language segment sequence, an output score for the first output node; determining, based at least in part on the features, an output score for the normalizer node; and determining the first likelihood based on the output score for the first output node and the output score for the normalizer node, wherein determining the first likelihood that the first language segment in the vocabulary follows the language segment sequence is performed independently of output scores of any output nodes, other than the first output node, in the plurality of output nodes; and providing the recognition result to the application program executing on the client device.

7. The system of claim 6, wherein the output score for the normalizer node is an estimate of a sum of output scores of output nodes in the plurality of output nodes.

8. The system of claim 6, wherein the normalizer node is associated with at least one node in at least one hidden layer of the unnormalized neural network language model, and wherein the normalizer node is not in the plurality of output nodes.

9. The system of claim 6, further comprising using the first likelihood in performing a language processing task without normalizing the first likelihood relative to likelihoods that other language segments in the vocabulary follow the language segment sequence.

10. The system of claim 6, wherein the unnormalized neural network function is trained by using an objective function comprising an unnormalized likelihood term and a generalized minimum KL divergence penalty term or a variance regularization penalty term.

11. At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one server comprising a computer hardware processor, cause the at least one server to perform a method comprising: receiving, at the server, a representation of a voice utterance received by an application program executing on a client device; recognizing, using an automated speech recognition (ASR) engine executing at the server, the voice utterance to obtain a recognition result, the recognizing comprising: obtaining, based on the voice utterance, a language segment sequence comprising one or more language segments in a vocabulary of language segments; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each of the plurality of output nodes associated with a respective language segment in the vocabulary, wherein the plurality of output nodes includes a first output node associated with the first language segment in the vocabulary; determining the recognition result at least in part by determining, using the unnormalized neural network language model, a first likelihood that a first language segment in the vocabulary follows the language segment sequence, wherein determining the first likelihood comprises: determining, based at least in part on features derived from the language segment sequence, an output score for the first output node; determining, based at least in part on the features, an output score for the normalizer node; and determining the first likelihood based on t
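
Claims 2 and 5 say that the normalizer node's score estimates a sum of output scores and that training uses an unnormalized likelihood term plus a penalty term. This page does not give the formulas, so the Python sketch below is only one plausible reading. It assumes log-domain scores, treats the exact quantity being estimated as a log-sum-exp over all output scores, and uses a generic variance-of-the-gap penalty. The function names, the `gamma` weight, and the toy batches are all hypothetical, and the generalized minimum KL divergence alternative is not shown.

```python
import math


def log_sum_exp(scores):
    """Exact log of the summed exponentiated scores (needs every output node)."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


def variance(values):
    """Population variance of a list of floats."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def penalized_objective(batch, gamma=0.1):
    """Mean unnormalized log-likelihood minus a variance-style penalty.

    batch: list of (all_output_scores, target_index, normalizer_score).
    Assumption: this penalty is an illustrative choice, not the patent's formula.
    """
    # Unnormalized likelihood term: target node score minus normalizer node score.
    unnorm_ll = [scores[t] - norm for scores, t, norm in batch]
    # Gap between the exact log-normalizer and the normalizer node's estimate.
    gaps = [log_sum_exp(scores) - norm for scores, t, norm in batch]
    return sum(unnorm_ll) / len(batch) - gamma * variance(gaps)


exact = math.log(2.0)  # log-sum-exp of two zero scores
# Normalizer node matches the exact value on every example: no penalty.
calibrated = [([0.0, 0.0], 0, exact), ([0.0, 0.0], 1, exact)]
# Normalizer node is off by +1 and -1: the gap varies, so the penalty applies.
skewed = [([0.0, 0.0], 0, exact + 1.0), ([0.0, 0.0], 1, exact - 1.0)]

obj_calibrated = penalized_objective(calibrated)
obj_skewed = penalized_objective(skewed)
```

In this sketch the full sum over the vocabulary appears only in the training objective. At recognition time, per claim 1, the likelihood is computed from the first output node and the normalizer node alone.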

Assignees

Nuance Communications Inc

Inventors

Classifications

G10L15/16 (Primary)
using artificial neural networks · CPC title
G10L15/063
Training · CPC title
G10L15/197
Probabilistic grammars, e.g. word n-grams · CPC title

Patent family

Related publications grouped by family.

View patent family 57128532

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9524716B2 cover?: Some embodiments relate to using an unnormalized neural network language model in connection with a speech processing application. The techniques include obtaining a language segment sequence comprising one or more language segments in a vocabulary; accessing an unnormalized neural network language model having a normalizer node and an output layer comprising a plurality of output nodes, each o…
Who is the assignee on this patent?: Nuance Communications Inc