Who is the assignee on this patent?

Mitsubishi Electric Res Laboratories Inc

What technology area does this patent fall under?

Primary CPC classification G10L15/063. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 8, 2019 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for training language models to reduce recognition errors

Patent metadata
Field	Value
Publication number	US-10176799-B2
Application number	US-201615013239-A
Country	US
Kind code	B2
Filing date	Feb 2, 2016
Priority date	Feb 2, 2016
Publication date	Jan 8, 2019
Grant date	Jan 8, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and system for training a language model to reduce recognition errors, wherein the language model is a recurrent neural network language model (RNNLM), by first acquiring training samples. An automatic speech recognition system (ASR) is applied to the training samples to produce recognized words and probabilities of the recognized words, and an N-best list is selected from the recognized words based on the probabilities. Word errors are determined using reference data for hypotheses in the N-best list. The hypotheses are rescored using the RNNLM. Then, we determine gradients for the hypotheses using the word errors, and gradients for words in the hypotheses. Lastly, parameters of the RNNLM are updated using a sum of the gradients.
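The training loop in the abstract can be illustrated with a small, hedged sketch. It assumes the hypothesis posterior is a softmax over combined log scores and that the word error is a word-level edit distance; under those assumptions the gradient of the expected error with respect to the score of hypothesis n is P_n × (E_n − expected error). The function names, the toy N-best list, and the scores are hypothetical and are not taken from the patent.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def softmax(scores):
    """Numerically stable softmax over a list of log scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_word_error(reference, nbest, log_scores):
    """Return the expected word error over the N-best list, the gradient of
    that expectation with respect to each hypothesis score, and the raw
    per-hypothesis errors (assumed softmax posterior, see lead-in)."""
    errors = [edit_distance(reference, hyp) for hyp in nbest]
    posteriors = softmax(log_scores)
    expected = sum(e * p for e, p in zip(errors, posteriors))
    grads = [p * (e - expected) for e, p in zip(errors, posteriors)]
    return expected, grads, errors


# Toy data (hypothetical): one reference and three rescored hypotheses.
reference = "the cat sat".split()
nbest = ["the cat sat".split(),
         "the cat sad".split(),
         "a cat sat down".split()]
log_scores = [-1.0, -1.2, -3.0]

expected, grads, errors = expected_word_error(reference, nbest, log_scores)
print(errors, expected, grads)
```

In a full system the per-hypothesis gradients would be propagated further to the word-level outputs of the RNNLM and summed before the parameter update; this sketch stops at the hypothesis scores.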

First claim

Opening claim text (preview).

We claim:

1. A method for speech recognition to reduce recognition errors using a language model, wherein the language model is a recurrent neural network language model (RNNLM) that is in communication with a Long Short-Term Memory (LSTM), comprising the steps of: acquiring training samples during a training stage for training the RNNLM to perform applying an automatic speech recognition system (ASR) to the training samples to produce recognized words and probabilities of the recognized words; selecting an N-best list from the recognized words based on the probabilities; determining word errors using reference data for hypotheses in the N-best list; rescoring the hypotheses using the RNNLM in communication with the LSTM; determining gradients for the hypotheses using the word errors, wherein the determined gradients for the hypotheses correspond to differences with respect to the N-best hypothesis scores; determining gradients for recognized words in the hypotheses; back-propagating the gradients; updating parameters of the RNNLM using a sum of the gradients as an error signal for the RNNLM, so as to reduce the recognition errors of the ASR; acquiring spoken utterances as an input to the RNNLM to produce the recognized words; producing the N-best list from the recognized words; and applying the RNNLM to the N-best list to obtain recognition results, wherein the steps are performed in a processor.

2. The method of claim 1, wherein a stochastic gradient descent method is applied on an utterance-by-utterance basis so that the gradients are accumulated over the N-best list.

3. The method of claim 1, wherein an output vector $y_t \in [0,1]^{|V|+|C|}$, where $|C|$ is a number of classes, includes word (w) and class (c) outputs $y_t = \begin{bmatrix} y_t^{(w)} \\ y_t^{(c)} \end{bmatrix}$, obtained as $y_{t,m}^{(w)} = \zeta(W_{ho,m}^{(w)} h_t)$ and $y_t^{(c)} = \zeta(W_{ho}^{(c)} h_t)$, where $y_{t,m}^{(w)}$ and $W_{ho,m}^{(w)}$ are a sub-vector of $y_t^{(w)}$ and a sub-matrix of $W_{ho}$ corresponding to the words in an m-th class, respectively, and $W_{ho}^{(c)}$ is a sub-matrix of $W_{ho}$ for the class output, where $W_{ho}$ is a matrix placed between a hidden layer and the output layer of the RNNLM, $h_t$ is a D-dimensional activation vector $h_t \in [0,1]^D$ in a hidden layer, and $\zeta(\cdot)$ denotes a softmax function that determines a softmax for elements of the vectors.

4. The method of claim 3, wherein a word occurrence probability is $P(w_t \mid h_t) \equiv y_{t,C(w_t)}^{(w)}[w_t] \times y_t^{(c)}[C(w_t)]$, where $C(w)$ denotes an index of the class to which the word $w$ belongs.

5. The method of claim 4, wherein a loss function of minimum word error training is $L(\Lambda) = \sum_{k=1}^{K} \sum_{W \in V^*} E(W_k^{(R)}, W) \, P_\Lambda(W \mid O_k)$, where $\Lambda$ is a set of model parameters, $K$ is the number of utterances in training data, $O_k$ is a k-th acoustic observation sequence, $W_k^{(R)} = \{w_{k,1}^{(R)}, \ldots, w_{k,T_k}^{(R)}\}$ is a k-th reference word sequence, $E(W', W)$ represents an edit distance between two word sequences $W'$ and $W$, and $P_\Lambda(W \mid O)$ is a posterior probability of $W$ determined with the set of model parameters $\Lambda$.

6. The method of claim 5, further comprising: obtaining the N-best lists and obtaining a loss function $L(\Lambda) = \sum_{k=1}^{K} \sum_{n=1}^{N} E(W_k$ …
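The class-factored output layer of claims 3 and 4 can be sketched as follows: one softmax over classes, one softmax over the words inside the target word's class, and a product of the two entries. This is a minimal illustration in plain Python. The vocabulary, class assignment, weight values, and the names `word_probability`, `words_in_class`, `W_word`, and `W_class` are all hypothetical, chosen only to mirror the symbols in the claims.

```python
import math


def softmax(values):
    """Numerically stable softmax (the zeta function in claim 3)."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def word_probability(word, h_t, class_of, words_in_class, W_word, W_class):
    """P(w_t | h_t) = y_word[class of w][w] * y_class[class of w] (claim 4)."""
    c = class_of[word]
    y_class = softmax(matvec(W_class, h_t))        # class output y_t^(c)
    members = words_in_class[c]
    sub_matrix = [W_word[w] for w in members]      # rows for the m-th class
    y_word_m = softmax(matvec(sub_matrix, h_t))    # word output y_{t,m}^(w)
    return y_word_m[members.index(word)] * y_class[c]


# Toy setup (hypothetical): 5 words in 2 classes, hidden size D = 3.
words_in_class = [["cat", "dog"], ["sat", "ran", "slept"]]
class_of = {w: m for m, ws in enumerate(words_in_class) for w in ws}
W_word = {
    "cat": [0.5, -0.2, 0.1],
    "dog": [0.1, 0.3, -0.4],
    "sat": [-0.3, 0.8, 0.2],
    "ran": [0.0, 0.1, 0.7],
    "slept": [0.4, -0.5, 0.3],
}
W_class = [[0.2, 0.4, -0.3], [-0.1, 0.3, 0.6]]
h_t = [0.2, 0.9, 0.5]  # hidden activation with entries in [0, 1]

probs = {w: word_probability(w, h_t, class_of, words_in_class, W_word, W_class)
         for w in class_of}
print(probs)
```

Because each word belongs to exactly one class, the factored probabilities sum to one over the vocabulary while each evaluation only normalizes over the classes and the words in a single class rather than the whole vocabulary.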

Assignees

Mitsubishi Electric Res Laboratories Inc

Classifications

Code	CPC title
G06N3/045	Combinations of networks
G06N7/01	Probabilistic graphical models, e.g. probabilistic networks
G06N3/044	Recurrent networks, e.g. Hopfield networks
G06N3/0442	characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
G10L15/08	Speech classification or search

Patent family

Related publications grouped by family.

View patent family 58347854
