Method and system for role dependent context sensitive spoken and textual language understanding with neural networks

US9842106B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9842106-B2
Application number  US-201514959132-A
Country             US
Kind code           B2
Filing date         Dec 4, 2015
Priority date       Dec 4, 2015
Publication date    Dec 12, 2017
Grant date          Dec 12, 2017

Abstract

Official abstract text for this publication.

A method and system processes utterances that are acquired either from an automatic speech recognition (ASR) system or text. The utterances have associated identities of each party, such as role A utterances and role B utterances. The information corresponding to utterances, such as word sequence and identity, are converted to features. Each feature is received in an input layer of a neural network (NN). A dimensionality of each feature is reduced, in a projection layer of the NN, to produce a reduced dimensional feature. The reduced dimensional feature is processed to provide probabilities of labels for the utterances.
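
The data flow in the abstract (feature, input layer, projection layer, label probabilities) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy vocabulary, the role and label names, the layer sizes, the random untrained weights, and the use of a single tanh hidden layer with a softmax output. The patent text above does not specify any of these.

```python
import math
import random

random.seed(0)

# Hypothetical toy vocabulary, roles, and labels (not taken from the patent).
VOCAB = ["hello", "book", "room", "thanks"]
ROLES = ["A", "B"]
LABELS = ["greeting", "request", "closing"]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def to_feature(word, role):
    """Convert a word and its speaker role into one sparse input feature."""
    return one_hot(VOCAB.index(word), len(VOCAB)) + one_hot(ROLES.index(role), len(ROLES))

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

INPUT_DIM = len(VOCAB) + len(ROLES)  # word one-hot plus role one-hot
PROJ_DIM = 3                         # assumed reduced dimensionality
HIDDEN_DIM = 4                       # assumed hidden size

# Random, untrained weights; a real system would learn these.
W_proj = rand_matrix(PROJ_DIM, INPUT_DIM)
W_hid = rand_matrix(HIDDEN_DIM, PROJ_DIM)
W_out = rand_matrix(len(LABELS), HIDDEN_DIM)

def label_probabilities(word, role):
    feature = to_feature(word, role)       # input layer
    reduced = matvec(W_proj, feature)      # projection layer: 6 dims -> 3 dims
    hidden = [math.tanh(h) for h in matvec(W_hid, reduced)]
    return softmax(matvec(W_out, hidden))  # one probability per label

probs = label_probabilities("book", "A")
```

Because the weights are random, the values in `probs` carry no meaning; the sketch only shows how the dimensionality shrinks in the projection step and how the output forms a probability distribution over labels.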

First claim

Opening claim text (preview).

We claim:

1. A method for processing utterances, comprising steps: acquiring utterances from multiple parties as word sequences, wherein each of the utterances has an associated identity of each party; converting the word sequences and identities to features; receiving, in an input layer of a neural network (NN), each of the features; reducing, in a projection layer of the NN, a dimensionality of each of the features to produce a reduced dimensional feature; processing, the reduced dimensional feature to propagate through hidden layers of the NN, wherein gates control outputs from the hidden layers based on the features of each party, wherein the hidden layers are party-dependent hidden layers or role-dependent hidden layers; determining, in an output layer of the NN, posterior probabilities of labels for the utterances based on the controlled outputs from the hidden layers; estimating intentions of the utterances based on the posterior probabilities; and outputting the estimated intentions of the utterances, wherein the steps are performed in a processor.

2. The method of claim 1, wherein the NN is a recurrent NN (RNN), or a RNN with a long short-term memory (LSTM) in hidden layers of the NN with recurrent connections in the hidden layers.

3. The method of claim 2, wherein the LSTM includes recurrent connections with party gates that retain and forget context information.

4. The method of claim 3, wherein the party gates control which one of the multiple parties is active.

5. The method of claim 3, wherein the LSTM includes cells remember a value for an arbitrary length of time using the party gates.

6. The method of claim 1, wherein the utterances are spoken, and further comprising: converting the utterances to the word sequences in an automatic speech recognition system (ASR).

7. The method of claim 1, wherein the utterances are text to form the word sequences.

8. The method of claim 1, wherein the NN is party-dependent neural networks with a shared context history of an entire dialog among the multiple parties.

9. The method of claim 1, wherein the utterances form a dialog, and a context of the dialog is considered, and the probabilities of the labels are inferred using sentence-level intentions and the context of the dialog.

10. The method of claim 1, wherein words in the word sequences and features are processed sequentially, and the features include semantic, syntactic, and task-oriented attributes.

11. The method of claim 1, wherein the features are propagated through party-dependent hidden layers, and semantic information including concept tags are output at an end of each utterance, and wherein the concept tags only represent symbols, and the semantic information includes symbols and structured information.

12. The method of claim 1, wherein the utterances are characterized in terms of roles of the multiple parties.

13. A method for processing utterances, comprising steps: acquiring utterances from multiple parties as word sequences, wherein each of the utterances has an associated identity of each party; converting the word sequences and identities to features; receiving, in an input layer of a neural network (NN), each of the features, wherein the NN is party-dependent neural networks with a shared context history of an entire dialog among the multiple parties; reducing, in a projection layer of the NN, a dimensionality of each of the features to produce a reduced dimensional feature; processing, the reduced dimensional feature to propagate through hidden layers of the NN, wherein gates control outputs from the hidden layers based on the features of each party, wherein the hidden layers are party-dependent hidden layers or role-dependent hidden layers; determining, in an output layer of the NN, posterior probabilities of labels for the utterances based on the controlled outputs from the hidden layers; estimating intentions of the utterances based on the posterior probabilities; and outputting the estimated intentions of the utterances, wherein the steps are performed in a processor.

14. The method of claim 13, wherein the NN is a recurrent NN (RNN), or a RNN with a long short-term memory (LSTM) in hidden layers of the NN with recurrent connections in the hidden layers.

15. The method of claim 13, wherein the utterances are spoken, and further comprising: converting the utterances to the word sequences in an automatic speech recognition system (ASR).

16. The method of claim 13, wherein the utterances are text to form the word sequences.

17. The method of claim 13, wherein the utterances form a dialog, and a context of the dialog is considered, and the probabilities of the labels are inferred using sentence-level intentions and the context of the dialog.

18. The method of claim 13, wherein words in the word sequences and features are processed sequentially, and the features include semantic, syntactic, and task-oriented attributes.

19. The method of claim 13, wherein the features are propagated through party-dependent hidden layers, and semantic information including concept tags are output at an end of each utterance, and wherein the concept tags only represent symbols, and the semantic information includes symbols and structured information.

20. The method of claim 13, wherein the utterances are characterized in terms of roles of the multiple parties.
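
The role-dependent hidden layers, the gates, and the shared dialog context recited in the claims can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the patented implementation: it uses a plain recurrent layer rather than an LSTM, a hard 0/1 gate that passes only the active speaker's layer, random untrained weights, and hypothetical role names, label names, and feature values. Word features are assumed to be already projected to a reduced dimension.

```python
import math
import random

random.seed(1)

ROLES = ["agent", "client"]           # hypothetical role names
LABELS = ["greet", "ask", "confirm"]  # hypothetical intention labels
PROJ_DIM, HIDDEN_DIM = 3, 4           # assumed sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# One set of hidden-layer weights per role; the output layer is shared.
W_in = {r: rand_matrix(HIDDEN_DIM, PROJ_DIM) for r in ROLES}
W_rec = {r: rand_matrix(HIDDEN_DIM, HIDDEN_DIM) for r in ROLES}
W_out = rand_matrix(len(LABELS), HIDDEN_DIM)

def step(role, reduced_feature, context):
    """Advance the shared context by one word using role-gated hidden layers."""
    new_context = [0.0] * HIDDEN_DIM
    for r in ROLES:
        gate = 1.0 if r == role else 0.0  # only the active speaker's layer passes
        pre = [a + b for a, b in zip(matvec(W_in[r], reduced_feature),
                                     matvec(W_rec[r], context))]
        for i, p in enumerate(pre):
            new_context[i] += gate * math.tanh(p)
    return new_context

def estimate_intentions(dialog):
    """dialog: list of (role, [reduced feature per word]); one label per utterance."""
    context = [0.0] * HIDDEN_DIM  # shared across the whole dialog, never reset
    results = []
    for role, features in dialog:
        for feature in features:
            context = step(role, feature, context)
        posteriors = softmax(matvec(W_out, context))  # output at end of utterance
        best = max(range(len(LABELS)), key=lambda i: posteriors[i])
        results.append((role, LABELS[best], posteriors))
    return results

# Toy dialog: two agent words, then one client word (made-up feature values).
dialog = [
    ("agent", [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]),
    ("client", [[-0.3, 0.5, 0.1]]),
]
results = estimate_intentions(dialog)
```

The design point to notice is that `context` is carried from one speaker's utterance into the next, so the client's label depends on what the agent said, while the weights applied at each step depend on who is speaking.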

Assignees

Mitsubishi Electric Res Laboratories Inc

Classifications

  • Combinations of networks · CPC title

  • G10L15/16 · Primary

    using artificial neural networks · CPC title

  • Backpropagation, e.g. using gradient descent · CPC title

  • G06F40/35 · Primary

    Discourse or dialogue representation · CPC title

  • using context dependencies, e.g. language models · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9842106B2 cover?
A method and system processes utterances that are acquired either from an automatic speech recognition (ASR) system or text. The utterances have associated identities of each party, such as role A utterances and role B utterances. The information corresponding to utterances, such as word sequence and identity, are converted to features. Each feature is received in an input layer of a neural net…
Who is the assignee on this patent?
Mitsubishi Electric Res Laboratories Inc
What technology area does this patent fall under?
Primary CPC classification G10L15/16. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 12, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).