Learning Student DNN Via Output Distribution
US-2016078339-A1 · Mar 17, 2016 · US
US9842106B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9842106-B2 |
| Application number | US-201514959132-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 4, 2015 |
| Priority date | Dec 4, 2015 |
| Publication date | Dec 12, 2017 |
| Grant date | Dec 12, 2017 |
Official abstract text for this publication.
A method and system processes utterances that are acquired either from an automatic speech recognition (ASR) system or text. The utterances have associated identities of each party, such as role A utterances and role B utterances. The information corresponding to utterances, such as word sequence and identity, are converted to features. Each feature is received in an input layer of a neural network (NN). A dimensionality of each feature is reduced, in a projection layer of the NN, to produce a reduced dimensional feature. The reduced dimensional feature is processed to provide probabilities of labels for the utterances.
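The pipeline in the abstract (features in, projection to a lower dimension, label probabilities out) can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the vocabulary, role names, label set, layer sizes, one-hot encoding, and random untrained weights are all assumptions made for the example.

```python
import math
import random

# Hypothetical toy vocabulary, roles, and labels (not from the patent).
VOCAB = ["hello", "i", "need", "a", "room", "sure", "which", "date"]
ROLES = ["A", "B"]
LABELS = ["greeting", "request", "question"]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def to_feature(word, role):
    # Convert a word and the speaker identity into one feature vector
    # by concatenating two one-hot encodings (an assumed encoding).
    return one_hot(VOCAB.index(word), len(VOCAB)) + one_hot(ROLES.index(role), len(ROLES))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
INPUT_DIM = len(VOCAB) + len(ROLES)  # size of the input-layer feature
PROJ_DIM = 4                         # reduced dimensionality (assumed)
W_proj = random_matrix(PROJ_DIM, INPUT_DIM, rng)    # projection layer
W_out = random_matrix(len(LABELS), PROJ_DIM, rng)   # output layer


def project(feature):
    # Projection layer: linear map from INPUT_DIM down to PROJ_DIM.
    return matvec(W_proj, feature)


def label_probabilities(word, role):
    reduced = project(to_feature(word, role))
    return softmax(matvec(W_out, reduced))


probs = label_probabilities("room", "A")
for label, p in zip(LABELS, probs):
    print(f"{label}: {p:.3f}")
```

Because the weights are random, the printed probabilities carry no meaning; the sketch only shows the data flow and the change in dimensionality between layers.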
Opening claim text (preview).
We claim:
1. A method for processing utterances, comprising steps: acquiring utterances from multiple parties as word sequences, wherein each of the utterances has an associated identity of each party; converting the word sequences and identities to features; receiving, in an input layer of a neural network (NN), each of the features; reducing, in a projection layer of the NN, a dimensionality of each of the features to produce a reduced dimensional feature; processing, the reduced dimensional feature to propagate through hidden layers of the NN, wherein gates control outputs from the hidden layers based on the features of each party, wherein the hidden layers are party-dependent hidden layers or role-dependent hidden layers; determining, in an output layer of the NN, posterior probabilities of labels for the utterances based on the controlled outputs from the hidden layers; estimating intentions of the utterances based on the posterior probabilities; and outputting the estimated intentions of the utterances, wherein the steps are performed in a processor.
2. The method of claim 1, wherein the NN is a recurrent NN (RNN), or a RNN with a long short-term memory (LSTM) in hidden layers of the NN with recurrent connections in the hidden layers.
3. The method of claim 2, wherein the LSTM includes recurrent connections with party gates that retain and forget context information.
4. The method of claim 3, wherein the party gates control which one of the multiple parties is active.
5. The method of claim 3, wherein the LSTM includes cells remember a value for an arbitrary length of time using the party gates.
6. The method of claim 1, wherein the utterances are spoken, and further comprising: converting the utterances to the word sequences in an automatic speech recognition system (ASR).
7. The method of claim 1, wherein the utterances are text to form the word sequences.
8. The method of claim 1, wherein the NN is party-dependent neural networks with a shared context history of an entire dialog among the multiple parties.
9. The method of claim 1, wherein the utterances form a dialog, and a context of the dialog is considered, and the probabilities of the labels are inferred using sentence-level intentions and the context of the dialog.
10. The method of claim 1, wherein words in the word sequences and features are processed sequentially, and the features include semantic, syntactic, and task-oriented attributes.
11. The method of claim 1, wherein the features are propagated through party-dependent hidden layers, and semantic information including concept tags are output at an end of each utterance, and wherein the concept tags only represent symbols, and the semantic information includes symbols and structured information.
12. The method of claim 1, wherein the utterances are characterized in terms of roles of the multiple parties.
13. A method for processing utterances, comprising steps: acquiring utterances from multiple parties as word sequences, wherein each of the utterances has an associated identity of each party; converting the word sequences and identities to features; receiving, in an input layer of a neural network (NN), each of the features, wherein the NN is party-dependent neural networks with a shared context history of an entire dialog among the multiple parties; reducing, in a projection layer of the NN, a dimensionality of each of the features to produce a reduced dimensional feature; processing, the reduced dimensional feature to propagate through hidden layers of the NN, wherein gates control outputs from the hidden layers based on the features of each party, wherein the hidden layers are party-dependent hidden layers or role-dependent hidden layers; determining, in an output layer of the NN, posterior probabilities of labels for the utterances based on the controlled outputs from the hidden layers; estimating intentions of the utterances based on the posterior probabilities; and outputting the estimated intentions of the utterances, wherein the steps are performed in a processor.
14. The method of claim 13, wherein the NN is a recurrent NN (RNN), or a RNN with a long short-term memory (LSTM) in hidden layers of the NN with recurrent connections in the hidden layers.
15. The method of claim 13, wherein the utterances are spoken, and further comprising: converting the utterances to the word sequences in an automatic speech recognition system (ASR).
16. The method of claim 13, wherein the utterances are text to form the word sequences.
17. The method of claim 13, wherein the utterances form a dialog, and a context of the dialog is considered, and the probabilities of the labels are inferred using sentence-level intentions and the context of the dialog.
18. The method of claim 13, wherein words in the word sequences and features are processed sequentially, and the features include semantic, syntactic, and task-oriented attributes.
19. The method of claim 13, wherein the features are propagated through party-dependent hidden layers, and semantic information including concept tags are output at an end of each utterance, and wherein the concept tags only represent symbols, and the semantic information includes symbols and structured information.
20. The method of claim 13, wherein the utterances are characterized in terms of roles of the multiple parties.
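The role-dependent hidden layers with a shared context history described in the independent claims can be illustrated with a small sketch. It is a loose illustration under stated assumptions, not the claimed method: it uses a plain recurrent layer rather than an LSTM, a hard 0/1 gate driven by the speaker role, random untrained weights, and hypothetical role names, labels, and input vectors standing in for already-projected features.

```python
import math
import random

ROLES = ["agent", "client"]                    # hypothetical party roles
LABELS = ["greeting", "request", "confirm"]    # hypothetical intention labels


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class RoleGatedRNN:
    """Plain RNN with one weight set per role and one shared hidden state."""

    def __init__(self, input_dim, hidden_dim, n_labels, roles, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.roles = roles
        self.W_in = {r: mat(hidden_dim, input_dim) for r in roles}
        self.W_rec = {r: mat(hidden_dim, hidden_dim) for r in roles}
        self.W_out = mat(n_labels, hidden_dim)
        # The hidden state is shared across roles, so context from one
        # party's utterance carries over into the next party's utterance.
        self.hidden = [0.0] * hidden_dim

    def step(self, x, role):
        new_hidden = [0.0] * len(self.hidden)
        for r in self.roles:
            gate = 1.0 if r == role else 0.0  # assumed hard gate on the active role
            pre = [a + b for a, b in zip(matvec(self.W_in[r], x),
                                         matvec(self.W_rec[r], self.hidden))]
            candidate = [math.tanh(p) for p in pre]
            new_hidden = [h + gate * c for h, c in zip(new_hidden, candidate)]
        self.hidden = new_hidden

    def posteriors(self):
        # Output layer: posterior probabilities of labels from the gated state.
        return softmax(matvec(self.W_out, self.hidden))


# A two-utterance dialog; each word is a made-up 4-dimensional feature.
dialog = [
    ("agent", [[1, 0, 0, 0], [0, 1, 0, 0]]),
    ("client", [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]),
]

model = RoleGatedRNN(input_dim=4, hidden_dim=3, n_labels=len(LABELS), roles=ROLES)
results = []
for role, word_vectors in dialog:
    for x in word_vectors:
        model.step(x, role)
    probs = model.posteriors()  # read out at the end of each utterance
    results.append((role, LABELS[probs.index(max(probs))], probs))

for role, intention, _ in results:
    print(role, "->", intention)
```

With untrained weights the printed intentions are arbitrary. The point of the sketch is the control flow: the gate selects which role's weights update the state, while the single hidden vector keeps the history of the whole dialog.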