Goal segmentation in speech dialogs
US-10236017-B1 · Mar 19, 2019 · US
US11017767B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11017767-B2 |
| Application number | US-201715473409-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 29, 2017 |
| Priority date | Mar 29, 2016 |
| Publication date | May 25, 2021 |
| Grant date | May 25, 2021 |
Official abstract text for this publication.
Described herein are systems and methods for providing hierarchical state tracking in a spoken dialogue system. A sequence of turns is received by a spoken dialogue system. Each turn includes a user utterance and a machine act. At each turn, a value pointer and a turn pointer are provided for that turn. The value pointer represents a probability distribution over the one or more words in the user utterance that indicates whether each word in the user utterance is a slot value for a slot. The turn pointer identifies which turn in a set of turns includes a currently-relevant slot value for the slot, where the set of turns includes a current turn for which the turn pointer is being provided, and all turns that precede the current turn.
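The two pointers described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the raw scores below are hand-written placeholders standing in for whatever the hierarchical pointer network would output, and the names `softmax`, `value_pointer`, `turn_pointer`, and `resolve_slot` are hypothetical. Modelling the turn pointer as the most probable entry of a distribution over turns is also an assumption; the abstract only says that it identifies a turn.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into a probability distribution."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def value_pointer(word_scores: Sequence[float]) -> List[float]:
    """Distribution over the words of one utterance (assumed softmax)."""
    return softmax(word_scores)


def turn_pointer(turn_scores: Sequence[float]) -> int:
    """Index of the turn judged to hold the currently relevant slot value.

    The scores cover the current turn and every turn that precedes it.
    """
    dist = softmax(turn_scores)
    return max(range(len(dist)), key=dist.__getitem__)


def resolve_slot(
    turns: Sequence[Sequence[str]],
    per_turn_word_scores: Sequence[Sequence[float]],
    turn_scores: Sequence[float],
) -> Tuple[int, str]:
    """Follow the turn pointer, then the value pointer inside that turn."""
    t = turn_pointer(turn_scores)
    dist = value_pointer(per_turn_word_scores[t])
    w = max(range(len(dist)), key=dist.__getitem__)
    return t, turns[t][w]


# Placeholder dialogue: the "food" slot is set in turn 0 and revised in turn 1.
turns = [
    ["i", "want", "italian", "food"],
    ["actually", "make", "it", "thai"],
    ["somewhere", "cheap", "please"],
]
word_scores = [[0.0, 0.0, 3.0, 0.0], [0.0, 0.0, 0.0, 4.0], [0.0, 0.0, 0.0]]
turn_scores = [0.5, 2.0, 0.1]  # hypothetical scores, as seen from turn 2
food_turn, food_value = resolve_slot(turns, word_scores, turn_scores)
```

In this sketch the current turn (index 2) says nothing about food, so the turn pointer selects an earlier turn and the value pointer picks the word inside it.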
Opening claim text (preview).
The invention claimed is:
1. A computer-implemented method of state tracking in a spoken dialogue system, the method comprising: receiving a sequence of turns, each turn comprising: a numerical identifier; a user utterance comprising one or more words received at the spoken dialogue system; and a machine act comprising one or more words produced by the spoken dialogue system; providing, by the spoken dialogue system using a hierarchical pointer network that relates a slot value in one turn to a slot value in another turn, a first value pointer, wherein the first value pointer indicates a first slot value for a first slot based on a first user utterance in a first turn of the received sequence of turns; providing, by the spoken dialogue system, a first turn pointer for the first turn of the received sequence of turns, wherein the first turn is a current turn and includes a first numerical identifier, wherein the first turn pointer includes a second numerical identifier of a second turn in the sequence of turns, wherein the second turn is a prior turn of the sequence of turns, wherein the second turn is distinct from the first turn and includes at least one of a second user utterance or a second machine act having a second slot value for a second slot, and wherein the second slot value matches the first slot value; determining a first dialogue state for the first turn based at least on a combination of the first value pointer and the first turn pointer, wherein the first dialogue state is determined based on a predicted context output of the hierarchical pointer network; and determining a first machine act of the first turn to be performed by the spoken dialogue system based on the determined first dialogue state.
2. The computer-implemented method of claim 1, wherein the first value pointer comprises a probability distribution over the one or more words in the first user utterance or over the one or more words in a knowledge database.
3. The computer-implemented method of claim 2, wherein providing the first value pointer comprises producing the probability distribution over the one or more words in the first user utterance to indicate whether each word in the first user utterance is the first slot value for the first slot.
4. The computer-implemented method of claim 3, wherein the probability distribution is a first probability distribution and the operation of providing the first value pointer further comprises: determining a second probability distribution that a user affirmed the first slot value mentioned in the first machine act of the first turn; and when the user affirmed the first slot value, causing the first value pointer to point to the word cited in the first machine act.
5. The computer-implemented method of claim 2, wherein providing the first value pointer comprises producing the probability distribution by comparing the one or more words in the first user utterance to corresponding one or more words in the knowledge database.
6. The computer-implemented method of claim 5, wherein the probability distribution is a first probability distribution, and wherein providing the first value pointer further comprises: for a respective turn in the sequence of turns, determining a second probability distribution that a user affirmed the first slot value mentioned in the first machine act of the first turn; and when the user affirmed the first slot value, causing the first value pointer to point to a word cited in the first machine act.
7. The computer-implemented method of claim 5, wherein providing the first value pointer comprises processing each word in the first user utterance, and wherein the hierarchical pointer network is configured as a recurrent neural network.
8. The computer-implemented method of claim 7, wherein the recurrent neural network comprises a bi-directional neural network.
9. The computer-implemented method of claim 2, wherein the probability distribution is a first probability distribution, and wherein determining the first dialogue state for the first slot comprises determining, for each slot, a second probability distribution over all possible slot values for every slot.
10. The computer-implemented method of claim 9, further comprising: determining the first machine act to be performed by the spoken dialogue system based on the second probability distribution over all possible slot values for every slot; and causing the spoken dialogue system to perform the first machine act.
11. The computer-implemented method of claim 10, wherein the first machine act comprises: asking a confirming question; asking for more information; or sending a message.
12. A system, comprising: at least one processing unit; and at least one memory storing computer executable instructions that, when executed by the at least one processing unit, cause the system to: receive a sequence of turns, each turn comprising: a numerical identifier; a user utterance comprising one or more words received at a spoken dialogue system; and a machine act comprising one or more words produced by the spoken dialogue system; provide, by the spoken dialogue system using a hierarchical pointer network that relates a slot value in one turn to a slot value in another turn, a first value pointer, wherein the first value pointer indicates a first slot value for a first slot based on a first user utterance in a first turn of the received sequence of turns; provide, by the spoken dialogue system, a first turn pointer for the first turn of the received sequence of turns, wherein the first turn is a current turn and includes a first numerical identifier, wherein the first turn pointer includes a second numerical identifier of a second turn in the sequence of turns, wherein the second turn is a prior turn of the sequence of turns, wherein the second turn is distinct from the first turn and includes at least one of a second user utterance or a second machine act having a second slot value for a second slot, the second turn associated with a designator identifying an utterance type or a machine act type, respectively, and wherein the second slot value matches the first slot value; determine a first dialogue state for the first turn based at least on a combination of the first value pointer and the first turn pointer, wherein the first dialogue state is determined based on a predicted context output of the hierarchical pointer network; and determine a first machine act of the first turn to be performed by the spoken dialogue system based on the determined first dialogue state.
13. The system of claim 12, further comprising instructions for accessing a knowledge database.
14. The system of claim 13, wherein the first value pointer comprises a probability distribution over the one or more words in the first user utterance or over the one or more words in a knowledge database.
15. The system of claim 14, wherein the instructions for providing the first value pointer comprise instructions for: producing the probability distribution by comparing the one or more words in the first user utterance to corresponding one or more words in the knowledge database; or producing the probability distribution over the one or more words in the first user utterance to indicate whether each word in the first user utterance is the first slot value for the first slot.
16. The system of claim 15, wherein the probability distribution is a first probability distribution and the instructions for providing the first value pointer further comprises instructions for: determining a se
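Two steps recited in the claims lend themselves to a short illustration: redirecting the value pointer to the machine act when the user affirms a value (claims 4 and 6), and choosing among the three machine acts listed in claim 11 from a distribution over slot values (claims 9 and 10). The sketch below is an assumption-laden illustration only. The function names, the action labels, and all thresholds (`affirm_threshold`, `confident`, `plausible`) are hypothetical, and the claims do not say how probabilities map to acts.

```python
from typing import Dict, Optional, Sequence, Tuple


def pick_value(
    utterance_words: Sequence[str],
    word_probs: Sequence[float],
    machine_act_value: Optional[str],
    p_affirm: float,
    affirm_threshold: float = 0.5,  # hypothetical cut-off
) -> str:
    """Return the slot value the value pointer ends up pointing at.

    If the user is judged to have affirmed the value the system just
    mentioned, point at that value; otherwise point at the most probable
    word of the user utterance.
    """
    if machine_act_value is not None and p_affirm >= affirm_threshold:
        return machine_act_value
    best = max(range(len(word_probs)), key=word_probs.__getitem__)
    return utterance_words[best]


def choose_machine_act(
    slot_value_probs: Dict[str, float],
    confident: float = 0.8,  # hypothetical threshold
    plausible: float = 0.4,  # hypothetical threshold
) -> Tuple[str, Optional[str]]:
    """Map a distribution over slot values to one of three act types."""
    if not slot_value_probs:
        return ("request_more_info", None)
    value, prob = max(slot_value_probs.items(), key=lambda kv: kv[1])
    if prob >= confident:
        return ("send_message", value)  # act on the value directly
    if prob >= plausible:
        return ("confirm", value)  # ask a confirming question
    return ("request_more_info", None)  # ask for more information


# System asked "Did you say thai?" and the user replied "yes please".
affirmed = pick_value(["yes", "please"], [0.5, 0.5], "thai", p_affirm=0.9)
act = choose_machine_act({"thai": 0.9, "italian": 0.1})
```

The thresholded policy is only one simple way to turn a belief over slot values into an act; a deployed system could equally use a learned policy.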
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
Probabilistic grammars, e.g. word n-grams · CPC title
Execution procedure of a spoken command · CPC title