Speech interface device
US-2019295552-A1 · Sep 26, 2019 · US
US11948564B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11948564-B2 |
| Application number | US-201916973026-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 13, 2019 |
| Priority date | Jun 15, 2018 |
| Publication date | Apr 2, 2024 |
| Grant date | Apr 2, 2024 |
Official abstract text for this publication.
Provided is an information processing device including a response control unit that controls a response to a user's utterance based on a first utterance interpretation result and a second utterance interpretation result. The first utterance interpretation result is a result of natural language understanding processing for an utterance text generated by automatic speech recognition processing based on the user's utterance and the second utterance interpretation result is an interpretation result acquired based on learning data in which the first utterance interpretation result and the utterance text used to acquire the first utterance interpretation result are associated with each other. The response control unit further controls the response to the user's utterance based on the second utterance interpretation result in a case where the second utterance interpretation result is acquired based on the user's utterance before acquisition of the first utterance interpretation result.
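The control flow in the abstract can be illustrated with a minimal sketch. It assumes the learning data is a simple mapping from utterance text to the interpretation previously returned by natural language understanding, and that a learned result, when present, is available before the slower ASR and NLU path completes. The names `ResponseController`, `Interpretation`, and `run_asr_and_nlu` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Interpretation:
    intent: str
    source: str  # "learned" (second result) or "nlu" (first result)


class ResponseController:
    """Hypothetical sketch: prefer a learned interpretation when one exists."""

    def __init__(self, run_asr_and_nlu: Callable[[str], str]) -> None:
        # Stand-in for the slower ASR + NLU path (assumption: text in, intent out).
        self._run_asr_and_nlu = run_asr_and_nlu
        # Learning data: utterance text -> first utterance interpretation result.
        self._learning_data: Dict[str, str] = {}

    def respond(self, utterance_text: str) -> Interpretation:
        learned = self._learning_data.get(utterance_text)
        if learned is not None:
            # Second interpretation result: acquired from learning data,
            # without waiting for the NLU path.
            return Interpretation(learned, "learned")
        # No learned result available: fall back to the first interpretation
        # result and store the association as new learning data.
        intent = self._run_asr_and_nlu(utterance_text)
        self._learning_data[utterance_text] = intent
        return Interpretation(intent, "nlu")
```

In this sketch the first call for a given utterance text goes through the NLU stand-in and is stored. A repeat of the same text is answered from the learning data alone.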
Opening claim text (preview).
The invention claimed is:
1. An information processing device, comprising: a processor configured to: control a response to a user's utterance based on a first utterance interpretation result and a second utterance interpretation result, wherein the first utterance interpretation result is acquired during an output of a connecting system utterance, the first utterance interpretation result and the second utterance interpretation result are different from each other, the first utterance interpretation result is a result of natural language understanding processing for an utterance text generated by automatic speech recognition processing based on the user's utterance, and the second utterance interpretation result is an interpretation result acquired based on learning data in which the first utterance interpretation result and the utterance text used to acquire the first utterance interpretation result are associated with each other; and control to output the connecting system utterance based on the acquisition of the second utterance interpretation result.
2. The information processing device according to claim 1, wherein the processor is further configured to control the response to the user's utterance based on the first utterance interpretation result in a case where the second utterance interpretation result is unavailable.
3. The information processing device according to claim 1, wherein the processor is further configured to determine a length of the connecting system utterance based on a recovery cost for recovering a result of the response based on the second utterance interpretation result.
4. The information processing device according to claim 3, wherein in a case where the recovery cost is equal to or more than a specific threshold value, the processor is further configured to: control to output the connecting system utterance to inquire for execution of the response based on the second utterance interpretation result; and perform a control to execute the response based on the second utterance interpretation result based on user approval.
5. The information processing device according to claim 1, wherein in a case where a plurality of second utterance interpretation results are acquired based on the user's utterance, the plurality of second utterance interpretation results includes the second utterance interpretation result, the processor is further configured to control the response to the user's utterance based on a similarity between a first context and a second context, the first context is acquired based on the user's utterance, and the second context is stored together with the learning data.
6. The information processing device according to claim 1, wherein the processor is further configured to: learn the utterance text generated by the automatic speech recognition processing based on the user's utterance and the first utterance interpretation result; and store, as the learning data, the utterance text and the first utterance interpretation result.
7. The information processing device according to claim 3, wherein in a case where a number of times of the acquisition of the same first utterance interpretation result based on the same utterance text is equal to or more than a specific threshold value, the processor is further configured to: register a phrase corresponding to the utterance text in a phrase dictionary; and acquire a third utterance interpretation result corresponding to the phrase from the learning data based on a recognition of the phrase by phrase speech recognition using the phrase dictionary.
8. The information processing device according to claim 7, wherein the processor is further configured to: register a first utterance end detection time for the phrase in the phrase dictionary together with the phrase; and reduce the first utterance end detection time as the number of times of the acquisition of the same first utterance interpretation result is increased.
9. The information processing device according to claim 8, wherein in a case where a length of the first utterance end detection time is zero, the processor is further configured to: register a short phrase that excludes a part of the phrase in the phrase dictionary; and store new learning data for the short phrase that takes over the first utterance interpretation result from the learning data for the phrase.
10. The information processing device according to claim 9, wherein in a case where the length of the first utterance end detection time is zero, the processor is further configured to register, in the phrase dictionary, the short phrase that excludes a part of an ending of the phrase.
11. The information processing device according to claim 9, wherein the processor is further configured to register a second utterance end detection time for the short phrase in the phrase dictionary together with the short phrase.
12. The information processing device according to claim 11, wherein the processor is further configured to reduce the second utterance end detection time for the short phrase to a specific length as a number of times of recognition of the short phrase is increased.
13. The information processing device according to claim 12, wherein in a case where the acquired first utterance interpretation result and the second utterance interpretation result acquired based on a recognition of the short phrase are different, the processor is further configured to extend the second utterance end detection time for the short phrase.
14. The information processing device according to claim 7, wherein the processor is further configured to perform phrase speech recognition processing based on the user's utterance by using the phrase dictionary.
15. The information processing device according to claim 1, wherein the processor is further configured to: output speech data for the user's utterance to an external device; receive the utterance text generated by the automatic speech recognition processing based on the speech data, and receive the first utterance interpretation result which is a result of the natural language understanding processing for the utterance text.
16. The information processing device according to claim 15, wherein the processor is further configured to store a synthesized speech synthesized by the external device.
17. The information processing device according to claim 16, further comprising a speaker configured to output the stored synthesized speech.
18. An information processing method, comprising: controlling, by a processor, a response to a user's utterance based on a first utterance interpretation result and a second utterance interpretation result, wherein the first utterance interpretation result is acquired during an output of a connecting system utterance, the first utterance interpretation result and the second utterance interpretation result are different from each other, the first utterance interpretation result is a result of natural language understanding processing for an utterance text generated by automatic speech recognition processing based on the user's utterance, and the second utterance interpretation result is an interpretation result acquired based on learning data in which the first utterance interpretation result and the utterance text used to acquire the first utterance interpretation result are associated with each other; and controlling, by the processor, a speaker to output the connecting system utterance based on the acquisition
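The phrase registration described in claims 7 and 8 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the threshold, the initial end detection time, and the reduction step are invented values, and the names `PhraseDictionary`, `PhraseEntry`, and `observe` are hypothetical. The short-phrase handling of claims 9 to 13 is not covered.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# All three constants are assumptions chosen for illustration only.
REGISTER_THRESHOLD = 3       # acquisitions needed before a phrase is registered
INITIAL_END_DETECT_MS = 600  # end detection time assigned at registration
STEP_MS = 200                # reduction applied per further acquisition


@dataclass
class PhraseEntry:
    interpretation: str
    end_detect_ms: int


class PhraseDictionary:
    """Hypothetical sketch of threshold-based phrase registration."""

    def __init__(self) -> None:
        # (utterance text, interpretation) -> number of acquisitions so far.
        self._counts: Dict[Tuple[str, str], int] = {}
        self.entries: Dict[str, PhraseEntry] = {}

    def observe(self, utterance_text: str, interpretation: str) -> None:
        """Record one acquisition of `interpretation` for `utterance_text`."""
        key = (utterance_text, interpretation)
        self._counts[key] = self._counts.get(key, 0) + 1
        if self._counts[key] < REGISTER_THRESHOLD:
            return
        entry = self.entries.get(utterance_text)
        if entry is None:
            # Threshold reached: register the phrase with its end detection time.
            self.entries[utterance_text] = PhraseEntry(
                interpretation, INITIAL_END_DETECT_MS
            )
        elif entry.interpretation == interpretation:
            # Same result acquired again: shorten the end detection time,
            # never going below zero.
            entry.end_detect_ms = max(0, entry.end_detect_ms - STEP_MS)
```

With these assumed values, a phrase is registered on its third matching acquisition with a 600 ms end detection time, and each later matching acquisition shortens that time by 200 ms until it reaches zero.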
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Details of speech synthesis systems, e.g. synthesiser structure or memory management · CPC title
Training · CPC title
Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title