Information processing device and information processing method

US11948564B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11948564-B2
Application number: US-201916973026-A
Country: US
Kind code: B2
Filing date: Mar 13, 2019
Priority date: Jun 15, 2018
Publication date: Apr 2, 2024
Grant date: Apr 2, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is an information processing device including a response control unit that controls a response to a user's utterance based on a first utterance interpretation result and a second utterance interpretation result. The first utterance interpretation result is a result of natural language understanding processing for an utterance text generated by automatic speech recognition processing based on the user's utterance and the second utterance interpretation result is an interpretation result acquired based on learning data in which the first utterance interpretation result and the utterance text used to acquire the first utterance interpretation result are associated with each other. The response control unit further controls the response to the user's utterance based on the second utterance interpretation result in a case where the second utterance interpretation result is acquired based on the user's utterance before acquisition of the first utterance interpretation result.
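
The control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `ResponseController`, `run_asr_nlu`, and `local_phrase` are hypothetical names, the learning data is modeled as a plain dictionary from utterance text to intent, and the extra "correct" event emitted when the two results disagree is an assumption of this sketch rather than something the abstract specifies.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical signature: takes raw audio, returns (utterance_text, intent).
AsrNlu = Callable[[bytes], Tuple[str, str]]


class ResponseController:
    """Responds from learned data when possible, else from full ASR + NLU."""

    def __init__(self, run_asr_nlu: AsrNlu) -> None:
        self._run_asr_nlu = run_asr_nlu
        # Learning data: utterance text associated with the first
        # (ASR + NLU) interpretation result obtained for that text.
        self._learning_data: Dict[str, str] = {}

    def respond(self, audio: bytes, local_phrase: Optional[str]) -> List[Tuple[str, str]]:
        """Return the (source, intent) events the device would act on.

        local_phrase is the text matched by an assumed local recognizer,
        or None when nothing was recognized locally.
        """
        events: List[Tuple[str, str]] = []

        # Second interpretation result: looked up from learning data,
        # available before the slower ASR + NLU path completes.
        second = self._learning_data.get(local_phrase) if local_phrase else None
        if second is not None:
            events.append(("second", second))

        # First interpretation result: NLU over the ASR utterance text.
        text, first = self._run_asr_nlu(audio)
        self._learning_data[text] = first  # store the pair as learning data

        if second is None:
            events.append(("first", first))    # no early result was available
        elif second != first:
            events.append(("correct", first))  # assumption: flag a mismatch
        return events
```

On the first utterance nothing has been learned yet, so the response comes from the first result; once the pair is stored, a later local match can drive the response early while the full path still runs and refreshes the learning data.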

First claim

Opening claim text (preview).

The invention claimed is:

1. An information processing device, comprising: a processor configured to: control a response to a user's utterance based on a first utterance interpretation result and a second utterance interpretation result, wherein the first utterance interpretation result is acquired during an output of a connecting system utterance, the first utterance interpretation result and the second utterance interpretation result are different from each other, the first utterance interpretation result is a result of natural language understanding processing for an utterance text generated by automatic speech recognition processing based on the user's utterance, and the second utterance interpretation result is an interpretation result acquired based on learning data in which the first utterance interpretation result and the utterance text used to acquire the first utterance interpretation result are associated with each other; and control to output the connecting system utterance based on the acquisition of the second utterance interpretation result.

2. The information processing device according to claim 1, wherein the processor is further configured to control the response to the user's utterance based on the first utterance interpretation result in a case where the second utterance interpretation result is unavailable.

3. The information processing device according to claim 1, wherein the processor is further configured to determine a length of the connecting system utterance based on a recovery cost for recovering a result of the response based on the second utterance interpretation result.

4. The information processing device according to claim 3, wherein in a case where the recovery cost is equal to or more than a specific threshold value, the processor is further configured to: control to output the connecting system utterance to inquire for execution of the response based on the second utterance interpretation result; and perform a control to execute the response based on the second utterance interpretation result based on user approval.

5. The information processing device according to claim 1, wherein in a case where a plurality of second utterance interpretation results are acquired based on the user's utterance, the plurality of second utterance interpretation results includes the second utterance interpretation result, the processor is further configured to control the response to the user's utterance based on a similarity between a first context and a second context, the first context is acquired based on the user's utterance, and the second context is stored together with the learning data.

6. The information processing device according to claim 1, wherein the processor is further configured to: learn the utterance text generated by the automatic speech recognition processing based on the user's utterance and the first utterance interpretation result; and store, as the learning data, the utterance text and the first utterance interpretation result.

7. The information processing device according to claim 3, wherein in a case where a number of times of the acquisition of the same first utterance interpretation result based on the same utterance text is equal to or more than a specific threshold value, the processor is further configured to: register a phrase corresponding to the utterance text in a phrase dictionary; and acquire a third utterance interpretation result corresponding to the phrase from the learning data based on a recognition of the phrase by phrase speech recognition using the phrase dictionary.

8. The information processing device according to claim 7, wherein the processor is further configured to: register a first utterance end detection time for the phrase in the phrase dictionary together with the phrase; and reduce the first utterance end detection time as the number of times of the acquisition of the same first utterance interpretation result is increased.

9. The information processing device according to claim 8, wherein in a case where a length of the first utterance end detection time is zero, the processor is further configured to: register a short phrase that excludes a part of the phrase in the phrase dictionary; and store new learning data for the short phrase that takes over the first utterance interpretation result from the learning data for the phrase.

10. The information processing device according to claim 9, wherein in a case where the length of the first utterance end detection time is zero, the processor is further configured to register, in the phrase dictionary, the short phrase that excludes a part of an ending of the phrase.

11. The information processing device according to claim 9, wherein the processor is further configured to register a second utterance end detection time for the short phrase in the phrase dictionary together with the short phrase.

12. The information processing device according to claim 11, wherein the processor is further configured to reduce the second utterance end detection time for the short phrase to a specific length as a number of times of recognition of the short phrase is increased.

13. The information processing device according to claim 12, wherein in a case where the acquired first utterance interpretation result and the second utterance interpretation result acquired based on a recognition of the short phrase are different, the processor is further configured to extend the second utterance end detection time for the short phrase.

14. The information processing device according to claim 7, wherein the processor is further configured to perform phrase speech recognition processing based on the user's utterance by using the phrase dictionary.

15. The information processing device according to claim 1, wherein the processor is further configured to: output speech data for the user's utterance to an external device; receive the utterance text generated by the automatic speech recognition processing based on the speech data, and receive the first utterance interpretation result which is a result of the natural language understanding processing for the utterance text.

16. The information processing device according to claim 15, wherein the processor is further configured to store a synthesized speech synthesized by the external device.

17. The information processing device according to claim 16, further comprising a speaker configured to output the stored synthesized speech.

18. An information processing method, comprising: controlling, by a processor, a response to a user's utterance based on a first utterance interpretation result and a second utterance interpretation result, wherein the first utterance interpretation result is acquired during an output of a connecting system utterance, the first utterance interpretation result and the second utterance interpretation result are different from each other, the first utterance interpretation result is a result of natural language understanding processing for an utterance text generated by automatic speech recognition processing based on the user's utterance, and the second utterance interpretation result is an interpretation result acquired based on learning data in which the first utterance interpretation result and the utterance text used to acquire the first utterance interpretation result are associated with each other; and controlling, by the processor, a speaker to output the connecting system utterance based on the acquisition
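
The phrase-dictionary behavior in claims 7 to 10 can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the class name `PhraseLearner`, the threshold of 3, the 600 ms starting end detection time, the 300 ms reduction step, and the choice to shorten a phrase by dropping its last whitespace-separated word. The claims only state that a threshold exists, that the end detection time shrinks with repeated acquisitions, and that a short phrase excluding part of the ending is registered once that time reaches zero.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

REGISTER_THRESHOLD = 3     # assumed: acquisitions needed before registration
INITIAL_END_TIME_MS = 600  # assumed: starting utterance end detection time
STEP_MS = 300              # assumed: reduction per further acquisition


@dataclass
class PhraseLearner:
    # (utterance text, intent) -> number of times that pair was acquired
    counts: Dict[Tuple[str, str], int] = field(default_factory=dict)
    # learning data: utterance text -> interpretation result (intent)
    learning_data: Dict[str, str] = field(default_factory=dict)
    # phrase dictionary: phrase -> utterance end detection time in ms
    phrase_dictionary: Dict[str, int] = field(default_factory=dict)

    def observe(self, text: str, intent: str) -> None:
        """Record one (utterance text, first interpretation result) pair."""
        key = (text, intent)
        self.counts[key] = self.counts.get(key, 0) + 1
        self.learning_data[text] = intent
        if self.counts[key] < REGISTER_THRESHOLD:
            return
        if text not in self.phrase_dictionary:
            # Claims 7 and 8: register the phrase with an end detection time.
            self.phrase_dictionary[text] = INITIAL_END_TIME_MS
            return
        # Claim 8: reduce the end detection time on repeated acquisitions.
        remaining = max(0, self.phrase_dictionary[text] - STEP_MS)
        self.phrase_dictionary[text] = remaining
        if remaining == 0:
            self._register_short_phrase(text, intent)

    def _register_short_phrase(self, phrase: str, intent: str) -> None:
        """Claims 9 and 10: register a phrase with part of its ending removed."""
        words = phrase.split()
        if len(words) < 2:
            return  # nothing left to shorten
        short = " ".join(words[:-1])
        if short not in self.phrase_dictionary:
            self.phrase_dictionary[short] = INITIAL_END_TIME_MS
            # The short phrase takes over the interpretation result.
            self.learning_data[short] = intent
```

With these assumed constants, five identical observations of "turn the light on" register the phrase on the third, reduce its end detection time to zero by the fifth, and then register "turn the light" with the same interpretation result.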

Assignees

Sony Corp

Inventors

Classifications

  • G10L15/22 · Primary

    Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • Details of speech synthesis systems, e.g. synthesiser structure or memory management · CPC title

  • Training · CPC title

  • Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title

  • Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11948564B2 cover?
Provided is an information processing device including a response control unit that controls a response to a user's utterance based on a first utterance interpretation result and a second utterance interpretation result. The first utterance interpretation result is a result of natural language understanding processing for an utterance text generated by automatic speech recognition processing ba…
Who is the assignee on this patent?
Sony Corp
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 2, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).