Method and apparatus to provide comprehensive smart assistant services

US12499892B2 · US · B2

Patent metadata
Publication number: US-12499892-B2
Application number: US-202418744076-A
Country: US
Kind code: B2
Filing date: Jun 14, 2024
Priority date: Feb 8, 2018
Publication date: Dec 16, 2025
Grant date: Dec 16, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus supports smart assistant services with a plurality of smart service providers. The apparatus includes an audio device that receives a speech signal having a user utterance, captures the user utterance when the user utterance includes a user wake word, and sends the captured utterance to a backend computing device. The backend computing device replaces the user wake word with specific wake words associated with different smart service providers. The processed utterances are then sent to selected smart service providers. The backend computing device subsequently constructs feedback to the user utterance based on voice responses from the different smart service providers. The backend computing device then passes a digital representation of the feedback to the audio device, and the audio device converts the digital representation to an audio reply to the user utterance.
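
The flow in the abstract (swap the user wake word for each provider's own wake word, send the processed utterances out, then build one reply) can be illustrated with a minimal sketch. This is not the patented implementation. It works on text rather than audio, assumes the user wake word leads the utterance, and `Provider`, `replace_wake_word`, `fan_out`, and `construct_feedback` are hypothetical names introduced only for illustration. The "first non-empty response" policy is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Provider:
    """Hypothetical stand-in for one smart service provider."""
    name: str
    wake_word: str
    ask: Callable[[str], str]  # assumed interface: processed utterance in, response text out


def replace_wake_word(utterance: str, user_wake_word: str, provider_wake_word: str) -> str:
    """Swap a leading user wake word for a provider-specific wake word."""
    stripped = utterance.strip()
    if not stripped.lower().startswith(user_wake_word.lower()):
        raise ValueError("utterance does not start with the user wake word")
    # Drop the wake word plus any separating comma or spaces.
    remainder = stripped[len(user_wake_word):].lstrip(" ,")
    return f"{provider_wake_word}, {remainder}"


def fan_out(utterance: str, user_wake_word: str, providers: List[Provider]) -> Dict[str, str]:
    """Send one processed utterance to each provider and collect the responses."""
    responses: Dict[str, str] = {}
    for provider in providers:
        processed = replace_wake_word(utterance, user_wake_word, provider.wake_word)
        responses[provider.name] = provider.ask(processed)
    return responses


def construct_feedback(responses: Dict[str, str]) -> str:
    """Placeholder policy (an assumption): use the first non-empty response."""
    for name, text in responses.items():
        if text:
            return f"{text} (via {name})"
    return "Sorry, no provider returned an answer."
```

In a real system the providers would be remote voice services and the reply would be converted back to audio on the device; here each provider is a plain callable so the sketch can run on its own.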

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, configure the apparatus to: detect a user trigger; capture, based on the detected user trigger, a user utterance from a user; include a first specific wake word in the captured user utterance to form a first processed utterance; send the first processed utterance to a first service provider; obtain, from the first service provider, a first response to the first processed utterance; and construct, based on data associated with a history of processed utterances, first feedback to the user utterance based on the first response; and generate, based on the first feedback, a reply to the user utterance.

2. The apparatus of claim 1, wherein the user trigger comprises one or more of a wake word, a sound, an audio condition, a hand gesture, a body gesture, a facial expression or biology signature.

3. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further configure the apparatus to detect the user trigger based on a model trained on user input.

4. The apparatus of claim 1, wherein the data associated with the history of processed utterances comprises one or more of: scoring metrics indicative of accuracies of feedbacks to the processed utterances; a history of user utterances following processed utterances; or data indicating response speeds of a plurality of service providers to one or more previous captured user utterances.

5. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further configure the apparatus to: include a second specific wake word, different from the first specific wake word, in the user utterance to form a second processed utterance; send the second processed utterance to a second service provider; obtain a second response from the second service provider, and construct the first feedback by combining, based on the data associated with the history of processed utterances, the first response and the second response.

6. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further configure the apparatus to: include a second specific wake word, different from the first specific wake word, in the user utterance to form a second processed utterance; send the second processed utterance to a second service provider; obtain a second response from the second service provider, construct second feedback based on the second response; and update, based on the first feedback and on the second feedback, the data associated with the history of processed utterances.

7. The apparatus of claim 1, wherein the instructions, when executed by the one or more processors, further configure the apparatus to: obtain a scoring function by training on the data associated with the history of processed utterances, wherein the scoring function measures a probabilistic prediction accuracy; and construct the first feedback based on a scoring metric, obtained from the scoring function applied to the first response, satisfying a threshold.

8. A method comprising: detecting, by a computing device via a sensor, a user trigger; capturing, based on the detected user trigger, a user utterance from a user; including a first specific wake word in the captured user utterance to form a first processed utterance; sending the first processed utterance to a first service provider; obtaining, from the first service provider, a first response to the first processed utterance; and constructing, based on data associated with a history of processed utterances, first feedback to the user utterance based on the first response; and generating, based on the first feedback, a reply to the user utterance.

9. The method of claim 8, wherein the user trigger comprises one or more of a wake word, a sound, an audio condition, a hand gesture, a body gesture, a facial expression or biology signature.

10. The method of claim 8, further comprising detecting the user trigger based on a model trained on user input.

11. The method of claim 8, wherein the data associated with the history of processed utterances comprises one or more of: scoring metrics indicative of accuracies of feedbacks to the processed utterances; a history of user utterances following processed utterances; or data indicating response speeds of a plurality of service providers to one or more previous captured user utterances.

12. The method of claim 8, further comprising: including a second specific wake word, different from the first specific wake word, in the user utterance to form a second processed utterance; sending the second processed utterance to a second service provider; obtaining a second response from the second service provider, and constructing the first feedback by combining, based on the data associated with the history of processed utterances, the first response and the second response.

13. The method of claim 8, further comprising including a second specific wake word, different from the first specific wake word, in the user utterance to form a second processed utterance; sending the second processed utterance to a second service provider; obtaining a second response from the second service provider, constructing second feedback based on the second response; and updating, based on the first feedback and on the second feedback, the data associated with the history of processed utterances.

14. The method of claim 8, further comprising: obtaining a scoring function by training on the data associated with the history of processed utterances, wherein the scoring function measures a probabilistic prediction accuracy; and constructing the first feedback based on a scoring metric, obtained from the scoring function applied to the first response, satisfying a threshold.

15. A non-transitory computer readable medium storing instructions that, when executed, cause: detecting a user trigger; capturing, based on the detected user trigger, a user utterance from a user; including a first specific wake word in the captured user utterance to form a first processed utterance; sending the first processed utterance to a first service provider; obtaining, from the first service provider, a first response to the first processed utterance; and constructing, based on data associated with a history of processed utterances, first feedback to the user utterance based on the first response; and generating, based on the first feedback, a reply to the user utterance.

16. The non-transitory computer readable medium of claim 15, wherein the user trigger comprises one or more of a wake word, a sound, an audio condition, a hand gesture, a body gesture, a facial expression or biology signature.

17. The non-transitory computer readable medium of claim 15, further comprising detecting the user trigger based on a model trained on user input.

18. The non-transitory computer readable medium of claim 15, wherein the data associated with the history of processed utterances comprises one or more of: scoring metrics indicative of accuracies of feedbacks to the processed utterances; a history of user utterances following processed utterances; or data indicating response speeds of a plurality of service providers to one or more previous captured user utterances.

19. The non-transitory computer readable med
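
Claims 4 and 7 describe feedback that depends on a scoring function trained on the history of processed utterances and compared against a threshold. The claims do not specify the model, so the sketch below swaps in a deliberately simple stand-in: a Laplace-smoothed estimate of how often each provider's past responses were judged accurate. `History`, `train_scoring_function`, `select_feedback`, and the 0.6 threshold are all hypothetical choices for illustration, not the patent's method.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class History:
    """Assumed history format: per provider, 1 if past feedback was judged accurate, else 0."""
    outcomes: Dict[str, List[int]] = field(default_factory=dict)

    def record(self, provider: str, accurate: bool) -> None:
        self.outcomes.setdefault(provider, []).append(1 if accurate else 0)


def train_scoring_function(history: History) -> Callable[[str], float]:
    """Fit a per-provider accuracy estimate with Laplace (add-one) smoothing."""
    table = {
        provider: (sum(results) + 1) / (len(results) + 2)
        for provider, results in history.outcomes.items()
    }

    def score(provider: str) -> float:
        # Providers with no history get the uninformed estimate 0.5.
        return table.get(provider, 0.5)

    return score


def select_feedback(
    responses: Dict[str, str],
    score: Callable[[str], float],
    threshold: float = 0.6,
) -> Optional[str]:
    """Return the response from the best-scoring provider that meets the threshold."""
    eligible = [
        (score(provider), text)
        for provider, text in responses.items()
        if text and score(provider) >= threshold
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda pair: pair[0])[1]
```

Calling `history.record(...)` after each reply and retraining would correspond loosely to the history update in claims 6 and 13; response speed, which claim 4 also lists, could be added as a further feature under the same structure.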

Assignees

Computime Ltd

Inventors

Classifications

  • Feedback of the input speech

  • Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

  • Execution procedure of a spoken command

  • Word spotting

  • Speech classification or search

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12499892B2 cover?
An apparatus supports smart assistant services with a plurality of smart service providers. The apparatus includes an audio device that receives a speech signal having a user utterance, captures the user utterance when the user utterance includes a user wake word, and sends the captured utterance to a backend computing device. The backend computing device replaces the user wake word with specif…
Who is the assignee on this patent?
Computime Ltd
What technology area does this patent fall under?
Primary CPC classification G10L15/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 16, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).