What technology area does this patent fall under?

Primary CPC classification H04L12/2816. Mapped technology areas include Electricity.

When was this patent published?

Publication date Oct 6, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for controlling home assistant devices

US10796702B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10796702-B2
Application number	US-201816230835-A
Country	US
Kind code	B2
Filing date	Dec 21, 2018
Priority date	Dec 31, 2017
Publication date	Oct 6, 2020
Grant date	Oct 6, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

System and method for controlling a home assistant device include: receiving an audio input; performing speaker recognition on the audio input; in accordance with a determination that the audio input includes a voice input from a first user that is authorized to control the home assistant device: performing speech-to-text conversion on the audio input to obtain a textual string; and searching for a predefined trigger word for activating the home assistant device in the textual string; and in accordance with a determination that the audio input includes a voice input from the home assistant device: forgoing performance of speech-to-text conversion on the audio input; and forgoing search for the predefined trigger word.
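
The control flow in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `SpeakerResult`, `Outcome`, `handle_audio`, `fake_recognizer`, and `fake_stt` are hypothetical, the trigger-word search is a naive substring match, and the handling of unrecognized speakers is an assumption because the abstract does not address it.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SpeakerResult:
    """Outcome of speaker recognition (hypothetical structure)."""
    kind: str                      # "authorized_user", "assistant", or "unknown"
    user_id: Optional[str] = None


@dataclass
class Outcome:
    transcribed: bool              # whether speech-to-text was run
    activated: bool                # whether the trigger word was found
    text: Optional[str] = None


def handle_audio(
    audio: bytes,
    recognize_speaker: Callable[[bytes], SpeakerResult],
    speech_to_text: Callable[[bytes], str],
    trigger_word: str,
) -> Outcome:
    # Speaker recognition always runs first and gates everything else.
    speaker = recognize_speaker(audio)

    if speaker.kind == "assistant":
        # Voice of a home assistant device: forgo speech-to-text
        # and forgo the trigger-word search.
        return Outcome(transcribed=False, activated=False)

    if speaker.kind == "authorized_user":
        text = speech_to_text(audio)
        # Naive substring search; a real system would be more robust.
        found = trigger_word.lower() in text.lower()
        return Outcome(transcribed=True, activated=found, text=text)

    # Assumption: unrecognized speakers are ignored. The abstract is silent here.
    return Outcome(transcribed=False, activated=False)


# Stand-in recognizers for demonstration only (not real audio processing).
def fake_recognizer(audio: bytes) -> SpeakerResult:
    if audio.startswith(b"tts:"):
        return SpeakerResult("assistant")
    if audio.startswith(b"alice:"):
        return SpeakerResult("authorized_user", "alice")
    return SpeakerResult("unknown")


def fake_stt(audio: bytes) -> str:
    return audio.split(b":", 1)[1].decode()
```

Calling `handle_audio(b"tts:hey assistant is ready", fake_recognizer, fake_stt, "hey assistant")` returns an outcome with `transcribed=False`, which mirrors the abstract's point that the device's own speech never reaches the transcription step.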

First claim

Opening claim text (preview).

What is claimed is:

1. A method of controlling a home assistant device, comprising: at a computing system having one or more processors and memory: receiving an audio input; performing speaker recognition on the audio input; in accordance with a determination from performing speaker recognition that the audio input includes a voice input from a first user that is authorized to control the home assistant device: performing, using speech recognition, speech-to-text conversion on the audio input to obtain a textual string; searching for a predefined trigger word for activating the home assistant device in the textual string; selecting, from a plurality of task domains of the home assistant device, one or more first task domains that the first user is authorized to control, to perform intent deduction on the textual string; and forgoing using one or more second task domains among the plurality of task domains that the first user is not authorized to control to process the textual string; and in accordance with a determination from performing speaker recognition that the audio input includes a voice input from the home assistant device: forgoing performance of speech-to-text conversion on the audio input; and forgoing search for the predefined trigger word, so that the home assistant device avoids being triggered by the home assistant device's own speech or a speech output of a neighboring home assistant device, wherein the speaker recognition uses less resources than the speech recognition.

2. The method of claim 1, wherein searching for the predefined trigger word in the textual string includes: selecting a respective trigger word that corresponds to the first user from a plurality of preset trigger words that correspond different users among a plurality of users that include the first user; and using the respective trigger word that corresponds to the first user as the predefined trigger word that is to be searched.

3. The method of claim 1, including: obtaining a default speech-to-text model corresponding to the home assistant device; and in accordance with a determination that a plurality of recorded speech samples provided by the first user are available, adjusting the default speech-to-text model in accordance with the plurality of recorded speech samples provided by the first user to generate a first user-specific speech-to-text model for the first user, wherein performing speech-to-text conversion on the audio input to obtain the textual string includes performing speech-to-text conversion on the audio input using the first user-specific speech-to-text model for the first user.

4. The method of claim 3, including: in accordance with a determination that a plurality of recorded speech samples provided by the first user are not available, performing the speech-to-text conversion on the audio input using the default speech-to-text model.

5. The method of claim 4, including: in accordance with a determination that a plurality of recorded speech samples provided by the first user are available, setting a first confidence threshold for recognizing the trigger word in the audio input when the first user-specific speech-to-text model is used to perform the speech-to-text conversion on the audio input; and in accordance with a determination that a plurality of recorded speech samples provided by the first user are not available, setting a second confidence threshold for recognizing the trigger word in the audio input when the default speech-to-text model is used to perform the speech-to-text conversion on the audio input.

6. The method of claim 5, wherein the first confidence threshold that is used for the first user-specific speech-to-text model is higher than the second confidence threshold that is used for the default speech-to-text model.

7. A system for controlling a home assistant device, comprising: one or more processors; and memory storing instructions, the instructions, when executed by the processors, cause the processors to perform operations comprising: receiving an audio input; performing speaker recognition on the audio input; in accordance with a determination from performing speaker recognition that the audio input includes a voice input from a first user that is authorized to control the home assistant device: performing, using speech recognition, speech-to-text conversion on the audio input to obtain a textual string; searching for a predefined trigger word for activating the home assistant device in the textual string; selecting, from a plurality of task domains of the home assistant device, one or more first task domains that the first user is authorized to control, to perform intent deduction on the textual string; and forgoing using one or more second task domains among the plurality of task domains that the first user is not authorized to control to process the textual string; and in accordance with a determination from performing speaker recognition that the audio input includes a voice input from the home assistant device: forgoing performance of speech-to-text conversion on the audio input; and forgoing search for the predefined trigger word, so that the home assistant device avoids being triggered by the home assistant device's own speech or a speech output of a neighboring home assistant device, wherein the speaker recognition uses less resources than the speech recognition.

8. The system of claim 7, wherein searching for the predefined trigger word in the textual string includes: selecting a respective trigger word that corresponds to the first user from a plurality of preset trigger words that correspond different users among a plurality of users that include the first user; and using the respective trigger word that corresponds to the first user as the predefined trigger word that is to be searched.

9. The system of claim 7, wherein the operations include: obtaining a default speech-to-text model corresponding to the home assistant device; and in accordance with a determination that a plurality of recorded speech samples provided by the first user are available, adjusting the default speech-to-text model in accordance with the plurality of recorded speech samples provided by the first user to generate a first user-specific speech-to-text model for the first user, wherein performing speech-to-text conversion on the audio input to obtain the textual string includes performing speech-to-text conversion on the audio input using the first user-specific speech-to-text model for the first user.

10. The system of claim 9, wherein the operations include: in accordance with a determination that a plurality of recorded speech samples provided by the first user are not available, performing the speech-to-text conversion on the audio input using the default speech-to-text model.

11. The system of claim 10, wherein the operations include: in accordance with a determination that a plurality of recorded speech samples provided by the first user are available, setting a first confidence threshold for recognizing the trigger word in the audio input when the first user-specific speech-to-text model is used to perform the speech-to-text conversion on the audio input; and in accordance with a determination that a plurality of recorded speech samples provided by the first user are not available, setting a second confidence threshold for recognizing the trigger word in the audio input when the default speech-to-text model is used to perform the speech-to-text conversion on the audio input.

12. The system of claim 11, wherein the first confidence threshold that is used for the first user-specific speech-
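
The per-user behavior in claims 1 to 6 (task-domain filtering, per-user trigger words, model selection, and confidence thresholds) might look roughly like the sketch below. Everything named here is hypothetical: `UserProfile`, the model identifiers, and the threshold values 0.9 and 0.7 are illustrative assumptions. The claims only require that the threshold for the user-specific model be higher than the one for the default model, and that a "plurality" of samples (taken here to mean two or more) be available before the model is adapted.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

DEFAULT_MODEL = "default-stt"          # hypothetical model identifier
# Hypothetical values; claim 6 only requires first > second.
USER_MODEL_THRESHOLD = 0.9             # "first confidence threshold"
DEFAULT_MODEL_THRESHOLD = 0.7          # "second confidence threshold"


@dataclass
class UserProfile:
    user_id: str
    trigger_word: str                  # claim 2: each user has a preset trigger word
    allowed_domains: Set[str]          # claim 1: task domains the user may control
    speech_samples: List[bytes] = field(default_factory=list)


def select_model_and_threshold(user: UserProfile) -> Tuple[str, float]:
    """Pick the speech-to-text model and trigger threshold (claims 3 to 6)."""
    # Assumption: "a plurality of recorded speech samples" means at least two.
    if len(user.speech_samples) >= 2:
        # Stand-in for adapting the default model with the user's samples.
        return f"{DEFAULT_MODEL}+adapted:{user.user_id}", USER_MODEL_THRESHOLD
    return DEFAULT_MODEL, DEFAULT_MODEL_THRESHOLD


def domains_for_intent(user: UserProfile, all_domains: List[str]) -> List[str]:
    """Keep only domains the user is authorized to control (claim 1)."""
    return [d for d in all_domains if d in user.allowed_domains]


def trigger_detected(user: UserProfile, text: str, confidence: float) -> bool:
    """Search for the user's own trigger word, subject to the threshold."""
    _, threshold = select_model_and_threshold(user)
    return user.trigger_word.lower() in text.lower() and confidence >= threshold
```

With these assumed values, a recognition confidence of 0.85 would pass for a user on the default model but fail for a user with an adapted model, which reflects the stricter threshold described in claim 6.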

Assignees

Midea Group Co Ltd

Inventors

Classifications

H04L12/2816 · Primary
Controlling appliance services of a home automation network by calling their functionalities (arrangements in telecontrol or telemetry systems for selectively calling a substation from a main station; in which substation desired apparatus is selected for applying a control signal thereto or for obtaining measured values therefrom H04Q9/00) · CPC title
G10L15/08
Speech classification or search · CPC title
G10L17/00
Speaker identification or verification techniques · CPC title
G10L2015/088
Word spotting · CPC title
G10L17/04
Training, enrolment or model building · CPC title

Patent family

Related publications grouped by family.

Patent family 67059754

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10796702B2 cover?: System and method for controlling a home assistant device include: receiving an audio input; performing speaker recognition on the audio input; in accordance with a determination that the audio input includes a voice input from a first user that is authorized to control the home assistant device: performing speech-to-text conversion on the audio input to obtain a textual string; and searching f…
Who is the assignee on this patent?: Midea Group Co Ltd

Related patents

Voice command processing in low power devices

Using an audio interface device to authenticate another device

Network conference management and arbitration via voice-capturing devices

System and method for updating an adaptive speech recognition model
