Method and system for controlling home assistant devices
US-10796702-B2 · Oct 6, 2020 · US
US11348588B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11348588-B2 |
| Application number | US-201916545511-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 20, 2019 |
| Priority date | Aug 20, 2018 |
| Publication date | May 31, 2022 |
| Grant date | May 31, 2022 |
Official abstract text for this publication.
An electronic device for performing speech recognition and a method therefor are provided. The method includes detecting a first text, which is preset for performing speaker recognition, by performing speech recognition on a first speech signal, performing speaker recognition on a second speech signal acquired after the first speech signal, based on the first text being detected, and executing a voice command obtained from the second speech signal, based on a result of performing the speaker recognition on the second speech signal indicating that a speaker of the second speech signal corresponds to a first speaker who registered the first text.
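The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the names `RegisteredSpeaker`, `detect_trigger`, and `handle_utterances` are hypothetical, speech recognition is assumed to have already produced a transcript of the first signal, and speaker recognition is stood in for by cosine similarity between fixed-length voice embeddings compared against an arbitrary threshold of 0.8.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RegisteredSpeaker:
    """A speaker who registered a preset text (hypothetical structure)."""
    name: str
    trigger_text: str
    voiceprint: Sequence[float]  # assumed: an enrolled voice embedding


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_trigger(
    transcript: str, registry: Sequence[RegisteredSpeaker]
) -> Optional[RegisteredSpeaker]:
    """Return the speaker whose preset text appears in the transcript."""
    lowered = transcript.lower()
    for speaker in registry:
        if speaker.trigger_text.lower() in lowered:
            return speaker
    return None


def handle_utterances(
    first_transcript: str,
    second_embedding: Sequence[float],
    second_command: str,
    registry: Sequence[RegisteredSpeaker],
    threshold: float = 0.8,  # assumed value, not taken from the patent
) -> Optional[str]:
    """Return the command to execute, or None if it should be ignored."""
    speaker = detect_trigger(first_transcript, registry)
    if speaker is None:
        return None  # no preset text detected, so no speaker check is run
    score = cosine_similarity(second_embedding, speaker.voiceprint)
    # Execute only when the second signal matches the registered speaker.
    return second_command if score >= threshold else None
```

For example, with a registry containing one speaker whose preset text is "alice", a first utterance of "Hi, this is Alice" followed by a command in a matching voice returns the command, while the same command in a non-matching voice returns `None`.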
Opening claim text (preview).
What is claimed is:
1. A method of performing speech recognition by an electronic device, the method comprising: detecting a first text, which is preset for performing speaker recognition, by performing speech recognition on a first speech signal; obtaining information of a first speaker corresponding the first text; performing speaker recognition based on a second speech signal acquired subsequently to the first speech signal and the information of the first speaker; and executing a function corresponding to the second speech signal, based on a speaker of the second speech signal corresponding to the first speaker by a result of the speaker recognition.
2. The method of claim 1, wherein the performing of the speaker recognition on the second speech signal comprises: acquiring, based on the first speech signal, a speech signal interval in which the first text is uttered, performing the speaker recognition on the speech signal interval, and performing the speaker recognition on the second speech signal, based on a result of performing the speaker recognition on the speech signal interval indicating that a speaker of the speech signal interval corresponds to the first speaker.
3. The method of claim 2, wherein the function corresponding to the second speech signal is executed based on whether a degree of correspondence between the speaker of the second speech signal and the first speaker is greater than or equal to a first reference value, wherein the speaker recognition is performed on the second speech signal based on whether a degree of correspondence between the speaker of the speech signal interval and the first speaker is greater than or equal to a second reference value, and wherein the first reference value is greater than the second reference value.
4. The method of claim 1, wherein the detecting of the first text comprises: performing named entity recognition on a text obtained by performing the speech recognition on the first speech signal, extracting a named entity representing the first speaker from the text by performing the named entity recognition, and detecting, as the first text, the named entity representing the first speaker.
5. The method of claim 1, wherein the second speech signal includes a speech signal acquired during a preset time period after acquisition of the first speech signal.
6. The method of claim 1, further comprising: ignoring the function corresponding to the second speech signal, based on a result of performing the speaker recognition on the second speech signal indicating that the speaker of the second speech signal does not correspond to the first speaker who registered the first text.
7. The method of claim 1, further comprising: detecting a second text, which is preset for performing speaker recognition, by performing speech recognition on a third speech signal acquired after the first speech signal; determining an order of priority for the first speaker and a second speaker who registered the second text, based on a result of performing the speaker recognition on a fourth speech signal indicating that a speaker of the fourth speech signal acquired after the third speech signal corresponds to the second speaker; and executing, based on at least one of the determined order of priority, or a voice command obtained from the fourth speech signal.
8. The method of claim 1, wherein the performing of the speaker recognition of the first speech signal and the second speech signal is based on analyzing signal characteristics comprising a waveform, a frequency, and an amplitude of the first speech signal and the second speech signal.
9. An electronic device for performing speech recognition, the electronic device comprising: a microphone configured to receive first and second speech signals; and at least one processor configured to: detect a first text, which is preset for performing speaker recognition, by performing speech recognition on the first speech signal, obtain information of a first speaker corresponding the first text, perform speaker recognition based on the second speech signal acquired subsequently to the first speech signal and the information of the first speaker, and execute a function corresponding to the second speech signal, based on a speaker of the second speech signal corresponding to the first speaker by a result of the speaker recognition.
10. The electronic device of claim 9, wherein the at least one processor is further configured to: acquire, based on the first speech signal, a speech signal interval in which the first text is uttered, perform the speaker recognition on the speech signal interval, and perform the speaker recognition on the second speech signal, based on a result of performing the speaker recognition on the speech signal interval indicating that a speaker of the speech signal interval corresponds to the first speaker.
11. The electronic device of claim 10, wherein the at least one processor is further configured to: execute the function corresponding to the second speech signal based on whether a degree of correspondence between the speaker of the second speech signal and the first speaker is greater than or equal to a first reference value, and perform the speaker recognition on the second speech signal based on whether a degree of correspondence between the speaker of the speech signal interval and the first speaker is greater than or equal to a second reference value, wherein the first reference value is greater than the second reference value.
12. The electronic device of claim 10, wherein the at least one processor is further configured to: perform named entity recognition on a text obtained by performing the speech recognition on the first speech signal, extract a named entity representing the first speaker from the text by performing the named entity recognition, and detect, as the first text, the named entity representing the first speaker.
13. The electronic device of claim 10, wherein the second speech signal comprises a speech signal acquired during a preset time period after acquisition of the first speech signal.
14. The electronic device of claim 10, wherein the at least one processor is further configured to: ignore the function corresponding to the second speech signal, based on a result of performing the speaker recognition on the second speech signal indicating that the speaker of the second speech signal does not correspond to the first speaker who registered the first text.
15. The electronic device of claim 10, wherein the at least one processor is further configured to: detect a second text, which is preset for performing speaker recognition, by performing speech recognition on a third speech signal acquired after the first speech signal, determine an order of priority for the first speaker and a second speaker who registered the second text, based on a result of performing the speaker recognition on the second speech signal indicating that a speaker of a fourth speech signal acquired after the third speech signal corresponds to the second speaker, and execute a voice command obtained from the fourth speech signal, based on the determined order of priority.
16. A computer program product comprising a non-transitory computer readable recording medium having recorded thereon a plurality of instructions, which when executed by at least one processor, instruct the at least one processor to perform: detecting a first text, which is preset for performing speaker recognition, by performing speech recognition on a first speech signal; obtaini
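Claims 2 and 3 above describe a two-stage check in which the interval containing the preset text is verified against a looser reference value before the follow-up signal is verified against a stricter one, and claim 5 limits the follow-up signal to a preset time period. The sketch below combines those conditions in one gate purely for illustration; the claims do not require them together. `GateConfig`, `should_execute`, the numeric reference values, and the 10-second window are all hypothetical, and the scores are assumed to be speaker-match scores in the range 0 to 1 produced elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateConfig:
    """Illustrative thresholds; none of these values come from the patent."""
    second_reference: float = 0.6  # looser check on the preset-text interval
    first_reference: float = 0.8   # stricter check on the follow-up signal
    window_seconds: float = 10.0   # assumed preset time period (claim 5)

    def __post_init__(self) -> None:
        # Claim 3: the first reference value exceeds the second.
        if not self.first_reference > self.second_reference:
            raise ValueError("first_reference must exceed second_reference")


def should_execute(
    trigger_score: float,
    command_score: float,
    seconds_since_trigger: float,
    config: GateConfig = GateConfig(),
) -> bool:
    """Decide whether the function for the second speech signal may run."""
    # Stage 1: the interval where the preset text was uttered must match
    # the registered speaker at least loosely.
    if trigger_score < config.second_reference:
        return False
    # The follow-up signal must arrive within the preset time period.
    if not 0.0 <= seconds_since_trigger <= config.window_seconds:
        return False
    # Stage 2: the follow-up signal must clear the stricter threshold.
    return command_score >= config.first_reference
```

With the default configuration, `should_execute(0.7, 0.85, 3.0)` passes both stages, while a weak trigger score, a weak command score, or a late follow-up each cause the command to be ignored.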
Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title
Decision making techniques; Pattern matching strategies · CPC title
Interactive procedures; Man-machine interfaces · CPC title
Named entity recognition · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title