Electronic device and operation method for performing speech recognition

US11348588B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11348588-B2
Application number  US-201916545511-A
Country             US
Kind code           B2
Filing date         Aug 20, 2019
Priority date       Aug 20, 2018
Publication date    May 31, 2022
Grant date          May 31, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An electronic device for performing speech recognition and a method therefor are provided. The method includes detecting a first text, which is preset for performing speaker recognition, by performing speech recognition on a first speech signal, performing speaker recognition on a second speech signal acquired after the first speech signal, based on the first text being detected, and executing a voice command obtained from the second speech signal, based on a result of performing the speaker recognition on the second speech signal indicating that a speaker of the second speech signal corresponds to a first speaker who registered the first text.
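The flow in the abstract can be sketched roughly as follows. This is only an illustrative reading, not the patented implementation: the `Utterance` type, the `SpeakerGatedAssistant` class, the use of cosine similarity over a voiceprint vector, and the 0.8 threshold are all assumptions introduced for the example, and transcripts are assumed to come from some upstream speech recognizer.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional, Sequence


@dataclass
class Utterance:
    """A speech signal after hypothetical upstream processing."""
    text: str                    # transcript produced by speech recognition
    voiceprint: Sequence[float]  # speaker embedding (illustrative stand-in)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, used here as an assumed speaker-match score."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SpeakerGatedAssistant:
    """Hypothetical device logic: a preset text triggers speaker recognition."""

    def __init__(self, threshold: float = 0.8) -> None:
        self.threshold = threshold               # assumed value
        self.registered: Dict[str, List[float]] = {}
        self.active_text: Optional[str] = None

    def register(self, text: str, voiceprint: Sequence[float]) -> None:
        """Store the first text together with the registering speaker's voiceprint."""
        self.registered[text.lower()] = list(voiceprint)

    def on_first_signal(self, utt: Utterance) -> bool:
        """Detect a preset first text in the transcript of the first signal."""
        lowered = utt.text.lower()
        for text in self.registered:
            if text in lowered:
                self.active_text = text
                return True
        return False

    def on_second_signal(self, utt: Utterance) -> Optional[str]:
        """Return the command to execute, or None if the speaker does not match."""
        if self.active_text is None:
            return None  # no first text detected, so no speaker recognition
        enrolled = self.registered[self.active_text]
        if cosine(utt.voiceprint, enrolled) >= self.threshold:
            return utt.text
        return None
```

In this sketch a command from the second signal is returned only when a registered text was detected first and the later speaker's voiceprint is close enough to the one stored at registration.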

First claim

Opening claim text (preview).

What is claimed is:

1. A method of performing speech recognition by an electronic device, the method comprising: detecting a first text, which is preset for performing speaker recognition, by performing speech recognition on a first speech signal; obtaining information of a first speaker corresponding the first text; performing speaker recognition based on a second speech signal acquired subsequently to the first speech signal and the information of the first speaker; and executing a function corresponding to the second speech signal, based on a speaker of the second speech signal corresponding to the first speaker by a result of the speaker recognition.

2. The method of claim 1, wherein the performing of the speaker recognition on the second speech signal comprises: acquiring, based on the first speech signal, a speech signal interval in which the first text is uttered, performing the speaker recognition on the speech signal interval, and performing the speaker recognition on the second speech signal, based on a result of performing the speaker recognition on the speech signal interval indicating that a speaker of the speech signal interval corresponds to the first speaker.

3. The method of claim 2, wherein the function corresponding to the second speech signal is executed based on whether a degree of correspondence between the speaker of the second speech signal and the first speaker is greater than or equal to a first reference value, wherein the speaker recognition is performed on the second speech signal based on whether a degree of correspondence between the speaker of the speech signal interval and the first speaker is greater than or equal to a second reference value, and wherein the first reference value is greater than the second reference value.

4. The method of claim 1, wherein the detecting of the first text comprises: performing named entity recognition on a text obtained by performing the speech recognition on the first speech signal, extracting a named entity representing the first speaker from the text by performing the named entity recognition, and detecting, as the first text, the named entity representing the first speaker.

5. The method of claim 1, wherein the second speech signal includes a speech signal acquired during a preset time period after acquisition of the first speech signal.

6. The method of claim 1, further comprising: ignoring the function corresponding to the second speech signal, based on a result of performing the speaker recognition on the second speech signal indicating that the speaker of the second speech signal does not correspond to the first speaker who registered the first text.

7. The method of claim 1, further comprising: detecting a second text, which is preset for performing speaker recognition, by performing speech recognition on a third speech signal acquired after the first speech signal; determining an order of priority for the first speaker and a second speaker who registered the second text, based on a result of performing the speaker recognition on a fourth speech signal indicating that a speaker of the fourth speech signal acquired after the third speech signal corresponds to the second speaker; and executing, based on at least one of the determined order of priority, or a voice command obtained from the fourth speech signal.

8. The method of claim 1, wherein the performing of the speaker recognition of the first speech signal and the second speech signal is based on analyzing signal characteristics comprising a waveform, a frequency, and an amplitude of the first speech signal and the second speech signal.

9. An electronic device for performing speech recognition, the electronic device comprising: a microphone configured to receive first and second speech signals; and at least one processor configured to: detect a first text, which is preset for performing speaker recognition, by performing speech recognition on the first speech signal, obtain information of a first speaker corresponding the first text, perform speaker recognition based on the second speech signal acquired subsequently to the first speech signal and the information of the first speaker, and execute a function corresponding to the second speech signal, based on a speaker of the second speech signal corresponding to the first speaker by a result of the speaker recognition.

10. The electronic device of claim 9, wherein the at least one processor is further configured to: acquire, based on the first speech signal, a speech signal interval in which the first text is uttered, perform the speaker recognition on the speech signal interval, and perform the speaker recognition on the second speech signal, based on a result of performing the speaker recognition on the speech signal interval indicating that a speaker of the speech signal interval corresponds to the first speaker.

11. The electronic device of claim 10, wherein the at least one processor is further configured to: execute the function corresponding to the second speech signal based on whether a degree of correspondence between the speaker of the second speech signal and the first speaker is greater than or equal to a first reference value, and perform the speaker recognition on the second speech signal based on whether a degree of correspondence between the speaker of the speech signal interval and the first speaker is greater than or equal to a second reference value, wherein the first reference value is greater than the second reference value.

12. The electronic device of claim 10, wherein the at least one processor is further configured to: perform named entity recognition on a text obtained by performing the speech recognition on the first speech signal, extract a named entity representing the first speaker from the text by performing the named entity recognition, and detect, as the first text, the named entity representing the first speaker.

13. The electronic device of claim 10, wherein the second speech signal comprises a speech signal acquired during a preset time period after acquisition of the first speech signal.

14. The electronic device of claim 10, wherein the at least one processor is further configured to: ignore the function corresponding to the second speech signal, based on a result of performing the speaker recognition on the second speech signal indicating that the speaker of the second speech signal does not correspond to the first speaker who registered the first text.

15. The electronic device of claim 10, wherein the at least one processor is further configured to: detect a second text, which is preset for performing speaker recognition, by performing speech recognition on a third speech signal acquired after the first speech signal, determine an order of priority for the first speaker and a second speaker who registered the second text, based on a result of performing the speaker recognition on the second speech signal indicating that a speaker of a fourth speech signal acquired after the third speech signal corresponds to the second speaker, and execute a voice command obtained from the fourth speech signal, based on the determined order of priority.

16. A computer program product comprising a non-transitory computer readable recording medium having recorded thereon a plurality of instructions, which when executed by at least one processor, instruct the at least one processor to perform: detecting a first text, which is preset for performing speaker recognition, by performing speech recognition on a first speech signal; obtaini
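
Claims 2, 3, 6, 10, 11, and 14 describe a two-stage check with two reference values. A minimal sketch of one way to read that logic is given below. The function name `two_stage_decision`, the string results, and the numeric defaults 0.8 and 0.6 are assumptions made for illustration; the claims only require that the first reference value be greater than the second.

```python
from typing import Callable


def two_stage_decision(
    interval_score: float,
    score_second_signal: Callable[[], float],
    first_reference: float = 0.8,   # assumed value, applied to the second signal
    second_reference: float = 0.6,  # assumed value, applied to the first-text interval
) -> str:
    """Return "skip", "execute", or "ignore" for a follow-up voice command.

    interval_score: degree of correspondence between the speaker of the
        interval in which the first text was uttered and the first speaker.
    score_second_signal: callable computing the same degree for the second
        speech signal; it is only invoked when the first stage passes.
    """
    if first_reference <= second_reference:
        raise ValueError("first_reference must be greater than second_reference")

    # Stage 1: looser check on the interval containing the first text.
    if interval_score < second_reference:
        return "skip"  # speaker recognition is not performed on the second signal

    # Stage 2: stricter check on the second speech signal.
    if score_second_signal() >= first_reference:
        return "execute"
    return "ignore"
```

Passing the second score as a callable is a design choice of this sketch: it reflects that speaker recognition on the second signal is conditional on the first-stage result, so the more expensive check can be avoided entirely when the first text was not spoken by the registered speaker.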

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Audio in a user interface, e.g. using voice commands for navigating, audio feedback · CPC title

  • Decision making techniques; Pattern matching strategies · CPC title

  • G10L17/22 (Primary)

    Interactive procedures; Man-machine interfaces · CPC title

  • Named entity recognition · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11348588B2 cover?
An electronic device for performing speech recognition and a method therefor are provided. The method includes detecting a first text, which is preset for performing speaker recognition, by performing speech recognition on a first speech signal, performing speaker recognition on a second speech signal acquired after the first speech signal, based on the first text being detected, and executing …
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G10L17/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 31, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).