Speech recognition using electronic device and server

US9640183B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9640183-B2
Application number: US-201514680444-A
Country: US
Kind code: B2
Filing date: Apr 7, 2015
Priority date: Apr 7, 2014
Publication date: May 2, 2017
Grant date: May 2, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An electronic device is provided. The electronic device includes a processor configured to perform automatic speech recognition (ASR) on a speech input by using a speech recognition model that is stored in a memory and a communication module configured to provide the speech input to a server and receive a speech instruction, which corresponds to the speech input, from the server. The electronic device may perform different operations according to a confidence score of a result of the ASR. Besides, it may be permissible to prepare other various embodiments speculated through the specification.

First claim

Opening claim text (preview).

What is claimed is:
1. An electronic device comprising: a processor configured to perform automatic speech recognition (ASR) on a speech input by using a speech recognition model that is stored in a memory; and a communication module configured to provide the speech input to a server and receive a speech instruction, which corresponds to the speech input, from the server, wherein the processor is further configured to: perform an operation corresponding to a result of the ASR if a confidence score of the result of the ASR is higher than a first threshold value, perform the speech instruction, which is received from the server, if the confidence score is between the first threshold value and a second threshold value, and decrease the first threshold value if the result of the ASR corresponds to the speech instruction that is received from the server.
2. The electronic device of claim 1, wherein the processor is further configured to provide a feedback if the confidence score of the result of the ASR is lower than the second threshold value.
3. The electronic device of claim 1, wherein, if the confidence score is higher than the first threshold value, the processor is configured to perform the operation regardless of receipt of the speech instruction from the server.
4. The electronic device of claim 3, wherein the performing of the operation includes performing at least one function executable by the processor, at least one application, or at least one input based on the result of the ASR.
5. The electronic device of claim 1, wherein the providing of the feedback comprises providing a message or audio output to indicate that the speech input has not been recognized or there is low confidence in the result of the ASR.
6. The electronic device of claim 1, wherein the speech instruction received from the server corresponds to a result of speech recognition to the provided speech input, which is performed at the server, based on a speech recognition model different from the speech recognition model stored in the memory.
7. The electronic device of claim 6, wherein the speech recognition performed in the server is configured to comprise natural language processing (NLP).
8. The electronic device of claim 1, wherein the processor is further configured to: provide an audio signal, in which a pre-processing is applied to the speech input, to an ASR engine performing the ASR; and provide the speech input itself to the server through the communication module.
9. The electronic device of claim 1, wherein the processor is further configured to: if the confidence score is higher than the first threshold value, compare the result of the ASR to the speech instruction received from the server; and change the first threshold value based on a result of the comparison.
10. The electronic device of claim 1, wherein the processor is further configured to increase the first threshold value if the result of the ASR does not correspond to the speech instruction.
11. The electronic device of claim 1, wherein the processor is further configured to: if the confidence score is lower than the first threshold value, compare the result of the ASR to the speech instruction received from the server; and update the speech recognition model based on a result of the comparison.
12. The electronic device of claim 11, wherein the communication module is further configured to receive a confidence score with the speech instruction from the server, wherein the processor is further configured to add the speech instruction and the confidence score of the speech instruction, for the speech input, to the speech recognition model.
13. A method of executing speech recognition in an electronic device, the method comprising: obtaining a speech input from a user; generating a speech signal corresponding to the obtained speech; performing first speech recognition on at least a part of the speech signal; acquiring first operation information and a first confidence score; transmitting at least a part of the speech signal to a server for second speech recognition; receiving second operation information, which corresponds to the transmitted signal, from the server; performing a function corresponding to the first operation information if the first confidence score is higher than a first threshold value; performing a function corresponding to the second operation information if the first confidence score is between the first threshold value and a second threshold value; and decreasing the first threshold value if the function corresponding to the first operation information is identical to the function corresponding to the second operation information.
14. The method of claim 13, wherein, if the first confidence score is higher than the first threshold value, the performing of the function corresponding to the first operation information is performed before the receiving of the second operation information.
15. The method of claim 13, further comprising: providing a feedback if the first confidence score is lower than the second threshold value.
16. The method of claim 13, further comprising: increasing the first threshold if the function corresponding to the first operation information is different from the function corresponding to the second operation information.
17. The method of claim 13, wherein the receiving of the second operation information further includes receiving a second confidence score of the second operation information.
18. The method of claim 17, further comprising: if the first confidence score is lower than the first threshold value, adding the second operation information and the second confidence score, for the speech input, to a speech recognition model that is used in the first speech recognition.
19. The method of claim 17, further comprising: if the first operation information corresponds to the second operation information, adding the second operation information and the second confidence score to a speech recognition model, which is used in the first speech recognition, based on the first confidence score and second confidence score.
20. A non-transitory computer readable recording medium having instructions recorded thereon, the instructions implement a method of executing speech recognition in an electronic device, the method comprising: obtaining a speech input from a user; generating a speech signal corresponding to the obtained speech; performing first speech recognition on at least a part of the speech signal; acquiring first operation information and a first confidence score; transmitting at least a part of the speech signal to a server for second speech recognition; receiving second operation information, which corresponds to the transmitted signal, from the server; performing a function corresponding to the first operation information if the first confidence score is higher than a first threshold value; performing a function corresponding to the second operation information if the first confidence score is between the first threshold value and a second threshold value; and decreasing the first threshold value if the function corresponding to the first operation information is identical to the function corresponding to the second operation information.
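
Illustrative sketch (not part of the patent text). The two-threshold decision in claims 1 to 3 and the threshold adjustment in claims 1 and 10 can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the patented implementation: the class name HybridRecognizer, the method names, the numeric threshold values, the fixed adjustment step, the "pending" outcome when no server reply has arrived, and the clamping of the first threshold are all hypothetical and do not appear in the claims.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class HybridRecognizer:
    # Hypothetical values; the claims do not specify any numbers.
    first_threshold: float = 0.8   # above this, act on the on-device ASR result
    second_threshold: float = 0.4  # below this, give feedback to the user
    step: float = 0.05             # assumed size of each threshold adjustment

    def decide(self, local_result: str, confidence: float,
               server_instruction: Optional[str] = None) -> Tuple[str, Optional[str]]:
        """Choose what to do for one utterance, following claims 1 to 3."""
        if confidence > self.first_threshold:
            # Claim 3: act on the local result regardless of the server reply.
            return ("local", local_result)
        if confidence >= self.second_threshold:
            if server_instruction is None:
                # Assumption: the claims do not say what happens while waiting.
                return ("pending", None)
            # Claim 1: between the thresholds, perform the server's instruction.
            return ("server", server_instruction)
        # Claim 2: below the second threshold, provide feedback.
        return ("feedback", "speech input not recognized")

    def adapt(self, local_result: str, server_instruction: str) -> None:
        """Adjust the first threshold after comparing both results."""
        if local_result == server_instruction:
            # Claim 1: agreement lowers the first threshold.
            lowered = self.first_threshold - self.step
            # Assumption: never drop below the second threshold.
            self.first_threshold = max(self.second_threshold, lowered)
        else:
            # Claim 10: disagreement raises the first threshold.
            raised = self.first_threshold + self.step
            # Assumption: cap at 1.0, treating scores as lying in [0, 1].
            self.first_threshold = min(1.0, raised)
```

In this sketch, calling decide("open camera", 0.9) returns the local result immediately, while decide("open camera", 0.6, "open camera app") defers to the server. Repeated agreement between the two recognizers gradually lowers first_threshold, so more utterances are handled on the device without waiting for the server; disagreement moves it back up.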

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Feedback of the input speech · CPC title

  • G10L17/22 (Primary)

    Interactive procedures; Man-machine interfaces · CPC title

  • Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title

  • G10L15/30 (Primary)

    Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9640183B2 cover?
An electronic device is provided. The electronic device includes a processor configured to perform automatic speech recognition (ASR) on a speech input by using a speech recognition model that is stored in a memory and a communication module configured to provide the speech input to a server and receive a speech instruction, which corresponds to the speech input, from the server. The electronic…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G10L17/22. Mapped technology areas include Physics.
When was this patent published?
Publication date May 2, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).