Real-time speech analysis and method using speech recognition and comparison with standard pronunciation

US10586556B2 · US · B2

Patent metadata
Publication number: US-10586556-B2
Application number: US-201313930546-A
Country: US
Kind code: B2
Filing date: Jun 28, 2013
Priority date: Jun 28, 2013
Publication date: Mar 10, 2020
Grant date: Mar 10, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of providing real-time speech analysis for a user includes capturing a speech input, performing a real-time recognition of the speech input including converting the speech input to a text using an automatic speech recognition component, analyzing the recognized speech input, by a processing unit of a computer in a speech recognition and analyzing system, to identify an error in the user's speech, and by comparing a voice of a correct text generated by a speech generation and analyzing system with the captured speech input, and providing a real-time correction to the user based on a result of the comparing the voice of the correct text with the captured speech input. The comparing the voice of the correct text with the captured speech input includes comparing a standard pronunciation of the correct text with a pronunciation of the user in the captured speech input to identify the error.
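
The comparison step in the abstract can be illustrated with a minimal sketch. The patent does not specify a comparison algorithm, so this example assumes both pronunciations are already available as phoneme lists (hypothetical ARPAbet-style symbols that an ASR component and a speech generation component would supply upstream) and uses a generic sequence alignment from Python's standard library as a stand-in. The names `PronunciationError` and `compare_pronunciation` are illustrative and do not come from the patent.

```python
from difflib import SequenceMatcher
from typing import List, NamedTuple


class PronunciationError(NamedTuple):
    kind: str            # "replace", "delete", or "insert"
    expected: List[str]  # phonemes in the standard pronunciation
    heard: List[str]     # phonemes in the user's pronunciation


def compare_pronunciation(standard: List[str], user: List[str]) -> List[PronunciationError]:
    """Align two phoneme sequences and report every span where they differ."""
    # autojunk=False keeps the heuristic from discarding frequent phonemes.
    matcher = SequenceMatcher(a=standard, b=user, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors.append(PronunciationError(tag, standard[i1:i2], user[j1:j2]))
    return errors


# Assumed phonemes for the word "think": standard form vs. a user who says "sink".
standard = ["TH", "IH", "NG", "K"]
user = ["S", "IH", "NG", "K"]
errors = compare_pronunciation(standard, user)
for err in errors:
    print(err.kind, err.expected, err.heard)
```

A real system would likely compare acoustic features or scored phoneme hypotheses rather than exact symbols; the list alignment here only shows where a mismatch between the standard and the user's pronunciation would be flagged.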

First claim

Opening claim text (preview).

What is claimed is:

1. A method of providing real-time speech analysis for a user, said method comprising: capturing a speech input; performing a real-time recognition of the speech input including converting the speech input to a text using an automatic speech recognition component; analyzing the recognized speech input, by a processing unit of a computer in a speech recognition and analyzing system, to identify an error in the user's speech, and by comparing a voice of a correct text generated by a speech generation and analyzing system with the captured speech input; and providing a real-time correction to the user based on a result of the comparing the voice of the correct text with the captured speech input, wherein said speech input comprises speech from the user, and at least one additional speaker, and wherein the comparing the voice of the correct text with the captured speech input includes comparing a standard pronunciation of the correct text with a pronunciation of the user in the captured speech input to identify the error in the user's speech.

2. The method according to claim 1, further comprising: processing the text to extract a contextual conversation cue.

3. The method according to claim 2, wherein the contextual conversation cue is used to detect at least one of a candidate sound, a candidate word and a candidate phrase, for correction.

4. The method according to claim 1, further comprising: extracting errors made by the user; summarizing frequent error patterns with the help of a machine learning algorithm; and storing, in a user profile, at least one of the errors made by the user and the frequent error patterns.

5. The method according to claim 4, wherein said user profile comprises at least one of a user nationality, a user accent, and a user history, said user history comprising at least one of an analyzed user speech, a previous response to an identified error, a previous user feedback, and a user error tolerance preference.

6. The method according to claim 1, wherein said possible the error comprises at least one of a mispronunciation, a syntactical error, and a grammatical error.

7. The method according to claim 1, wherein said analyzing comprises a conversational semantic analysis in which phonetics of the correct text are compared with phonetics of an original pronunciation in the captured speech input, and the correct text is sent through the speech generation and analyzing system to compare a speech output from the speech generation and analyzing system with the original pronunciation in the captured speech input.

8. The method according to claim 1, wherein said performing real-time recognition comprises using a speech cue from at least one additional speaker.

9. The method according to claim 1, wherein said potential error is identified by using a contextual conversation cue.

10. The method according to claim 1, further comprising: outputting to the user at least one of an identified error, a visual correction, an audible correction, and a suggested synonym.

11. The method according to claim 1, further comprising: processing the text to extract contextual dialog prompts, wherein the contextual dialog prompts detect at least one of a candidate sound, a candidate word, and a candidate phrase for the real-time correction.

12. The method according to claim 1, wherein the real-time correction is provided while the user is not actively using the speech recognition and analyzing system.

13. The method according to claim 1, wherein the real-time correction is provided without an active practicing of the user.

14. A non-transitory computer-readable storage medium tangibility embodying a program of machine-readable instructions executable by a digital processing apparatus to perform the method according to claim 1.

15. A system for providing real-time speech analysis, said system comprising: a capture component for capturing a speech input; an Automatic Speech Recognition (ASR) component for performing real-time recognition of the speech input including converting the speech input to a text; an analysis component for analyzing the recognized speech input to identify an error, and by comparing a voice of a correct text generated by the analysis component with the captured speech input, wherein a real-time correction is provided to a user based on a result of the comparing the voice of the correct text with the captured speech input; and a lesson planner component for arranging at least one of a pre-defined lesson and an automatically created lesson, wherein the comparing the voice of the correct text with the captured speech input includes comparing a standard pronunciation of the correct text with a pronunciation of the user in the captured speech input to identify the error.

16. The system according to claim 15, wherein said analysis component generates a predicted speech meaning based on said speech input.

17. The system according to claim 16, wherein said error is identified by comparing said predicted speech meaning to said speech input.

18. The system according to claim 15, wherein said analysis component analyzes the recognized speech input by using a conversation in which phonetics of the correct text are compared with phonetics of an original pronunciation in the captured speech input, and the correct text is sent through an automatic speech generation system (ASG) to compare a speech output from the ASG with the original pronunciation in the captured speech input.

19. The system according to claim 15, further comprising: an error summary component for determining one or more error patterns.

20. The system according to claim 15, further comprising: a user profile component which stores at least one of an error summary, and a user error pattern.

21. The system according to 15, wherein said capturing comprises at least one of continuously monitoring said speech input and continuously receiving said speech input.

22. The system according to claim 15, wherein the error is made by the user.

23. A method for providing a real time speech correction in a conversation context, the method comprising: using an automatic speech recognition (ASR) system to convert speech of a plurality of speakers to a text, said plurality of speakers including a user; processing the text to extract a contextual conversation cue; using said cue to detect at least one of a candidate sound, a candidate word, and a candidate phrase; comparing a candidate list with information from a user profile by comparing a voice of a correct text generated by a speech generation and analysis system with a voice of the user inputted to the ASR system; using a comparison result of the voice of the correct text with the captured speech input to suggest at least one of a real-time correction and a synonym; and informing the user through at least one of an audio feedback, a graphical feedback, and a textual feedback of said at least one of said correction and said synonym, wherein said speech input comprises speech from the user, and at least one additional speaker, and wherein the comparing the voice of the correct text with the voice of the user includes comparing a standard pronunciation of the correct text with a pronunciation of the user in the captured speech input to identify an error in the user's speech.
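
Claims 4, 5, 19 and 20 describe storing a user's errors and summarized error patterns in a user profile. The sketch below is one possible reading, not the patented implementation: the claims call for a machine learning algorithm, which they do not name, and this example substitutes a plain frequency count as a simple stand-in. The `UserProfile` class, its fields, and the `min_count` threshold are all hypothetical names chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple

ErrorPair = Tuple[str, str]  # (expected sound, heard sound)


@dataclass
class UserProfile:
    nationality: str = ""
    accent: str = ""
    errors: List[ErrorPair] = field(default_factory=list)
    frequent_patterns: List[Tuple[ErrorPair, int]] = field(default_factory=list)

    def record(self, new_errors: List[ErrorPair], min_count: int = 2) -> None:
        """Store new errors, then refresh the patterns seen at least min_count times."""
        self.errors.extend(new_errors)
        counts = Counter(self.errors)
        # Frequency counting stands in for the unspecified learning algorithm.
        self.frequent_patterns = [
            (pair, n) for pair, n in counts.most_common() if n >= min_count
        ]


profile = UserProfile(accent="hypothetical-accent")
profile.record([("TH", "S"), ("V", "W")])    # errors from a first conversation
profile.record([("TH", "S"), ("TH", "S")])   # errors from a later conversation
print(profile.frequent_patterns)
```

Keeping the raw error list alongside the summarized patterns mirrors the claim language, which allows storing "at least one of the errors made by the user and the frequent error patterns".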

Assignees

IBM

Inventors

Classifications

  • Speaking (with audible presentation of the material to be studied G09B5/04) · CPC title

  • Foreign languages (with audible presentation of material to be studied G09B5/04) · CPC title

  • Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • G10L25/48 (Primary): specially adapted for particular use · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10586556B2 cover?
A method of providing real-time speech analysis for a user includes capturing a speech input, performing a real-time recognition of the speech input including converting the speech input to a text using an automatic speech recognition component, analyzing the recognized speech input, by a processing unit of a computer in a speech recognition and analyzing system, to identify an error in the use…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G10L25/48. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 10, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).