System and method for supporting automatic speech recognition of regional accents based on statistical information and user corrections

US10468016B2 · US · B2

Patent metadata
Publication number: US-10468016-B2
Application number: US-201514950182-A
Country: US
Kind code: B2
Filing date: Nov 24, 2015
Priority date: Nov 24, 2015
Publication date: Nov 5, 2019
Grant date: Nov 5, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein is a system for compensating for dialects and accents comprising an automatic speech recognition system comprising an automatic speech recognition device that is operative to receive an utterance in an acoustic format from a user with a user interface; a speech to text conversion engine that is operative to receive the utterance from the automatic speech recognition device and to prepare a textual statement of the utterance; and a correction database that is operative to store textual statements of all utterances; where the correction database is operative to secure a corrected transcript of the textual statement of the utterance from the speech to text conversion engine and adds it to the corrections database if the corrected transcript of the textual statement of the utterance is not available.
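
The store-if-missing behaviour of the correction database described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the class name CorrectionDatabase, its methods, and the in-memory dictionary are all assumptions made for illustration, and the sample utterance is invented.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CorrectionDatabase:
    """Hypothetical in-memory stand-in for the correction database."""

    # Maps a raw speech-to-text statement to its corrected transcript.
    _corrections: Dict[str, str] = field(default_factory=dict)

    def lookup(self, statement: str) -> Optional[str]:
        """Return the stored corrected transcript, or None if there is none."""
        return self._corrections.get(statement)

    def add_if_missing(self, statement: str, corrected: str) -> bool:
        """Store the correction only when no corrected transcript is available.

        Returns True if the correction was added, False if one already existed.
        """
        if statement in self._corrections:
            return False
        self._corrections[statement] = corrected
        return True


def transcribe(statement: str, db: CorrectionDatabase) -> str:
    """Return the stored correction if one exists, else the raw statement."""
    corrected = db.lookup(statement)
    return corrected if corrected is not None else statement


db = CorrectionDatabase()
db.add_if_missing("pahk the cah", "park the car")  # invented sample utterance
print(transcribe("pahk the cah", db))
```

Keying the store on the raw textual statement is a simplification; the abstract does not say how entries are indexed.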

First claim

Opening claim text (preview).

What is claimed is:

1. A system for compensating for dialects and accents comprising: an automatic speech recognition system comprising: an automatic speech recognition device that is operative to receive an utterance in an acoustic format from a user with a user interface; a speech to text conversion engine that is operative to receive the utterance from the automatic speech recognition device and to prepare a textual statement of the utterance; a processor, wherein the processor logs a location comprising latitude and longitude data associated with the user and compares the speech to text conversion obtained from the speech to text conversion engine against a dialect database to correct for accents or dialects and wherein the processor correlates the location of the user with phonemic parameters associated with the dialect database to reduce a number of choices of accents or dialects for correction, wherein the processor further obtains location data from the user interface and derives context data for the user thereof, wherein the data includes birthplace locations of the population at the location; and a correction database that is operative to store textual statements of all utterances, wherein the correction database is operative to secure a corrected transcript of the textual statement of the utterance from the speech to text conversion engine and adds it to the corrections database if the corrected transcript of the textual statement of the utterance is not available, wherein in response to an inability to determine a user's accent from context and location based information of the user, the transcript is corrected based on a desired pronunciation of the user; wherein the processor accesses information about the user from one or more directory-based websites to further correlate phonemic parameters associated with the dialect database to further reduce a number of choices of accents or dialects for correction, wherein the one or more directory-based websites contain historical and educational information about the user, wherein the historical and educational information on the directory-based websites is entered onto the directory-based web sites independently from the mobile user, wherein information from the one or more directory-based websites is coupled with the data regarding the population.

2. The system of claim 1, where the automatic speech recognition device is further operative to communicate with the correction database to determine whether a corrected textual statement of the utterance is available prior to adding the corrected transcript of the textual statement of the utterance to the database.

3. The system of claim 2, where the automatic speech recognition device is further operative to communicate with the user to send to the user the speech to text conversion obtained from the speech to text conversion engine.

4. The system of claim 3, where the automatic speech recognition device further comprises a sound transducer for reducing extraneous noise and providing an acoustically accurate signal to the speech to text conversion engine.

5. The system of claim 4, where the automatic speech recognition device is further operative to query databases to obtain dialect data that provides a correlation between a user's history and his/her dialect or accent.

6. The system of claim 5, where the speech to text engine provides the user with a text conversion of his speech and where the automatic speech recognition system is operative to afford the user an opportunity to correct the text conversion of his speech.

7. The system of claim 6, where the user corrected text conversion is added to the correction database along with a tag that contains dialect information.

8. The system of claim 7, wherein the dialect database stores information defining a dialect.

9. The system of claim 8, wherein the dialect is defined by at least one of the following: the user's age, the user's gender, a level of education associated with the user, a type of work associated with the user, whether the user is a native speaker of a language associated with the utterance, where the user grew up and where the user currently lives.

10. A method comprising: receiving, by an automatic speech recognition device, an utterance from a mobile user with a user interface; receiving, by a speech to text conversion engine, the utterance from the automatic speech recognition device, wherein the speech to text conversion engine prepares a textual statement of the utterance; logging, by the automatic speech recognition device, a location comprising latitude and longitude data associated with the mobile user and comparing the speech to text conversion obtained from the speech to text conversion engine against a dialect database to correct for accents or dialects; correlating, by the automatic speech recognition device, the location of the mobile user with phonemic parameters associated with the dialect database to reduce a number of choices of accents or dialects for correction; obtaining location data from the user interface and deriving context data for the user thereof, wherein the data includes information wherein the data includes birthplace locations of the population at the location; accessing, by the automatic speech recognition device, information about the mobile user from one or more directory-based websites to further correlate phonemic parameters associated with the dialect database to further reduce a number of choices of accents or dialects for correction, wherein the one or more directory-based websites contain historical and educational information about the user, and wherein the historical and educational information on the directory-based websites is entered onto the directory-based websites independently from the mobile user, wherein information from the one or more directory-based websites is coupled with the data regarding the population; storing, by a correction database, textual statements of all utterances; securing, from the correction database, a corrected transcript of the textual statement of the utterance from the speech to text conversion engine, wherein in response to an inability to determine a user's accent from context and location based information of the user, the transcript is corrected based on a desired pronunciation of the user; and adding the corrected transcript of the textual statement of the utterance from the speech to text conversion engine to the corrections database if the corrected transcript of the textual statement of the utterance is not available.

11. The method of claim 10, further comprising performing sound filtering for reducing extraneous noise on the user's speech and providing an acoustically accurate signal to the speech to text conversion engine.

12. The method of claim 11, where the automatic speech recognition further queries databases to obtain dialect data that provides a correlation between a user's history and his/her dialect or accent.

13. The method of claim 12, further comprising providing the user with a text conversion of his/her speech.

14. The method of claim 13, further comprising offering the user an opportunity to correct the text conversion of his speech.

15. The method of claim 14, further comprising adding a user corrected text conversion with a tag that contains dialect information to the correction database.

16. A computer program product comprising: a non-transitory storage medium readable by a processing circuit and storing instructions for execution by the processing circuit for performing a method comprising: receiving an
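
The location step in claims 1 and 10, where logged latitude and longitude data is correlated with the dialect database to reduce the number of candidate accents or dialects, might look roughly like the sketch below. The claims do not say how the correlation is computed. The great-circle distance filter, the DialectRegion record, the region names, and the radii are all assumptions chosen for illustration, and the phonemic parameters themselves are not modelled.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DialectRegion:
    """Hypothetical dialect-database entry: a named area on the map."""

    name: str
    lat: float
    lon: float
    radius_km: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def candidate_dialects(lat: float, lon: float, regions: List[DialectRegion]) -> List[str]:
    """Keep only the dialects whose region contains the logged location.

    An empty result means location alone could not narrow the choice.
    """
    return [
        r.name
        for r in regions
        if haversine_km(lat, lon, r.lat, r.lon) <= r.radius_km
    ]


# Invented sample regions; a real dialect database would be far richer.
regions = [
    DialectRegion("boston", 42.36, -71.06, 100.0),
    DialectRegion("glasgow", 55.86, -4.25, 100.0),
]
print(candidate_dialects(42.37, -71.11, regions))
```

An empty list loosely corresponds to the case in the claims where the accent cannot be determined from context and location, so the transcript is corrected from the user's desired pronunciation instead.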

Assignees

IBM

Inventors

Classifications

  • G10L15/26 (Primary)

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • updating or merging of old and new templates; Mean values; Weighting · CPC title

  • using non-speech characteristics · CPC title

  • G10L15/005 (Primary)

    Language recognition · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10468016B2 cover?
Disclosed herein is a system for compensating for dialects and accents comprising an automatic speech recognition system comprising an automatic speech recognition device that is operative to receive an utterance in an acoustic format from a user with a user interface; a speech to text conversion engine that is operative to receive the utterance from the automatic speech recognition device and …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G10L15/26. Mapped technology areas include Physics.
When was this patent published?
Publication date: Nov 5, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).