Methods, systems and apparatuses for improved speech recognition and transcription
US-11869507-B2 · Jan 9, 2024 · US
US10147417B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10147417-B2 |
| Application number | US-201615284035-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 3, 2016 |
| Priority date | Oct 3, 2016 |
| Publication date | Dec 4, 2018 |
| Grant date | Dec 4, 2018 |
Official abstract text for this publication.
A speech recognizer performs speech recognition on a spoken name supplied by a user, producing a list of possible matches and corresponding confidence scores, by comparing a packetized voice stream of a spoken name to a plurality of stored phonemes that represent users' text names. If the top scoring match for a spoken name does not correctly identify the spoken name or if the spoken name's confidence score is below a first threshold, the user name is flagged to the system administrator as having a potential speech recognition problem. The results of the speech recognition are used to suggest names whose spelling may need to be adjusted to resolve the speech recognition problem. During production, a low threshold for rejecting speech recognition results can be adjusted downwards for names that produced low scores during testing. Heuristics are presented for re-testing only a subset of names when the set of names is changed.
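The flagging rule in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `Match` type, the function name `has_recognition_problem`, the 0.0 to 1.0 score scale, and the choice to flag an empty result list are all assumptions made for the example. The comparison follows the claim wording ("does not exceed the selected first threshold").

```python
from dataclasses import dataclass


@dataclass
class Match:
    """One candidate from the recognizer (hypothetical type for this sketch)."""
    name: str          # candidate text name from the list of names
    confidence: float  # recognizer confidence; a 0.0 to 1.0 scale is assumed


def has_recognition_problem(expected_name, matches, first_threshold):
    """Return True when a test recognition should be flagged.

    A name is flagged when the top-scoring match is not the expected name,
    or when the top match's confidence does not exceed the first threshold.
    """
    if not matches:
        # Assumption: an empty result list is treated as a problem.
        return True
    top = max(matches, key=lambda m: m.confidence)
    wrong_top = top.name != expected_name
    low_score = top.confidence <= first_threshold
    return wrong_top or low_score


# Example: the user said "John Smyth", but "Jon Smith" scored highest.
results = [Match("Jon Smith", 0.81), Match("John Smyth", 0.64)]
print(has_recognition_problem("John Smyth", results, first_threshold=0.70))
```

In this example the name would be flagged because the top match is wrong, even though its score clears the threshold; the lower-scoring candidates are the ones an administrator might inspect for a spelling adjustment.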
Opening claim text (preview).
What is claimed is:

1. A system comprising: a microprocessor; and a computer readable medium, coupled with the microprocessor and comprising microprocessor readable and executable instructions that program the microprocessor to: capture a list of text names that represent a plurality of user's names; receive, from a communication endpoint, via a network, a packetized voice stream of a spoken name received from a user of the plurality of users; perform a test recognition on the spoken name against the list of text names to produce a list of possible matches and corresponding confidence scores, wherein the test recognition on the spoken name compares the packetized voice stream of the spoken name to a plurality of portions of electronically stored sound representations comprising phonemes that represent the plurality of user's text names in the list of text names; determine whether a top match correctly identifies the spoken name and the top match's confidence score exceeds a selected first threshold; and in response to determining that the top match does not correctly identify the spoken name or that the top match's confidence score does not exceed the selected first threshold, flag the spoken name as having a recognition problem.

2. The system of claim 1, wherein the microprocessor readable and executable instructions further program the microprocessor to: receive a new spoken name; determine a confidence score for the new spoken name using the list of text names; and determining, if a top confidence score for the new spoken name is one of: a top confidence score that is above the selected first threshold; below a selected second threshold; not a top confidence score that is above the selected first threshold; or is below the selected first threshold, but above the selected second threshold.

3. The system of claim 2, wherein a confidence score for the new spoken name is not a top confidence score in the list of possible matches, but is above the selected first threshold and wherein the microprocessor readable and executable instructions further program the microprocessor to generate for display, all text names whose confidence scores are above the selected first threshold.

4. The system of claim 2, wherein a confidence score for the new spoken name is below the selected first threshold, but above the selected second threshold and wherein the microprocessor readable and executable instructions further program the microprocessor to generate for display, a top name associated with the new spoken name and text names whose confidence scores are below a confidence score associated with the top name for the new spoken name, but above the selected second threshold.

5. The system of claim 2, wherein a confidence score for the new spoken name is used for an input in one of an email system, a text messaging system, an Instant Messaging (IM) system, a telephone, or a video phone.

6. The system of claim 2, wherein the second selected threshold is a name-dependent threshold obtained from testing a user's spoken name.

7. The system of claim 1, wherein the microprocessor readable and executable instructions further program the microprocessor to re-test a subset of the list of text names when a spelling of a name in the list of text names is changed, a new name is added to the list of text names, or an existing name is deleted from the list of text names.

8. The system of claim 7, wherein names that were not correctly identified by the top match or scored below the first threshold when previously tested are recursively re-tested.

9. The system of claim 7, wherein other names that appear as possible matches for the added new name are re-tested.

10. The system of claim 7, wherein names for which the deleted existing name was a possible match when previously tested are re-tested.

11. The system of claim 1, wherein the microprocessor readable and executable instructions further program the microprocessor to display a set of text names to examine when a name is flagged as having a speech recognition problem.

12. The system of claim 11, where the speech recognition problem is a same user name with a different pronunciation.

13. A method comprising: capturing, by a microprocessor, a list of text names that represent a plurality of user's names; receiving, by the microprocessor, from a communication endpoint via a network a packetized voice stream of a spoken name received from a user of the plurality of users; perform, by the microprocessor, a test recognition on the spoken name against the list of text names to produce a list of possible matches and corresponding confidence scores, wherein the test recognition on the spoken name compares the packetized voice stream of the spoken name to a plurality of portions of electronically stored sound representations comprising a plurality of phonemes that represent the plurality of user's text names in the list of text names; determining, by the microprocessor, whether a top match correctly identifies the spoken name and the top match's confidence score exceeds a selected first threshold; and in response to determining that the top match does not correctly identify the spoken name or that the top match's confidence score does not exceed the selected first threshold, flagging, by the microprocessor, the spoken name as having a recognition problem.

14. The method of claim 13, further comprising: receiving, by the microprocessor, a new spoken name; determining, by the microprocessor, a confidence score for the new spoken name using the list of text names; and determining, by the microprocessor, if a top confidence score for the new spoken name is one of: a top confidence score that is above the selected first threshold; below a selected second threshold; not a top confidence score that is above the selected first threshold; or is below the selected first threshold, but above the selected second threshold.

15. The method of claim 14, wherein a confidence score for the new spoken name is not a top confidence score in the list of possible matches, but is above the selected first threshold and further comprising: generating for display, by the microprocessor, all text names whose confidence scores are above the selected first threshold.

16. The method of claim 14, wherein a confidence score for the new spoken name is below the selected first threshold, but above the selected second threshold and further comprising: generating for display, by the microprocessor, a top name associated with the new spoken name and text names whose confidence scores are below a confidence score associated with the top name for the new spoken name, but above the selected second threshold.

17. The method of claim 14, wherein a confidence score for the new spoken name is used for an input in one of an email system, a text messaging system, an Instant Messaging (IM) system, a telephone, or a video phone.

18. The method of claim 13, further comprising: re-testing, by the microprocessor, a subset of the list of text names when a spelling of a name in the list of text names is changed, a new name is added to the list of text names, or an existing name is deleted from the list of text names.

19. The method of claim 18, wherein names that were not correctly identified by the top match or scored below the first threshold when previously tested are recursively re-tested.

20. The method of claim 18, wherein names for which the deleted existing name that are a possible match when previously
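The re-test heuristics in claims 7 through 10 can be illustrated with a small Python sketch. It is a loose reading of the claim language under stated assumptions, not the patent's implementation: the function name `names_to_retest`, its parameters, and the dictionary layout of earlier results are hypothetical. The sketch makes a single selection pass and does not model the "recursively" wording of claims 8 and 19. Including the newly added name itself in the re-test set is an assumption, as is the idea that a spelling change could be handled as a deletion followed by an addition.

```python
def names_to_retest(previous_matches, failed_names,
                    added=None, added_matches=(), deleted=None):
    """Select the subset of names to re-test after the name list changes.

    previous_matches: dict mapping each tested name to the set of names that
        appeared as its possible matches in the earlier test (assumed layout).
    failed_names: names that were misidentified or scored below the first
        threshold when previously tested.
    added: a newly added name, if any.
    added_matches: names returned as possible matches when the new name was tested.
    deleted: a name removed from the list, if any.
    """
    # Claim 8: names that failed before are re-tested.
    retest = set(failed_names)

    if added is not None:
        retest.add(added)             # assumption: the new name is tested too
        retest.update(added_matches)  # claim 9: possible matches of the new name

    if deleted is not None:
        # Claim 10: names for which the deleted name was a possible match.
        for name, possible in previous_matches.items():
            if deleted in possible:
                retest.add(name)
        retest.discard(deleted)       # the deleted name is no longer in the list

    return retest


previous = {
    "Ann Lee": {"Ann Lee", "Anne Li"},
    "Anne Li": {"Anne Li", "Ann Lee"},
    "Bob Ray": {"Bob Ray"},
}
print(sorted(names_to_retest(previous, failed_names={"Bob Ray"}, deleted="Ann Lee")))
# prints ['Anne Li', 'Bob Ray']
```

Here "Anne Li" is selected because the deleted name "Ann Lee" was one of its possible matches, and "Bob Ray" because it failed earlier; names with no connection to the change are left alone, which is the point of re-testing only a subset.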
for comparison or discrimination · CPC title
Speech recognition (G10L17/00 takes precedence) · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Assessment or evaluation of speech recognition systems · CPC title
Physics · mapped topic