Systems, computer-implemented methods, and tangible computer-readable storage media for transcription alignment
US-9305552-B2 · Apr 5, 2016 · US
US10542141B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10542141-B2 |
| Application number | US-201715477954-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 3, 2017 |
| Priority date | Feb 28, 2014 |
| Publication date | Jan 21, 2020 |
| Grant date | Jan 21, 2020 |
Official abstract text for this publication.
A system and method, the method comprising receiving, at a processor, a hearing user's (HU's) voice signal from an HU participant during a first call, wherein the first call is a communication across a communication network, the processor programmed to perform the steps of examining at least a portion of the voice signal to determine an identity of the HU, identifying a voice model associated with the identity of the HU, wherein the voice model comprises a plurality of rules, and further wherein each of the plurality of rules is a rule for transcribing a word in the HU voice signal into text, and generating a text output, wherein the text output is a transcription of a plurality of words identified in the HU voice signal, and further wherein at least one of the plurality of words identified is identified using at least one rule of the plurality of rules.
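The flow described in the abstract (determine the HU's identity, look up the voice model tied to that identity, then apply its per-word rules to produce text) can be illustrated with a toy sketch. Everything named here is hypothetical and not taken from the patent: `VoiceModel`, `MODELS`, `model_for`, and `transcribe_call` are illustrative names, and the "rules" are reduced to a plain token-to-word mapping over already-recognized tokens, which stands in for real acoustic processing.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceModel:
    """Hypothetical per-speaker model: each rule maps one recognized token to a word."""
    rules: dict[str, str] = field(default_factory=dict)

    def transcribe(self, tokens: list[str]) -> str:
        # Apply a rule where one exists; otherwise pass the token through unchanged.
        return " ".join(self.rules.get(token, token) for token in tokens)


# Hypothetical registry keyed by a unique HU identifier (for example, a phone number).
MODELS: dict[str, VoiceModel] = {}


def model_for(hu_id: str) -> VoiceModel:
    """Return the model associated with this HU identity, creating an empty one if needed."""
    return MODELS.setdefault(hu_id, VoiceModel())


def transcribe_call(hu_id: str, tokens: list[str]) -> str:
    """Generate text output for one utterance using the identified HU's model."""
    return model_for(hu_id).transcribe(tokens)
```

In this sketch an unknown caller simply gets an empty rule set, so the tokens pass through unchanged until rules are added for that identity.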
Opening claim text (preview).
What is claimed is:

1. A method comprising: receiving, at the processor, a hearing user's (HU's) voice signal from an HU that is participating in a first call with an assisted user (AU) wherein the first call is a communication across a communication network; determining, by the processor, an identity of the HU; training, by the processor, a voice-to-text model associated with the identity of the specific HU that is participating in the first call, wherein the voice-to-text model comprises a plurality of rules, and further wherein each of the plurality of rules is a rule for transcribing a word in the HU's voice signal into text; and generating, by the processor, a text output, wherein the text output is a transcribed version of a plurality of words identified in the HU's voice signal, and further wherein at least one of the plurality of words identified is identified using at least one rule of the plurality of rules; while the first call is ongoing, automatically determining when accuracy of the voice-to-text model has exceeded a threshold level; and wherein, after accuracy of the voice-to-text model has exceeded a threshold level, the text output generated using the voice-to-text model is provided to the AU.

2. The method of claim 1, wherein a relationship is established between the voice-to-text model and the HU identity.

3. The method of claim 2, wherein the training the voice-to-text model comprises: receiving, by the processor, a unique identifier, wherein the unique identifier is unique to the HU; and establishing, by the processor, the voice-to-text model by associating the voice-to-text model with the unique identifier.

4. The method of claim 3, wherein the training the voice-to-text model further comprises establishing the at least one rule for transcribing a word into text, wherein the at least one rule is established by: receiving, by the processor, an input indicating what the word in the HU's voice signal is; and generating, by the processor, the text output using the input.

5. The method of claim 4, wherein the input is a communication assistant (CA) audio voice signal, and further wherein the processor can accurately recognize a CA word in the CA audio voice signal, and wherein the CA word indicates the word in the HU's voice signal.

6. The method of claim 4, wherein the input is a sequence of text inputs received by the processor from a user interface, wherein the sequence of text inputs comprises an indication of letters that form the word in the HU's voice signal.

7. The method of claim 1 wherein the plurality of rules are trained over more than one call involving the HU.

8. The method of claim 1 wherein the step of determining an identity of the HU includes identifying a voice profile associated with the HU.

9. The method of claim 1 wherein the communication network is a phone network.

10. The method of claim 1, wherein determining the identity of the HU comprises receiving, by the processor, a unique identifier, wherein the unique identifier is used to determine the identity of the HU.

11. The method of claim 10, wherein the unique identifier comprises at least one of a phone number of the first call participant and an auditory vocal identity of the first call participant.

12. The method of claim 1, further comprising sending the text output to a visual display.

13. The method of claim 12, wherein the visual display is viewable by the AU participating in the first call.

14. The method of claim 1, further comprising receiving, by the processor, a request for communication assistant (CA) assistance from the AU, wherein the request indicates that the AU desires a CA to input the HU's voice signal to generate the text output.

15. The method of claim 14, wherein the request for CA assistance comprises an input received by the processor from a user interface.

16. A system comprising: a memory; a processor coupled to the memory, wherein the processor is configured to: receive a hearing user's (HU's) voice signal from an HU that is participating in a first call with an assisted user (AU) wherein the first call is a communication across a communication network; determine an identity of the HU; train a voice-to-text model associated with the identity of the specific HU that is participating in the first call, wherein the voice-to-text model comprises a plurality of rules, and further wherein each of the plurality of rules is a rule for transcribing a word in the HU's voice signal into text; and generate a text output, wherein the text output is a transcribed version of a plurality of words identified in the HU's voice signal, and further wherein at least one of the plurality of words identified is identified using at least one rule of the plurality of rules; while the first call is ongoing, automatically determine when accuracy of the voice-to-text model has exceeded a threshold level; and wherein, after accuracy of the voice-to-text model has exceeded a threshold level, the text output generated using the voice-to-text model is provided to the AU.

17. A software program stored in a memory for execution by a processor, the program causing the processor to: receive a hearing user's (HU's) voice signal from an HU that is participating in a first call with an assisted user (AU) wherein the first call is a communication across a communication network; determine an identity of the HU; train a voice-to-text model associated with the identity of the specific HU that is participating in the first call, wherein the voice-to-text model comprises a plurality of rules, and further wherein each of the plurality of rules is a rule for transcribing a word in the HU's voice signal into text; and generate a text output, wherein the text output is a transcribed version of a plurality of words identified in the HU's voice signal, and further wherein at least one of the plurality of words identified is identified using at least one rule of the plurality of rules; while the first call is ongoing, automatically determine when accuracy of the voice-to-text model has exceeded a threshold level; and wherein, after accuracy of the voice-to-text model has exceeded a threshold level, the text output generated using the voice-to-text model is provided to the AU.

18. A method comprising: storing at least one voice recognition profile and at least one associated voice model for converting a voice signal to text in a memory device that is linked to a processor, the voice model including a plurality of rules, at least a subset of the rules for transcribing words in a voice signal into text; receiving, at the processor, a hearing user's (HU's) voice signal from an HU that is participating in a first call with an assisted user (AU) wherein the first call is a communication across a communication network; the processor programmed to perform the steps of: comparing the HU voice signal to at least a subset of the voice recognition profiles to identify a specific voice recognition profile that corresponds to the HU voice signal; accessing the voice model associated with the identified specific voice recognition profile; and generating a text output, wherein the text output is a transcribed version of a plurality of words identified in the audio voice signal of the HU, and further wherein at least one of the plurality of words identified is identified using at least one rule of the plurality of rules.

19. The method of claim 18, further comprising establishing, by the processor, the voice model for the HU during cal
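
The accuracy-threshold step recited in claims 1, 16, and 17 (monitor the voice-to-text model while the call is ongoing, and provide its output to the AU only after accuracy has exceeded a threshold) might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `CallSession` class, the 0.95 threshold, the `min_words` warm-up, and the idea of scoring accuracy by word-for-word agreement with a CA-provided reference are all hypothetical choices; the claims say only "a threshold level".

```python
from dataclasses import dataclass


@dataclass
class CallSession:
    threshold: float = 0.95  # assumed value; the claims do not specify a number
    min_words: int = 20      # assumed warm-up before the accuracy estimate is trusted
    matched: int = 0
    total: int = 0
    automated: bool = False  # True once model output is what the AU sees

    def accuracy(self) -> float:
        """Fraction of words so far where the model agreed with the reference."""
        return self.matched / self.total if self.total else 0.0

    def observe(self, model_word: str, reference_word: str) -> str:
        """Score one word against the reference and return the text shown to the AU."""
        self.total += 1
        if model_word == reference_word:
            self.matched += 1
        # The source of the displayed word is decided before the switch check,
        # so a switch takes effect on the next word.
        shown = model_word if self.automated else reference_word
        if (not self.automated
                and self.total >= self.min_words
                and self.accuracy() > self.threshold):
            self.automated = True
        return shown
```

The comparison is strict (`>`) to mirror the claim wording "has exceeded a threshold level", and the switch is one-way in this sketch; a real system might also fall back to CA assistance on request, as claim 14 contemplates.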
Transforming into visible information · CPC title
Comparators · CPC title
using speech recognition · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
Cordless telephones (user interfaces specially adapted therefor H04M1/724) · CPC title