Method and apparatus for discovering trending terms in speech requests
US-2016078860-A1 · Mar 17, 2016 · US
US11170761B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11170761-B2 |
| Application number | US-201816209524-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 4, 2018 |
| Priority date | Dec 4, 2018 |
| Publication date | Nov 9, 2021 |
| Grant date | Nov 9, 2021 |
Official abstract text for this publication.
A method may include obtaining first audio data of a first communication session between a first device and a second device and, during the first communication session, obtaining a first text string that is a transcription of the first audio data and training a model of an automatic speech recognition system using the first text string and the first audio data. The method may further include, in response to completion of the training, deleting the first audio data and the first text string and, after deleting the first audio data and the first text string, obtaining second audio data of a second communication session between a third device and a fourth device and, during the second communication session, obtaining a second text string that is a transcription of the second audio data and further training the model of the automatic speech recognition system using the second text string and the second audio data.
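The train-then-delete sequence described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `ToyLanguageModel`, `SessionData`, and `train_then_delete` are hypothetical names, and the "model" is only a word counter standing in for an acoustic or language model. The point shown is the ordering: each session's audio and text are discarded once training on them completes, and the same model is then trained further on the next session.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionData:
    """Audio and transcript held only while a session's training step runs."""
    audio: Optional[bytes]
    text: Optional[str]


@dataclass
class ToyLanguageModel:
    """Hypothetical stand-in for an ASR model: keeps only aggregate word counts."""
    word_counts: Counter = field(default_factory=Counter)
    sessions_seen: int = 0

    def train(self, audio: bytes, text: str) -> None:
        # A real system would update acoustic or language model parameters;
        # this toy accumulates word counts and ignores the audio.
        self.word_counts.update(text.lower().split())
        self.sessions_seen += 1


def train_then_delete(model: ToyLanguageModel, session: SessionData) -> None:
    """Train on one session, then drop that session's audio and text."""
    model.train(session.audio, session.text)
    # Deletion happens in response to the training step completing.
    session.audio = None
    session.text = None


model = ToyLanguageModel()
first = SessionData(audio=b"\x00\x01", text="hello can you hear me")
second = SessionData(audio=b"\x02\x03", text="yes I can hear you")

train_then_delete(model, first)   # first session data is gone after this call
train_then_delete(model, second)  # the same model is further trained
```

After both calls, the model retains only aggregate statistics, while neither session's audio nor text remains on the `SessionData` objects.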
Opening claim text (preview).
The invention claimed is:
1. A method comprising: obtaining first audio data of a first communication session between a first device of a first user and a second device of a second user, the first communication session configured for verbal communication; obtaining, during the first communication session, a first text string that is a transcription of the first audio data; training, during the first communication session, a model of an automatic speech recognition system using the first text string and the first audio data; in response to completion of the training of the model using the first text string and the first audio data, deleting the first audio data and the first text string; after training the model using the first text string and the first audio data, obtaining second audio data of a second communication session between a third device of a third user and a fourth device of a fourth user, wherein the third user and the fourth user are both separate and distinct from the first user and the second user; generating, during the second communication session, a transcription of the second audio data by applying the model trained using the first text string and the first audio data; and providing the transcription of the second audio data to the fourth device for presentation during the second communication session.
2. The method of claim 1, wherein the model is an acoustic model, a language model, a confidence model, or classification model of the automatic speech recognition system.
3. The method of claim 1, wherein the first text string is generated using automatic speech recognition technology.
4. The method of claim 3, wherein the automatic speech recognition technology generates the first text string using a revoicing of the first audio data.
5. The method of claim 1, wherein the first text string is generated from one or more words of a second text string and one or more words of a third text string, the second text string and the third text string generated by automatic speech recognition technology.
6. The method of claim 1, wherein the training of the model of the automatic speech recognition system using the first text string and the first audio data completes after the first communication session.
7. The method of claim 1, wherein the first audio data and the first text string are deleted during the first communication session.
8. The method of claim 1, further comprising providing the transcription of the first audio data to the second device for presentation by the second device during the first communication session.
9. The method of claim 1, further comprising: training, during the second communication session using a second text string of the transcription of the second audio data and the second audio data, a second model used by automatic speech recognition technology; and in response to completion of the training of the second model using the second text string and the second audio data, deleting the second audio data and the second text string.
10. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 1.
11. A method comprising: obtaining first audio data of a first communication session between a first device of a first user and a second device of a second user, the first communication session configured for verbal communication; obtaining, during the first communication session, a first text string that is a transcription of the first audio data; training, during the first communication session, a model of an automatic speech recognition system using the first text string and the first audio data; in response to completion of the training of the model using the first text string and the first audio data, deleting the first audio data and the first text string; after deleting the first audio data and the first text string, obtaining second audio data of a second communication session between a third device of a third user and a fourth device of a fourth user, wherein the third user and the fourth user are both separate and distinct from the first user and the second user; obtaining, during the second communication session, a second text string that is a transcription of the second audio data; and further training, during the second communication session, the model of the automatic speech recognition system using the second text string and the second audio data.
12. The method of claim 11, wherein the model is an acoustic model, a language model, a confidence model, or classification model of the automatic speech recognition system.
13. The method of claim 11, wherein the first text string is generated using automatic speech recognition technology.
14. The method of claim 13, wherein the automatic speech recognition technology generates the first text string using a revoicing of the first audio data.
15. The method of claim 11, wherein the first text string is generated from one or more words of a second text string and one or more words of a third text string, the second text string and the third text string generated by automatic speech recognition technology.
16. The method of claim 11, wherein the training of the model of the automatic speech recognition system using the first text string and the first audio data completes after the first communication session.
17. The method of claim 11, wherein the first audio data and the first text string are deleted during the first communication session.
18. The method of claim 11, further comprising: providing the transcription of the first audio data to the second device for presentation by the second device during the first communication session; and providing the transcription of the second audio data to the fourth device for presentation by the fourth device during the second communication session.
19. The method of claim 11, wherein the first audio data originates from the first device and is based on captured verbal communications of the first user during the first communication session.
20. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 11.
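Claims 5 and 15 recite a first text string built from words of two other ASR-generated text strings. The claims do not say how the words are selected, so the following Python sketch makes two labeled assumptions: the two hypotheses are already aligned word for word, and each word carries a confidence score, with the higher-scoring word kept at each position. The function name `fuse` and the sample data are hypothetical.

```python
from typing import List, Tuple

Word = Tuple[str, float]  # (word, confidence in [0, 1]); confidence is an assumption


def fuse(hyp_a: List[Word], hyp_b: List[Word]) -> str:
    """Pick the higher-confidence word at each aligned position."""
    if len(hyp_a) != len(hyp_b):
        # Real hypotheses rarely align this neatly; alignment is out of scope here.
        raise ValueError("this sketch assumes pre-aligned, equal-length hypotheses")
    chosen = [a[0] if a[1] >= b[1] else b[0] for a, b in zip(hyp_a, hyp_b)]
    return " ".join(chosen)


# Two hypothetical ASR outputs for the same audio.
second_text = [("call", 0.9), ("me", 0.8), ("to", 0.4), ("morrow", 0.3)]
third_text = [("tall", 0.5), ("me", 0.7), ("at", 0.6), ("noon", 0.9)]

first_text = fuse(second_text, third_text)
print(first_text)  # call me at noon
```

The fused string takes "call" and "me" from the first hypothesis and "at" and "noon" from the second, so it contains one or more words from each source string.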
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Speech to text systems (G10L15/08 takes precedence) · CPC title
Protecting personal data, e.g. for financial or medical purposes · CPC title
Constructional details of speech recognition systems · CPC title
Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence) · CPC title