Language models using spoken language modeling
US-2024386885-A1 · Nov 21, 2024 · US
US2026051317A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2026051317-A1 |
| Application number | US-202519285271-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jul 30, 2025 |
| Priority date | Dec 4, 2018 |
| Publication date | Feb 19, 2026 |
| Grant date | — |
Official abstract text for this publication.
A method may include obtaining first audio data of a first communication session between a first and second device and, during the first communication session, obtaining a first text string that is a transcription of the first audio data and training a model of an automatic speech recognition system using the first text string and the first audio data. The method may further include, in response to completion of the training, deleting the first audio data and the first text string and, after deleting the first audio data and the first text string, obtaining second audio data of a second communication session between a third and fourth device and, during the second communication session, obtaining a second text string that is a transcription of the second audio data and further training the model of the automatic speech recognition system using the second text string and the second audio data.
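The train-then-delete sequence in the abstract can be illustrated with a minimal sketch. It is not the implementation disclosed in the publication: `ToyLanguageModel`, `SessionStore`, and `train_and_delete` are hypothetical names, the "model" is just a bigram counter standing in for an automatic speech recognition model, and the audio bytes are placeholders that the toy trainer only checks for presence.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ToyLanguageModel:
    """Hypothetical stand-in for an ASR model: keeps bigram counts only."""
    bigrams: Counter = field(default_factory=Counter)
    sessions_seen: int = 0

    def train(self, audio: bytes, text: str) -> None:
        # A real system would learn from the audio as well; this toy
        # only verifies that audio was supplied alongside the transcript.
        if not audio:
            raise ValueError("audio data is required for training")
        words = text.lower().split()
        self.bigrams.update(zip(words, words[1:]))
        self.sessions_seen += 1


class SessionStore:
    """Holds a session's audio and transcript only until training completes."""

    def __init__(self) -> None:
        self._data: dict[str, tuple[bytes, str]] = {}

    def put(self, session_id: str, audio: bytes, text: str) -> None:
        self._data[session_id] = (audio, text)

    def train_and_delete(self, session_id: str, model: ToyLanguageModel) -> None:
        audio, text = self._data[session_id]
        model.train(audio, text)
        # Deletion happens in response to completion of the training step.
        del self._data[session_id]

    def __contains__(self, session_id: str) -> bool:
        return session_id in self._data


model = ToyLanguageModel()
store = SessionStore()

# First communication session: train, then delete audio and text.
store.put("session-1", b"\x00\x01", "hello how are you")
store.train_and_delete("session-1", model)

# Second session: the same model is trained further after the deletion.
store.put("session-2", b"\x02\x03", "how are you today")
store.train_and_delete("session-2", model)
```

After both calls the store holds neither session's data, while the model retains only the aggregated counts learned from them.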
Opening claim text (preview).
1. A method comprising: obtaining first audio data of a first communication session between a first device of a first user and a second device of a second user; training a model of an automatic speech recognition system based on the first audio data; generating, during a second communication session, a transcription of a second audio data by applying the model trained based on the first audio data.
2. The method of claim 1, wherein the model is an acoustic model, a language model, a confidence model, or classification model of the automatic speech recognition system.
3. The method of claim 1, further comprising obtaining a connected graph that includes a plurality of word combinations, the plurality of word combinations derived from the first audio data using automatic speech recognition, wherein the model is trained using the connected graph.
4. The method of claim 1, further comprising obtaining a plurality of phonemes from the first audio data, wherein the model is trained using the phonemes.
5. The method of claim 1, further comprising: obtaining a first text string that is a transcription of the first audio data; and determining a classification for the first text string, the classification indicating an intent of a user when speaking words in the first text string, wherein the model is trained using the classification of the first text string and the first text string.
6. The method of claim 1, wherein the training of the model of the automatic speech recognition system based on the first audio data completes after the first communication session.
7. The method of claim 1, further comprising in response to completion of the training of the model, deleting the first audio data.
8. The method of claim 7, wherein the first audio data is deleted during the first communication session.
9. The method of claim 1, further comprising: training, during the second communication session based on the second audio data, a second model used by automatic speech recognition technology; and in response to completion of the training of the second model using the second audio data, deleting the second audio data.
10. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 1.
11. A system comprising: one or more processors; and one or more computer-readable media configured to store one or more instructions that in response to being executed by the one or more processors cause or direct performance of operations, the operations comprising: obtaining first audio data of a first communication session between a first device of a first user and a second device of a second user; training a model of an automatic speech recognition system based on the first audio data; and generating, during a second communication session, a transcription of a second audio data by applying the model trained based on the first audio data.
12. The system of claim 11, wherein the model is an acoustic model, a language model, a confidence model, or classification model of the automatic speech recognition system.
13. The system of claim 11, wherein the operations further comprising obtaining a connected graph that includes a plurality of word combinations, the plurality of word combinations derived from the first audio data using automatic speech recognition, wherein the model is trained using the connected graph.
14. The system of claim 11, wherein the operations further comprising obtaining a plurality of phonemes from the first audio data, wherein the model is trained using the phonemes.
15. The system of claim 11, wherein the operations further comprising: obtaining a first text string that is a transcription of the first audio data; and determining a classification for the first text string, the classification indicating an intent of a user when speaking words in the first text string, wherein the model is trained using the classification of the first text string and the first text string.
16. The system of claim 11, wherein the training of the model of the automatic speech recognition system based on the first audio data completes after the first communication session.
17. The system of claim 11, wherein the operations further comprising in response to completion of the training of the model, deleting the first audio data.
18. The system of claim 17, wherein the first audio data is deleted during the first communication session.
19. The system of claim 11, wherein the operations further comprising: training, during the second communication session based on the second audio data, a second model used by automatic speech recognition technology; and in response to completion of the training of the second model using the second audio data, deleting the second audio data.
20. The system of claim 11, wherein the training the model of the automatic speech recognition system based on the first audio data is performed during the first communication session.
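Claims 3 and 13 recite training from a connected graph of word combinations derived from the audio, without specifying how the graph is consumed. One plausible reading, sketched below purely as an assumption, treats the graph as a word lattice and accumulates probability-weighted bigram counts over every path. The lattice layout, the edge probabilities, and the function names `paths` and `expected_bigram_counts` are all hypothetical and do not come from the publication.

```python
from collections import defaultdict

# Hypothetical lattice: node -> list of (next_node, words, probability).
# Node 2 has no outgoing edges and acts as the final node.
lattice = {
    0: [(1, "recognize", 0.6), (1, "wreck a nice", 0.4)],
    1: [(2, "speech", 0.7), (2, "beach", 0.3)],
    2: [],
}


def paths(graph, node=0, words=(), prob=1.0):
    """Yield (word sequence, path probability) for every start-to-end path."""
    edges = graph[node]
    if not edges:
        yield words, prob
        return
    for next_node, text, p in edges:
        yield from paths(graph, next_node, words + tuple(text.split()), prob * p)


def expected_bigram_counts(graph):
    """Sum each bigram's count over all paths, weighted by path probability."""
    counts = defaultdict(float)
    for words, prob in paths(graph):
        for pair in zip(words, words[1:]):
            counts[pair] += prob
    return dict(counts)


counts = expected_bigram_counts(lattice)
```

Weighting by path probability lets competing hypotheses contribute in proportion to how likely the recognizer considered them, instead of committing to a single best transcription. Full path enumeration is only practical for small graphs like this one.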
Speech to text systems (G10L15/08 takes precedence) · CPC title
Creating reference templates; Clustering · CPC title
Protecting personal data, e.g. for financial or medical purposes · CPC title
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
Constructional details of speech recognition systems · CPC title