What technology area does this patent fall under?

Primary CPC classification G10L15/063. Mapped technology areas include Physics.

When was this patent published?

Published Aug 5, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Training of speech recognition systems

US12380877B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12380877-B2
Application number	US-202117521713-A
Country	US
Kind code	B2
Filing date	Nov 8, 2021
Priority date	Dec 4, 2018
Publication date	Aug 5, 2025
Grant date	Aug 5, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method may include obtaining first audio data of a first communication session between a first and second device and during the first communication session, obtaining a first text string that is a transcription of the first audio data and training a model of an automatic speech recognition system using the first text string and the first audio data. The method may further include in response to completion of the training, deleting the first audio data and the first text string and after deleting the first audio data and the first text string, obtaining second audio data of a second communication session between a third and fourth device and during the second communication session obtaining a second text string that is a transcription of the second audio data and further training the model of the automatic speech recognition system using the second text string and the second audio data.
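The train-then-delete sequence in the abstract can be sketched roughly as follows. This is a toy illustration only, not the patent's implementation: `ToyModel`, its single scalar weight, the `store` dictionary, and the session IDs are all hypothetical stand-ins, and the "training" step is a trivial gradient update rather than real speech recognition training.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ToyModel:
    """Hypothetical stand-in for an ASR model: one scalar weight."""
    weight: float = 0.0
    sessions_seen: int = 0

    def train(self, audio: List[float], text: str, lr: float = 0.1) -> None:
        # Assumption for illustration: the word count of the transcript is
        # the target that the "audio" samples should predict.
        target = float(len(text.split()))
        for x in audio:
            error = self.weight * x - target
            self.weight -= lr * error * x
        self.sessions_seen += 1


def train_then_delete(
    model: ToyModel,
    store: Dict[str, Tuple[List[float], str]],
    session_id: str,
) -> None:
    """Train on one session's audio and transcript, then delete both."""
    audio, text = store[session_id]
    model.train(audio, text)
    # Deletion is triggered by completion of training, as in the abstract.
    del store[session_id]


store: Dict[str, Tuple[List[float], str]] = {
    "session-1": ([0.5, 1.0, 0.8], "hello there"),
}
model = ToyModel()
train_then_delete(model, store, "session-1")

# The second session's data is obtained only after the first was deleted.
store["session-2"] = ([0.9, 0.4], "how are you today")
train_then_delete(model, store, "session-2")
```

After both calls the model has been updated twice while `store` is empty, which is the property the abstract emphasizes: the model retains what it learned, but the session audio and text do not persist.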

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: obtaining a model of an automatic speech recognition system; obtaining first audio data of a first communication session between a first device of a first user and a second device of a second user; training a first copy of the model based on the first audio data; obtaining second audio data of a second communication session between a third device of a third user and a fourth device of a fourth user; training a second copy of the model based on the second audio data; determining a set of acoustic parameters using both the trained first copy of the model and the trained second copy of the model; updating the model using the set of acoustic parameters; after updating the model, obtaining third audio data of a third communication session between a fifth device of a fifth user and a sixth device of a sixth user, wherein the third user and the fourth user are both separate and distinct from the first user and the second user and the fifth and sixth users are both separate and distinct from the first, second, third, and fourth users; and generating, during the third communication session, a transcription of the third audio data by applying the updated model.

2. The method of claim 1, wherein the model includes an acoustic model, a language model, a confidence model, and/or classification model of the automatic speech recognition system.

3. The method of claim 1, further comprising obtaining a connected graph that includes a plurality of word combinations, the plurality of word combinations derived from the first audio data using automatic speech recognition, wherein the first copy of the model is trained using the connected graph.

4. The method of claim 1, further comprising obtaining a plurality of phonemes from the first audio data, wherein the first copy of the model is trained using the phonemes.

5. The method of claim 1, wherein the training of the first copy of the model of the automatic speech recognition system based on the first audio data completes after the first communication session.

6. The method of claim 1, further comprising in response to completion of the training of the first copy of the model, deleting the first audio data.

7. The method of claim 6, wherein the first audio data is deleted during the first communication session.

8. The method of claim 1, wherein the training of the second copy of the model occurs during the training of the first copy of the model.

9. At least one non-transitory computer-readable media configured to store one or more instructions that in response to being executed by at least one computing system cause performance of the method of claim 1.

10. The method of claim 1, further comprising determining a classification for the first audio data, the classification indicating an intent of a user when speaking words in the first audio data, wherein the training the model is based on the classification of the first audio data.

11. A system comprising: one or more processors; and one or more computer-readable media configured to store one or more instructions that in response to being executed by the one or more processors cause or direct performance of operations, the operations comprising: obtaining a model of an automatic speech recognition system; obtaining first audio data of a first communication session between a first device of a first user and a second device of a second user; training a first copy of the model based on the first audio data; obtaining second audio data of a second communication session between a third device of a third user and a fourth device of a fourth user; training a second copy of the model based on the second audio data; determining a set of acoustic parameters using both the trained first copy of the model and the trained second copy of the model; updating the model using the set of acoustic parameters; after updating the model, obtaining third audio data of a third communication session between a fifth device of a fifth user and a sixth device of a sixth user, wherein the third user and the fourth user are both separate and distinct from the first user and the second user and the fifth and sixth users are both separate and distinct from the first, second, third, and fourth users; and generating, during the third communication session, a transcription of the third audio data by applying the updated model.

12. The system of claim 11, wherein the model includes an acoustic model, a language model, a confidence model, and/or classification model of the automatic speech recognition system.

13. The system of claim 11, wherein the operations further comprise obtaining a connected graph that includes a plurality of word combinations, the plurality of word combinations derived from the first audio data using automatic speech recognition, wherein the first copy of the model is trained using the connected graph.

14. The system of claim 11, wherein the operations further comprise obtaining a plurality of phonemes from the first audio data, wherein the first copy of the model is trained using the phonemes.

15. The system of claim 11, wherein the training of the first copy of the model of the automatic speech recognition system based on the first audio data completes after the first communication session.

16. The system of claim 11, wherein the operations further comprise in response to completion of the training of the first copy of the model, deleting the first audio data.

17. The system of claim 16, wherein the first audio data is deleted during the first communication session.

18. The system of claim 11, wherein the training of the second copy of the model occurs during the training of the first copy of the model.

19. The system of claim 11, wherein the training the model of the automatic speech recognition system based on the first audio data is performed during the first communication session.

20. The system of claim 11, wherein the operations further comprise: determining a classification for the first audio data, the classification indicating an intent of a user when speaking words in the first audio data, wherein the training the model is based on the classification of the first audio data.
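The core of independent claims 1 and 11 (train two copies of a model on separate sessions, derive one parameter set from both, then update the model) can be illustrated with a small sketch. Everything named here is an assumption for illustration: the `Params` dictionary, the `gain` and `bias` keys, the toy update rule, and in particular the element-wise mean in `combine`. The claims only say the parameters are determined "using both" trained copies and do not specify averaging.

```python
import copy
from typing import Dict, List

# Hypothetical parameter container; a real acoustic model has far more state.
Params = Dict[str, float]


def train_copy(base: Params, audio: List[float], lr: float = 0.1) -> Params:
    """Return a trained copy of `base`; the base model is left untouched."""
    local = copy.deepcopy(base)
    for x in audio:
        # Toy update rule: nudge the 'gain' parameter toward each sample.
        local["gain"] += lr * (x - local["gain"])
    return local


def combine(copies: List[Params]) -> Params:
    """Assumed combination rule: element-wise mean of the trained copies."""
    keys = copies[0].keys()
    return {k: sum(c[k] for c in copies) / len(copies) for k in keys}


base_model: Params = {"gain": 0.0, "bias": 1.0}

# Each copy sees audio from a different communication session.
first_copy = train_copy(base_model, [1.0, 1.0])
second_copy = train_copy(base_model, [3.0])

# Determine one parameter set from both copies, then update the model.
acoustic_params = combine([first_copy, second_copy])
base_model.update(acoustic_params)
```

In this sketch neither copy ever sees the other session's audio; only the trained parameters are combined. The updated `base_model` would then be the one applied to a later, third session.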

Assignees

Sorenson IP Holdings, LLC

Inventors

Classifications

G10L15/26
Speech to text systems (G10L15/08 takes precedence) · CPC title
G10L2015/0631
Creating reference templates; Clustering · CPC title
G06F21/6245
Protecting personal data, e.g. for financial or medical purposes · CPC title
G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L15/28
Constructional details of speech recognition systems · CPC title

Patent family

Related publications grouped by family.

View patent family 69528938

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12380877B2 cover?: A method may include obtaining first audio data of a first communication session between a first and second device and during the first communication session, obtaining a first text string that is a transcription of the first audio data and training a model of an automatic speech recognition system using the first text string and the first audio data. The method may further include in response …
Who is the assignee on this patent?: Sorenson IP Holdings, LLC