Language and grammar model adaptation using model weight data

US11705116B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11705116-B2
Application number	US-202117405677-A
Country	US
Kind code	B2
Filing date	Aug 18, 2021
Priority date	Mar 25, 2019
Publication date	Jul 18, 2023
Grant date	Jul 18, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods described herein relate to adapting a language model for automatic speech recognition (ASR) for a new set of words. Instead of retraining the ASR models, language models and grammar models, the system only modifies one grammar model and ensures its compatibility with the existing models in the ASR system.
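As a rough illustration of the idea in the abstract, the sketch below keeps an existing grammar read-only and places new words in a separate overlay. It is not taken from the patent: the word-to-cost dictionary representation, the names `BASE_GRAMMAR`, `build_overlay`, and `lookup`, and the default cost of 3.0 are all assumptions chosen for readability.

```python
from types import MappingProxyType

# Hypothetical base grammar: word -> cost (lower means more likely).
# The read-only view stands in for "existing models are not retrained".
BASE_GRAMMAR = MappingProxyType({"play": 1.0, "stop": 1.5, "pause": 2.0})


def build_overlay(new_words, default_cost=3.0):
    """Return an overlay holding only the words the base grammar lacks."""
    return {w: default_cost for w in new_words if w not in BASE_GRAMMAR}


def lookup(word, overlay):
    """Consult the base grammar first, then the overlay; None if unknown."""
    if word in BASE_GRAMMAR:
        return BASE_GRAMMAR[word]
    return overlay.get(word)


overlay = build_overlay(["play", "shuffle", "rewind"])
print(sorted(overlay))             # ['rewind', 'shuffle']
print(lookup("shuffle", overlay))  # 3.0
print(lookup("play", overlay))     # 1.0
```

Only the overlay changes when the vocabulary grows, so the base grammar and anything that depends on it stay untouched in this toy setup.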

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: receiving input data representing a first set of words; determining a first grammar model corresponding to a second set of words that are different than the first set of words, the first grammar model including first model weight data; processing the first model weight data using the input data to determine second model weight data; determining first data representing a difference between the first model weight data and the second model weight data; using the first data to generate second data corresponding to language processing with regard to a third set of words, wherein at least one word of the third set of words is included in the first set of words but not in the second set of words; and storing an association between the second data and the first grammar model.
2. The computer-implemented method of claim 1, further comprising: generating updated data associated with the first grammar model to indicate the association between the second data and the first grammar model.
3. The computer-implemented method of claim 1, further comprising: determining acoustic model output data representative of an acoustic model corresponding to the second set of words; processing the acoustic model output data using the first grammar model to determine first grammar model output data, the first grammar model including data representing the association between the second data and the first grammar model; and processing the first grammar model output data using the first data.
4. The computer-implemented method of claim 1, further comprising: at a first time period: determining acoustic model output data representative of an acoustic model corresponding to the second set of words, and storing the acoustic model output data in a data structure; and at a second time period after the first time period: receiving the acoustic model output data from the data structure, and determining the first model weight data associated with the first grammar model using the acoustic model output data.
5. The computer-implemented method of claim 1, further comprising: receiving input audio data representing an utterance; processing the input audio data using an acoustic model to generate first output data; processing the first output data using the first grammar model and the association between the second data and the first grammar model to determine second output data representing a likelihood that the input audio data includes a word from the third set of words; and processing the second output data using the first data to determine at least one score for the input audio data, the at least one score indicating a probability that the input audio data represents a word from the third set of words.
6. The computer-implemented method of claim 1, further comprising: determining, using the first grammar model and the first data, the third set of words representing a difference between a first vocabulary represented by the first grammar model and a second vocabulary represented by the first data; and generating the second data using the third set of words.
7. The computer-implemented method of claim 1, further comprising: determining third model weight data corresponding to the second data using the first model weight data and fourth model weight data associated with the first data.
8. The computer-implemented method of claim 1, further comprising: receiving input audio data representing an utterance; determining user profile data associated with the input audio data; determining a fourth set of words using the user profile data; generating third data corresponding to the fourth set of words; storing a second association between the first grammar model with the third data; processing the input audio data using an acoustic model to generate first output data; processing the first output data using the first grammar model and data representing the second association to determine second output data representing a likelihood that the input audio data includes a word from the third set of words and the fourth set of words; and processing the second output data using the first data to determine at least one score for the input audio data, the at least one score indicating a probability that the input audio data represents a word from the third set of words and the fourth set of words.
9. The computer-implemented method of claim 1, wherein the second data corresponds to a finite state transducer (FST).
10. A system comprising: at least one processor; and at least one memory comprising instructions that, when executed by the at least one processor, cause the system to: receive input data representing a first set of words; determine a first grammar model corresponding to a second set of words that are different than the first set of words, the first grammar model including first model weight data; process the first model weight data using the input data to determine second model weight data; determine first data representing a difference between the first model weight data and the second model weight data; use the first data to generate second data corresponding to language processing with regard to a third set of words, wherein at least one word of the third set of words is included in the first set of words but not in the second set of words; and store an association between the second data and the first grammar model.
11. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: generate updated data associated with the first grammar model to indicate the association between the second data and the first grammar model.
12. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine acoustic model output data representative of an acoustic model corresponding to the second set of words; process the acoustic model output data using the first grammar model to determine first grammar model output data, the first grammar model including data representing the association between the second data and the first grammar model; and process the first grammar model output data using the first data.
13. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: at a first time period: determine acoustic model output data representative of an acoustic model corresponding to the second set of words; and store the acoustic model output data in a data structure; and at a second time period after the first time period: receive the acoustic model output data from the data structure; and determine the first model weight data associated with the first grammar model using the acoustic model output data.
14. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: receive input audio data representing an utterance; process the input audio data using an acoustic model to generate first output data; process the first output data using the first grammar model and the association between the second data and the first grammar model to determine second output data representing a likelihood that the input audio data includes a word from the third set of words; and process the second output data using the first
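The data flow recited in claim 1 can be traced with a toy example. The sketch below is only one possible reading, not the patented implementation: it assumes weights are negative log-probabilities, that adaptation is a linear interpolation with a uniform distribution over the input words (the `mix` parameter), and that words absent from the first grammar have a baseline weight of 0.0 when the difference is taken. The names `adapt_weights`, `associations`, and `grammar_v1` are hypothetical.

```python
import math

# First grammar model (hypothetical): costs are negative log-probabilities.
first_weights = {w: -math.log(p) for w, p in
                 {"play": 0.6, "stop": 0.3, "pause": 0.1}.items()}


def adapt_weights(first, input_words, mix=0.2):
    """Interpolate the old distribution with a uniform one over input_words."""
    words = set(input_words)
    second = {}
    for w in set(first) | words:
        p_old = math.exp(-first[w]) if w in first else 0.0
        p_in = 1.0 / len(words) if w in words else 0.0
        second[w] = -math.log((1.0 - mix) * p_old + mix * p_in)
    return second


# Input data: the first set of words, which includes new vocabulary.
input_words = ["play", "shuffle", "rewind"]
second_weights = adapt_weights(first_weights, input_words)

# "First data": per-word difference between second and first weights.
# Assumption: a word missing from the first grammar has baseline 0.0.
difference = {w: c - first_weights.get(w, 0.0)
              for w, c in second_weights.items()}

# Third set of words: present in the difference but not in the first grammar.
third_set = set(difference) - set(first_weights)

# "Second data": entries covering only the third set of words.
second_data = {w: difference[w] for w in sorted(third_set)}

# Store the association between the second data and the first grammar model.
associations = {"grammar_v1": second_data}
print(sorted(third_set))  # ['rewind', 'shuffle']
```

Under these assumptions, adding the difference to a word's original weight (or to 0.0 for a new word) recovers the adapted weight, which loosely mirrors the rescoring step in claim 5.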

Assignees

Amazon Tech Inc

Inventors

Classifications

G10L15/193 · Primary
Formal grammars, e.g. finite state automata, context free grammars or word networks · CPC title
G10L15/22
Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title
G10L15/30
Distributed recognition, e.g. in client-server systems, for mobile phones or network applications · CPC title
G10L2015/223
Execution procedure of a spoken command · CPC title
G10L15/065
Adaptation · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 78008029

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11705116B2 cover?: Systems and methods described herein relate to adapting a language model for automatic speech recognition (ASR) for a new set of words. Instead of retraining the ASR models, language models and grammar models, the system only modifies one grammar model and ensures its compatibility with the existing models in the ASR system.
Who is the assignee on this patent?: Amazon Tech Inc
What technology area does this patent fall under?: Primary CPC classification G10L15/193. Mapped technology areas include Physics.
When was this patent published?: Publication date Jul 18, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).