4-bit conformer with accurate quantization training for speech recognition

US12374323B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12374323-B2
Application number	US-202318186774-A
Country	US
Kind code	B2
Filing date	Mar 20, 2023
Priority date	Mar 21, 2022
Publication date	Jul 29, 2025
Grant date	Jul 29, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for training a model includes obtaining a plurality of training samples. Each respective training sample of the plurality of training samples includes a respective speech utterance and a respective textual utterance representing a transcription of the respective speech utterance. The method includes training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples. The method also includes quantizing the trained ASR model to an integer target fixed-bit width. The quantized trained ASR model includes a plurality of weights. Each weight of the plurality of weights includes an integer with the target fixed-bit width. The method includes providing the quantized trained ASR model to a user device.
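The abstract refers to "native integer operations" and to weights stored as integers of a target fixed-bit width, without giving formulas. The following is a minimal sketch of one common reading: values are mapped to small signed integers, the multiply-accumulate runs on integers only, and the result is rescaled to floating point once at the end. The function names `quantize` and `int_dot`, the symmetric signed range, and the max-based scaling of both inputs are assumptions made for illustration, not details taken from the patent.

```python
def quantize(values, scale, bits):
    """Map floats to signed integers clipped to a symmetric bits-wide range."""
    qmax = 2 ** (bits - 1) - 1  # 7 when bits == 4 (assumed symmetric range)
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def max_scale(values, bits):
    """Scale that maps the largest magnitude in values onto qmax."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0  # avoid dividing by zero


def int_dot(weights, activations, bits=4):
    """Dot product accumulated on integers, rescaled to float at the end."""
    w_scale = max_scale(weights, bits)
    a_scale = max_scale(activations, bits)
    w_q = quantize(weights, w_scale, bits)
    a_q = quantize(activations, a_scale, bits)
    acc = sum(w * a for w, a in zip(w_q, a_q))  # integer-only arithmetic
    return acc * w_scale * a_scale, w_q, a_q


result, w_q, a_q = int_dot([0.7, -0.3, 0.1], [1.4, 0.6, -1.4])
```

In this sketch the only floating-point work is computing the two scales and the final rescale; a real training setup would also need a rule for passing gradients through the rounding step, which is not shown here.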

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method when executed on data processing hardware causes the data processing hardware to perform operations comprising: obtaining a plurality of training samples, each respective training sample of the plurality of training samples comprising: a respective speech utterance; and a respective textual utterance representing a transcription of the respective speech utterance; training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples; quantizing the trained ASR model to an integer target fixed-bit width, the quantized trained ASR model comprising a plurality of weights, each weight of the plurality of weights comprising an integer with the target fixed-bit width; and providing the quantized trained ASR model to a user device.
2. The method of claim 1, wherein the target fixed-bit width is four.
3. The method of claim 1, wherein the ASR model further comprises a plurality of activations, each activation of the plurality of activations comprising an integer with the target fixed-bit width.
4. The method of claim 1, wherein the ASR model further comprises a plurality of activations, each activation of the plurality of activations comprising an integer with a fixed bit width greater than the target fixed-bit width.
5. The method of claim 1, wherein the ASR model further comprises a plurality of activations, each activation of the plurality of activations comprising a float value.
6. The method of claim 1, wherein quantizing the trained ASR model comprises determining a scale factor based on an estimated max value of an axis to be quantized and the target fixed-bit width.
7. The method of claim 1, wherein the ASR model comprises one or more multi-head attention layers.
8. The method of claim 7, wherein the one or more multi-head attention layers comprise one or more conformer layers or one or more transformer layers.
9. The method of claim 1, wherein: the ASR model comprises a plurality of encoders and a plurality of decoders; and quantizing the ASR model comprises quantizing the plurality of encoders and not quantizing the plurality of decoders.
10. The method of claim 1, wherein: the ASR model comprises an audio encoder; and the audio encoder comprises a cascaded encoder comprising a first causal encoder and a second non-causal encoder.
11. A system comprising: data processing hardware; and memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising: obtaining a plurality of training samples, each respective training sample of the plurality of training samples comprising: a respective speech utterance; and a respective textual utterance representing a transcription of the respective speech utterance; training, using quantization aware training with native integer operations, an automatic speech recognition (ASR) model on the plurality of training samples; quantizing the trained ASR model to an integer target fixed-bit width, the quantized trained ASR model comprising a plurality of weights, each weight of the plurality of weights comprising an integer with the target fixed-bit width; and providing the quantized trained ASR model to a user device.
12. The system of claim 11, wherein the target fixed-bit width is four.
13. The system of claim 11, wherein the ASR model further comprises a plurality of activations, each activation of the plurality of activations comprising an integer with the target fixed-bit width.
14. The system of claim 11, wherein the ASR model further comprises a plurality of activations, each activation of the plurality of activations comprising an integer with a fixed bit width greater than the target fixed-bit width.
15. The system of claim 11, wherein the ASR model further comprises a plurality of activations, each activation of the plurality of activations comprising a float value.
16. The system of claim 11, wherein quantizing the trained ASR model comprises determining a scale factor based on an estimated max value of an axis to be quantized and the target fixed-bit width.
17. The system of claim 11, wherein the ASR model comprises one or more multi-head attention layers.
18. The system of claim 17, wherein the one or more multi-head attention layers comprise one or more conformer layers or one or more transformer layers.
19. The system of claim 11, wherein: the ASR model comprises a plurality of encoders and a plurality of decoders; and quantizing the ASR model comprises quantizing the plurality of encoders and not quantizing the plurality of decoders.
20. The system of claim 11, wherein: the ASR model comprises an audio encoder; and the audio encoder comprises a cascaded encoder comprising a first causal encoder and a second non-causal encoder.
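Claims 6 and 16 describe a scale factor derived from an estimated max value of the axis being quantized and the target fixed-bit width. A minimal sketch of that idea follows, assuming a symmetric signed integer range and treating each row of a weight matrix as the axis. The names `scale_for` and `quantize_per_row` are hypothetical, and the max is computed exactly from the values here, whereas the claims speak of an estimated max.

```python
def scale_for(values, bits):
    """Scale factor from the largest magnitude on one axis and the bit width."""
    qmax = 2 ** (bits - 1) - 1  # largest representable magnitude, e.g. 7 for 4 bits
    max_abs = max(abs(v) for v in values)  # assumption: exact max stands in for the estimate
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize_per_row(matrix, bits=4):
    """Quantize each row with its own scale; return integer rows and the scales."""
    qmax = 2 ** (bits - 1) - 1
    scales = [scale_for(row, bits) for row in matrix]
    quantized = [
        [max(-qmax, min(qmax, round(v / s))) for v in row]
        for row, s in zip(matrix, scales)
    ]
    return quantized, scales


# Two rows with very different magnitudes each get their own scale.
q, scales = quantize_per_row([[0.7, -0.3, 0.1], [14.0, -6.0, 2.0]])
```

Using one scale per axis, rather than one for the whole tensor, keeps a row of small values from being rounded to zero when another row contains much larger values.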

Assignees

Google LLC

Inventors

Classifications

G10L15/16 · Primary
using artificial neural networks · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/08
Learning methods · CPC title
G10L19/035
Scalar quantisation · CPC title
G10L15/063 · Primary
Training · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 86007455

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12374323B2 cover?: A method for training a model includes obtaining a plurality of training samples. Each respective training sample of the plurality of training samples includes a respective speech utterance and a respective textual utterance representing a transcription of the respective speech utterance. The method includes training, using quantization aware training with native integer operations, an automati…
Who is the assignee on this patent?: Google LLC
What technology area does this patent fall under?: Primary CPC classification G10L15/16. Mapped technology areas include Physics.
When was this patent published?: Publication date Jul 29, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).