What technology area does this patent fall under?

Primary CPC classification G10L15/1822. Mapped technology areas include Physics.

When was this patent published?

Publication date Jun 13, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

On-device learning in a hybrid speech processing system

US11676575B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11676575-B2
Application number	US-202117386078-A
Country	US
Kind code	B2
Filing date	Jul 27, 2021
Priority date	Nov 13, 2018
Publication date	Jun 13, 2023
Grant date	Jun 13, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised on-device learning based on the remote NLU data. The device may determine differences between the updated speech processing model and an original speech processing model received from the remote system and may send data indicating these differences to the remote system. The remote system may aggregate data received from a plurality of devices and may generate an improved speech processing model.
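
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the remote system combines the data received from the devices, so the element-wise averaging below is an assumption, as are the function names `aggregate_deltas` and `apply_delta` and the representation of a model as a dictionary of named float weights.

```python
from typing import Dict, List

# Hypothetical model representation: weight name -> weight value.
Weights = Dict[str, float]


def aggregate_deltas(deltas: List[Weights]) -> Weights:
    """Average the weight differences reported by several devices.

    Averaging is an assumed aggregation rule, not taken from the patent.
    A weight absent from one device's report counts as unchanged (0.0)
    for that device.
    """
    if not deltas:
        return {}
    names = set().union(*deltas)  # every weight name any device reported
    count = len(deltas)
    return {
        name: sum(d.get(name, 0.0) for d in deltas) / count
        for name in sorted(names)
    }


def apply_delta(model: Weights, delta: Weights) -> Weights:
    """Return a new model with the aggregated differences added in."""
    updated = dict(model)
    for name, change in delta.items():
        updated[name] = updated.get(name, 0.0) + change
    return updated


original = {"w1": 1.0, "w2": 2.0}
device_deltas = [{"w1": 0.5}, {"w1": 1.5, "w2": -1.0}]
improved = apply_delta(original, aggregate_deltas(device_deltas))
```

In this sketch each device sends only differences relative to the original model it received, which matches the abstract's description of sending "data indicating these differences" rather than a full model.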

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: determining, by a first device, audio data representing an utterance; performing, by the first device, speech processing based on the audio data using a first model to determine output data; generating a second model configured to be used during speech processing; generating first data representing at least one difference between the first model and the second model; sending the first data to a remote system; receiving, by the first device and from the remote system, second data corresponding to a third model based at least in part on the first data; determining, by the first device, second audio data representing a second utterance; and performing, by the first device, speech processing based on the second audio data using the third model to determine second output data.
2. The computer-implemented method of claim 1, wherein the second data comprises weight data and the computer-implemented method further comprises, by the first device: processing the weight data with respect to at least one of the first model or the second model to generate the third model.
3. The computer-implemented method of claim 1, wherein the second data comprises training data and the computer-implemented method further comprises, by the first device: processing the training data with respect to at least one of the first model or the second model to generate the third model.
4. The computer-implemented method of claim 1, wherein: generating the first data comprises determining weight data representing the at least one difference between the first model and the second model; and sending the first data to the remote system comprises sending the weight data to the remote system.
5. A computer-implemented method comprising: determining, by a first device, audio data representing an utterance; performing, by the first device, speech processing based on the audio data using a first model to determine output data; generating a second model configured to be used during speech processing; determining a first difference value between a first weight value associated with the first model and a second weight value associated with the second model; determining a second difference value between a third weight value associated with the first model and a fourth weight value associated with the second model; determining that the first difference value is above a threshold value; determining that the second difference value is below the threshold value; generating first data representing at least one difference between the first model and the second model, the first data including the first difference value, but not the second difference value; and sending the first data to a remote system.
6. The computer-implemented method of claim 1, further comprising: sending, by the first device to a different device, the audio data; and receiving, by the first device from the different device, second output data, wherein generating the second model is based at least in part on the second output data.
7. The computer-implemented method of claim 6, further comprising: determining second data representing at least one difference between the output data and the second output data, wherein generating the second model is based at least in part on the second data.
8. The computer-implemented method of claim 1, further comprising, prior to determining the audio data: receiving, by the first device and from the remote system, the first model.
9. The computer-implemented method of claim 1, further comprising: during generating the second model, detecting a second utterance; halting generation of the second model; performing speech processing with regard to the second utterance; and following the speech processing with regard to the second utterance, resuming generation of the second model.
10. A system comprising: at least one processor; and at least one memory comprising instructions that, when executed by the at least one processor, cause the system to: determine, by a first device, audio data representing an utterance; perform, by the first device, speech processing based on the audio data using a first model to determine first output data; send, by the first device to a different device, the audio data; receive, by the first device and from the different device, second output data; determine first data representing at least one difference between the first output data and the second output data; generate, based at least in part on the first data, a second model configured to be used during speech processing; generate second data representing at least one difference between the first model and the second model; and send the second data to a remote system.
11. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: receive, by the first device and from the remote system, third data corresponding to a third model based at least in part on the second data; determine, by the first device, second audio data representing a second utterance; and perform, by the first device, speech processing based on the second audio data using the third model to determine third output data.
12. The system of claim 11, wherein the third data comprises weight data and wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the first device to: process the weight data with respect to at least one of the first model or the second model to generate the third model.
13. The system of claim 11, wherein the third data comprises training data and wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the first device to: process the training data with respect to at least one of the first model or the second model to generate the third model.
14. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: generate the second data at least in part by determining weight data representing the at least one difference between the first model and the second model; and send the second data to the remote system at least in part by sending the weight data to the remote system.
15. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine a first difference value between a first weight value associated with the first model and a second weight value associated with the second model; determine a second difference value between a third weight value associated with the first model and a fourth weight value associated with the second model; determine that the first difference value is above a threshold value; determine that the second difference value is below the threshold value; and generate the second data at least in part by including the first difference value, but not the second difference value, in the second data.
16. The system of claim 10, wherein the at least one memory further comprises instructions that, when executed by the at least one processor, further cause the system to, prior to determination of the audio data: receive, by the first device and from the remote system, the first model.
17. The system of claim 10,
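
Claims 5 and 15 describe sending only those weight differences that exceed a threshold. The following is a minimal sketch of that filtering, not the patented implementation: the function names, the dictionary-of-floats model representation, and the use of the absolute value of each difference are all assumptions (the claims say only that a difference value is "above" or "below" a threshold value).

```python
from typing import Dict

# Hypothetical model representation: weight name -> weight value.
Weights = Dict[str, float]


def weight_differences(first_model: Weights, second_model: Weights) -> Weights:
    """Difference (second minus first) for every weight present in both models."""
    return {
        name: second_model[name] - first_model[name]
        for name in first_model
        if name in second_model
    }


def select_differences(differences: Weights, threshold: float) -> Weights:
    """Keep only differences whose magnitude is above the threshold.

    Comparing the absolute value is an assumption made for this sketch.
    """
    return {
        name: diff
        for name, diff in differences.items()
        if abs(diff) > threshold
    }


first_model = {"a": 0.5, "b": 1.0}
second_model = {"a": 1.0, "b": 1.0625}
first_data = select_differences(
    weight_differences(first_model, second_model), threshold=0.25
)
```

Here the difference for weight `a` (0.5) is above the threshold and is included in the data to send, while the difference for weight `b` (0.0625) is below it and is left out, which reduces the amount of data sent to the remote system.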

Assignees

Amazon Tech Inc

Classifications

G06F40/216
using statistical methods · CPC title
G10L15/32
Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems · CPC title
G06F40/295
Named entity recognition · CPC title
G10L15/1822 · Primary
Parsing for meaning understanding · CPC title
G10L15/063 · Primary
Training · CPC title

Patent family

Related publications grouped by family.

View patent family 77179371

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11676575B2 cover?: A speech interface device is configured to receive response data from a remote speech processing system for responding to user speech. This response data may be enhanced with information such as remote NLU data. The response data from the remote speech processing system may be compared to local NLU data to improve a speech processing model on the device. Thus, the device may perform supervised …
Who is the assignee on this patent?: Amazon Tech Inc