What technology area does this patent fall under?

Primary CPC classification G10L15/065. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 23, 2016 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Privacy-sensitive speech model creation via aggregation of multiple user models

US9424836B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9424836-B2
Application number	US-201514745630-A
Country	US
Kind code	B2
Filing date	Jun 22, 2015
Priority date	Nov 5, 2012
Publication date	Aug 23, 2016
Grant date	Aug 23, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can then be sent to a speech recognition server or third-party entity. Since the audio data and transcriptions are already processed, the statistics or acoustic model data is devoid of any information that could be human-readable or machine readable such as to enable reconstruction of audio data. Thus, such converted data sent to a server does not include personal or confidential information. Third-party servers can then continually update speech models without storing personal and confidential utterances of users.
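The abstract's idea of deriving statistics locally can be illustrated with a minimal sketch. It assumes, purely for illustration, that the "derived statistics" are per-component frame counts and feature sums for a Gaussian-mixture acoustic model, and that each frame is hard-assigned to its nearest component mean. The abstract does not specify either choice, and the names `AdaptationStats`, `nearest_component`, and `accumulate_stats` are hypothetical. The sketch makes no guarantee that such statistics prevent reconstruction; it only shows that aggregates, not frames or transcriptions, would leave the device.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class AdaptationStats:
    """Hypothetical payload: per-component counts and feature sums only."""
    counts: List[float]
    sums: List[List[float]]
    num_utterances: int = 0


def nearest_component(frame: Sequence[float],
                      means: Sequence[Sequence[float]]) -> int:
    """Index of the component mean closest to the frame (squared Euclidean).

    Assumption: hard assignment stands in for the soft posteriors a real
    recognizer would compute.
    """
    return min(
        range(len(means)),
        key=lambda k: sum((f - m) ** 2 for f, m in zip(frame, means[k])),
    )


def accumulate_stats(utterances: Sequence[Sequence[Sequence[float]]],
                     means: Sequence[Sequence[float]]) -> AdaptationStats:
    """Reduce feature frames to aggregate statistics on the device.

    `utterances` is a list of utterances, each a list of feature frames.
    Only the returned aggregates would be transmitted.
    """
    dim = len(means[0])
    stats = AdaptationStats(
        counts=[0.0] * len(means),
        sums=[[0.0] * dim for _ in means],
    )
    for frames in utterances:
        for frame in frames:
            k = nearest_component(frame, means)
            stats.counts[k] += 1.0
            for d in range(dim):
                stats.sums[k][d] += frame[d]
        stats.num_utterances += 1
    return stats


if __name__ == "__main__":
    toy_means = [[0.0, 0.0], [10.0, 10.0]]
    toy_utterances = [[[1.0, 1.0], [9.0, 11.0]], [[-1.0, 1.0]]]
    print(accumulate_stats(toy_utterances, toy_means))
```

With only a handful of frames per component, the sums could still say a lot about individual frames, which is one plausible reading of why the dependent claims set a minimum amount of audio.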

First claim

Opening claim text (preview).

The invention claimed is:
1. A computer-implemented method comprising acts of: receiving, via at least one network, adaptation data generated at least in part by performing statistical processing on audio data comprising at least one user utterance; and using the adaptation data to update at least one acoustic model for use in speech recognition processing, wherein the adaptation data is in a format that prevents reconstruction of the audio data.
2. The computer-implemented method of claim 1, wherein the adaptation data is received via the at least one network in an encrypted form.
3. The computer-implemented method of claim 1, wherein: the adaptation data comprises first adaptation data received from a first device and second adaptation data received from a second device different from the first device; and the act of using the adaptation data comprises aggregating the first adaptation data and the second adaptation data.
4. The computer-implemented method of claim 1, wherein: the adaptation data comprises first adaptation data and second adaptation data; the first adaptation data is generated at least in part by performing statistical processing on first audio data comprising at least one first utterance spoken by a first user; the second adaptation data is generated at least in part by performing statistical processing on second audio data comprising at least one second utterance spoken by a second user different from the first user; and the act of using the adaptation data comprises aggregating the first adaptation data and the second adaptation data.
5. The computer-implemented method of claim 1, wherein the adaptation data is generated at least in part by performing statistical processing on at least a selected threshold amount of audio data.
6. The computer-implemented method of claim 5, wherein the selected threshold amount of audio data is at least 100 utterances.
7. The computer-implemented method of claim 1, wherein the adaptation data comprises at least one update to at least one component of the at least one acoustic model.
8. A system comprising: at least one memory storing executable instructions; and at least one processor programmed by the executable instructions to perform a method comprising acts of: receiving, via at least one network, adaptation data generated at least in part by performing statistical processing on audio data comprising at least one user utterance; and using the adaptation data to update at least one acoustic model for use in speech recognition processing, wherein the adaptation data is in a format that prevents reconstruction of the audio data.
9. The system of claim 8, wherein the adaptation data is received via the at least one network in an encrypted form.
10. The system of claim 8, wherein: the adaptation data comprises first adaptation data received from a first device and second adaptation data received from a second device different from the first device; and the act of using the adaptation data comprises aggregating the first adaptation data and the second adaptation data.
11. The system of claim 8, wherein: the adaptation data comprises first adaptation data and second adaptation data; the first adaptation data is generated at least in part by performing statistical processing on first audio data comprising at least one first utterance spoken by a first user; the second adaptation data is generated at least in part by performing statistical processing on second audio data comprising at least one second utterance spoken by a second user different from the first user; and the act of using the adaptation data comprises aggregating the first adaptation data and the second adaptation data.
12. The system of claim 8, wherein the adaptation data is generated at least in part by performing statistical processing on at least a selected threshold amount of audio data.
13. The system of claim 12, wherein the selected threshold amount of audio data is at least 100 utterances.
14. The system of claim 8, wherein the adaptation data comprises at least one update to at least one component of the at least one acoustic model.
15. At least one non-transitory computer-readable medium having encoded thereon executable instructions which, when executed by at least one processor, cause the at least one processor to perform a method comprising acts of: receiving, via at least one network, adaptation data generated at least in part by performing statistical processing on audio data comprising at least one user utterance; and using the adaptation data to update at least one acoustic model for use in speech recognition processing, wherein the adaptation data is in a format that prevents reconstruction of the audio data.
16. The at least one non-transitory computer-readable medium of claim 15, wherein the adaptation data is received via the at least one network in an encrypted form.
17. The at least one non-transitory computer-readable medium of claim 15, wherein: the adaptation data comprises first adaptation data received from a first device and second adaptation data received from a second device different from the first device; and the act of using the adaptation data comprises aggregating the first adaptation data and the second adaptation data.
18. The at least one non-transitory computer-readable medium of claim 15, wherein: the adaptation data comprises first adaptation data and second adaptation data; the first adaptation data is generated at least in part by performing statistical processing on first audio data comprising at least one first utterance spoken by a first user; the second adaptation data is generated at least in part by performing statistical processing on second audio data comprising at least one second utterance spoken by a second user different from the first user; and the act of using the adaptation data comprises aggregating the first adaptation data and the second adaptation data.
19. The at least one non-transitory computer-readable medium of claim 15, wherein the adaptation data is generated at least in part by performing statistical processing on at least a selected threshold amount of audio data.
20. The at least one non-transitory computer-readable medium of claim 19, wherein the selected threshold amount of audio data is at least 100 utterances.
21. The at least one non-transitory computer-readable medium of claim 15, wherein the adaptation data comprises at least one update to at least one component of the at least one acoustic model.
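The receiving side described in the claims (aggregating adaptation data from several devices, then updating an acoustic model) might look roughly like the sketch below. Several things here are assumptions rather than anything the claims state: the payload is taken to be per-component counts and feature sums, each payload is assumed to report how many utterances it was built from, and the update rule is a MAP-style interpolation of component means with a relevance factor `tau`, a standard adaptation technique that the claims do not name. The names `Payload`, `aggregate`, and `map_update_means` are hypothetical. The only value taken from the claim text is the 100-utterance threshold.

```python
from typing import List, NamedTuple, Sequence

# Threshold from the dependent claims: at least 100 utterances.
MIN_UTTERANCES = 100


class Payload(NamedTuple):
    """Hypothetical adaptation data received from one device."""
    counts: List[float]        # frames assigned to each model component
    sums: List[List[float]]    # per-component sums of feature vectors
    num_utterances: int        # utterances the statistics were built from


def aggregate(payloads: Sequence[Payload]) -> Payload:
    """Add up statistics from all payloads that meet the threshold.

    Assumption: the server trusts the reported utterance count; the claims
    place the threshold on how the data is generated, not on a server check.
    """
    accepted = [p for p in payloads if p.num_utterances >= MIN_UTTERANCES]
    if not accepted:
        raise ValueError("no payload meets the utterance threshold")
    n_comp = len(accepted[0].counts)
    dim = len(accepted[0].sums[0])
    counts = [0.0] * n_comp
    sums = [[0.0] * dim for _ in range(n_comp)]
    for p in accepted:
        for k in range(n_comp):
            counts[k] += p.counts[k]
            for d in range(dim):
                sums[k][d] += p.sums[k][d]
    return Payload(counts, sums, sum(p.num_utterances for p in accepted))


def map_update_means(means: Sequence[Sequence[float]],
                     stats: Payload,
                     tau: float = 10.0) -> List[List[float]]:
    """MAP-style mean update: (tau * old_mean + sum) / (tau + count).

    Components with no observed frames keep their old mean.
    """
    return [
        [(tau * means[k][d] + stats.sums[k][d]) / (tau + stats.counts[k])
         for d in range(len(means[k]))]
        for k in range(len(means))
    ]


if __name__ == "__main__":
    device_a = Payload([10.0, 0.0], [[20.0], [0.0]], 120)
    device_b = Payload([10.0, 5.0], [[0.0], [50.0]], 150)
    pooled = aggregate([device_a, device_b])
    print(map_update_means([[0.0], [4.0]], pooled))
```

Because counts and sums are additive, the pooled statistics do not depend on the order in which devices report, and the server never needs per-user audio to compute the update.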

Assignees

Nuance Communications Inc

Inventors

Classifications

G10L15/065 · Primary
Adaptation · CPC title
G06F21/6245
Protecting personal data, e.g. for financial or medical purposes · CPC title
H04L63/0407
wherein the identity of one or more communicating identities is hidden (cryptographic mechanisms or cryptographic arrangements for anonymous credentials or for identity based cryptographic systems H04L9/00) · CPC title
G06F21/78
to assure secure storage of data (address-based protection against unauthorised use of memory G06F12/14; record carriers for use with machines and with at least a part designed to carry digital markings G06K19/00) · CPC title
G10L15/04
Segmentation; Word boundary detection · CPC title
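The CPC symbols listed above follow a fixed layout: a section letter, a two-digit class, a subclass letter, a main group, and a subgroup after the slash. The sketch below splits a symbol into those parts and looks up the section title, which is consistent with G10L15/065 being mapped to Physics. How this page actually derives its technology areas is not stated, so treating the section letter as the mapping is an assumption, and `parse_cpc` and `technology_area` are hypothetical helper names. Section titles are abbreviated.

```python
import re
from typing import NamedTuple

# CPC section letters with (abbreviated) section titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

# section letter, 2-digit class, subclass letter, main group, "/" subgroup
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


class CpcSymbol(NamedTuple):
    section: str      # e.g. "G"
    cpc_class: str    # e.g. "G10"
    subclass: str     # e.g. "G10L"
    main_group: str   # e.g. "15"
    subgroup: str     # e.g. "065"


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol such as 'G10L15/065' into its hierarchy levels."""
    match = CPC_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not a CPC symbol: {symbol!r}")
    section, cls, sub, group, subgroup = match.groups()
    return CpcSymbol(section, section + cls, section + cls + sub,
                     group, subgroup)


def technology_area(symbol: str) -> str:
    """Section title for a CPC symbol (assumed mapping, see lead-in)."""
    return CPC_SECTIONS[parse_cpc(symbol).section]


if __name__ == "__main__":
    for code in ["G10L15/065", "G06F21/6245", "H04L63/0407",
                 "G06F21/78", "G10L15/04"]:
        print(code, parse_cpc(code).subclass, technology_area(code))
```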

Patent family

Related publications grouped by family.

View patent family 50623178

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9424836B2 cover?: Techniques disclosed herein include systems and methods for privacy-sensitive training data collection for updating acoustic models of speech recognition systems. In one embodiment, the system locally creates adaptation data from raw audio data. Such adaptation can include derived statistics and/or acoustic model update parameters. The derived statistics and/or updated acoustic model data can t…
Who is the assignee on this patent?: Nuance Communications Inc