Obfuscating audio samples for health privacy contexts

US11929063B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11929063-B2
Application number	US-202117534396-A
Country	US
Kind code	B2
Filing date	Nov 23, 2021
Priority date	Nov 23, 2021
Publication date	Mar 12, 2024
Grant date	Mar 12, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A supervised discriminator for detecting bio-markers in an audio sample dataset is trained and a denoising autoencoder is trained to learn a latent space that is used to reconstruct an output audio sample with a same fidelity as an input audio sample of the audio sample dataset. A conditional auxiliary generative adversarial network (GAN) [is] trained to generate the output audio sample with the same fidelity as the input audio sample, wherein the output audio sample is void of the bio-markers. The conditional auxiliary generative adversarial network (GAN), the corresponding supervised discriminator, and the corresponding denoising autoencoder are deployed in an audio processing system.

First claim

Claim text as published (claims 1 through 20).

What is claimed is:

1. A method comprising: training, using at least one processor, a supervised discriminator to detect bio-markers in an audio sample dataset; training, using the at least one processor, a denoising autoencoder to learn a latent space that is used to reconstruct an output audio sample with a same fidelity as an input audio sample of the audio sample dataset; training, using the at least one processor, a conditional auxiliary generative adversarial network (GAN) to generate the output audio sample with the same fidelity as the input audio sample, wherein the output audio sample is void of the bio-markers; and deploying the conditional auxiliary generative adversarial network (GAN), the supervised discriminator, and the denoising autoencoder in an audio processing system.
2. The method of claim 1, further comprising minimizing a classification generalization error during the training of the supervised discriminator.
3. The method of claim 1, wherein the training of the denoising autoencoder to learn the latent space that is used to reconstruct the output audio sample is performed by minimizing a KL-divergence based reconstruction error loss plus a fidelity term.
4. The method of claim 3, wherein the KL-divergence based reconstruction error loss plus the fidelity term is based on one or more of a frequency response, a distortion, noise, and time-based errors.
5. The method of claim 1, further comprising using a discriminator function as the supervised discriminator in the conditional auxiliary generative adversarial network (GAN), and the denoising autoencoder as a generator, the conditional auxiliary generative adversarial network (GAN) being trained such that the discriminator function attempts to maximize an entropy that clean samples pass through the discriminator and minimize an entropy that a denoised representation of bad samples containing the bio-markers pass through the supervised discriminator.
6. The method of claim 5, further comprising freezing the generator and backpropagating through the discriminator function using a gradient from the generative adversarial network loss.
7. The method of claim 6, further comprising freezing the discriminator function and propagating through the generator using the gradient from the generative adversarial network loss combined with a decaying constant times a reconstruction error loss of the generator.
8. The method of claim 1, further comprising iterating the training of the denoising autoencoder and the training of the conditional auxiliary generative adversarial network until convergence.
9. The method of claim 1, wherein the supervised discriminator comprises a convolutional neural network that inputs mel-frequency cepstral coefficients (MFCC) representations of the audio sample dataset and classifies a presence of the bio-marker, where a first classification represents the presence of the corresponding bio-marker and a second classification represents an absence of the corresponding bio-marker.
10. The method of claim 1, further comprising creating the supervised discriminator via model distillation from a black box teacher model.
11. The method of claim 1, wherein the training of the supervised discriminator is based on extracted features from a mel-representation of the audio sample dataset.
12. The method of claim 1, wherein the denoising autoencoder comprises a convolutional neural network that inputs MFCC representations of the audio sample dataset and produces a denoised version of the MFCC representations.
13. The method of claim 1, further comprising obfuscating one or more bio-markers of speech of a human subject using the conditional auxiliary generative adversarial network (GAN), the supervised discriminator, and the denoising autoencoder so that the audio processing system has access to an intelligible version of the speech but does not have access to the one or more bio-markers of the human subject.
14. An apparatus comprising: a memory; and at least one processor, coupled to said memory, and operative to perform operations comprising: training a supervised discriminator to detect bio-markers in an audio sample dataset; training a denoising autoencoder to learn a latent space that is used to reconstruct an output audio sample with a same fidelity as an input audio sample of the audio sample dataset; training a conditional auxiliary generative adversarial network (GAN) to generate the output audio sample with the same fidelity as the input audio sample, wherein the output audio sample is void of the bio-markers; and deploying the conditional auxiliary generative adversarial network (GAN), the supervised discriminator, and the denoising autoencoder in an audio processing system.
15. The apparatus of claim 14, the operations further comprising minimizing a classification generalization error during the training of the supervised discriminator.
16. The apparatus of claim 14, wherein the training of the denoising autoencoder to learn the latent space that is used to reconstruct the output audio sample is performed by minimizing a KL-divergence based reconstruction error loss plus a fidelity term.
17. The apparatus of claim 14, wherein the operations further comprise using a discriminator function as the supervised discriminator in the conditional auxiliary generative adversarial network (GAN), and the denoising autoencoder as a generator, the conditional auxiliary generative adversarial network (GAN) being trained such that the discriminator function attempts to maximize an entropy that clean samples pass through the discriminator and minimize an entropy that a denoised representation of bad samples containing the bio-markers pass through the supervised discriminator.
18. The apparatus of claim 17, the operations further comprising freezing the generator and backpropagating through the discriminator function using a gradient from the generative adversarial network loss.
19. The apparatus of claim 18, the operations further comprising freezing the discriminator function and propagating through the generator using the gradient from the generative adversarial network loss combined with a decaying constant times a reconstruction error loss of the generator.
20. A computer program product for federated learning, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a computer to cause the computer to perform operations comprising: training a supervised discriminator to detect bio-markers in an audio sample dataset; training a denoising autoencoder to learn a latent space that is used to reconstruct an output audio sample with a same fidelity as an input audio sample of the audio sample dataset; training a conditional auxiliary generative adversarial network (GAN) to generate the output audio sample with the same fidelity as the input audio sample, wherein the output audio sample is void of the bio-markers; and deploying the conditional auxiliary generative adversarial network (GAN), the supervised discriminator, and the denoising autoencoder in an audio processing system.
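
Illustrative sketch (not part of the patent text). Claims 6 and 7 describe an alternating update: the generator is frozen while the discriminator is updated from the GAN loss, then the discriminator is frozen while the generator is updated from the GAN loss combined with a decaying constant times its reconstruction loss. The Python below only models that bookkeeping with plain numbers. It assumes "combined with" means a sum and that the constant decays exponentially; the claims specify neither. The names Step, decaying_constant, generator_loss, and alternating_schedule are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One half of an alternating update round."""
    round_index: int
    updated: str         # network whose weights receive gradients
    frozen: str          # network held fixed during this step
    recon_weight: float  # weight on the generator's reconstruction loss


def decaying_constant(round_index: int, initial: float = 1.0, decay: float = 0.5) -> float:
    """Assumed exponential schedule; the claims do not specify the decay form."""
    return initial * decay ** round_index


def generator_loss(gan_loss: float, recon_loss: float, weight: float) -> float:
    """Claim 7, read as a sum: GAN loss + decaying constant * reconstruction loss."""
    return gan_loss + weight * recon_loss


def alternating_schedule(rounds: int, initial: float = 1.0, decay: float = 0.5) -> list[Step]:
    """Claim 6 then claim 7, repeated: update the discriminator, then the generator."""
    steps: list[Step] = []
    for r in range(rounds):
        # Claim 6: generator frozen, discriminator updated from the GAN loss alone.
        steps.append(Step(r, "discriminator", "generator", 0.0))
        # Claim 7: discriminator frozen, generator updated from the combined loss.
        weight = decaying_constant(r, initial, decay)
        steps.append(Step(r, "generator", "discriminator", weight))
    return steps


if __name__ == "__main__":
    for step in alternating_schedule(rounds=3):
        print(step)
```

Under these assumptions the reconstruction term dominates the generator update in early rounds and fades in later ones, so the adversarial signal from the discriminator gradually takes over.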

Assignees

IBM

Inventors

Classifications

Code	CPC title
G10L15/16 (primary)	using artificial neural networks
G06N3/045	Combinations of networks
G06N3/084	Backpropagation, e.g. using gradient descent
G10L25/30	using neural networks
G10L25/87	Detection of discrete points within a voice signal
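
Illustrative sketch (not from the page's data pipeline). The classification codes in this section follow the CPC layout of section letter, two-digit class, subclass letter, main group, and subgroup. The hypothetical parse_cpc helper below splits a symbol into those parts and looks up the section title, which is how a primary code of G10L15/16 maps to the technology area Physics. The section titles are abbreviated, and the pattern is an assumption that covers the codes on this page rather than every valid CPC symbol.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# CPC section letters and their (abbreviated) titles.
CPC_SECTIONS = {
    "A": "Human Necessities",
    "B": "Performing Operations; Transporting",
    "C": "Chemistry; Metallurgy",
    "D": "Textiles; Paper",
    "E": "Fixed Constructions",
    "F": "Mechanical Engineering; Lighting; Heating; Weapons; Blasting",
    "G": "Physics",
    "H": "Electricity",
    "Y": "General Tagging of New Technological Developments",
}

# Section letter, two-digit class, subclass letter, main group, "/", subgroup.
_CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])(\d{1,4})/(\d{2,6})$")


@dataclass(frozen=True)
class CpcSymbol:
    section: str     # e.g. "G"
    cpc_class: str   # e.g. "G10"
    subclass: str    # e.g. "G10L"
    main_group: str  # e.g. "15"
    subgroup: str    # e.g. "16"

    @property
    def section_title(self) -> str:
        return CPC_SECTIONS[self.section]


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC symbol such as 'G10L15/16' into its hierarchical parts."""
    match = _CPC_PATTERN.match(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    section, class_digits, subclass_letter, main_group, subgroup = match.groups()
    cpc_class = section + class_digits
    return CpcSymbol(section, cpc_class, cpc_class + subclass_letter, main_group, subgroup)


if __name__ == "__main__":
    for code in ["G10L15/16", "G06N3/045", "G06N3/084", "G10L25/30", "G10L25/87"]:
        parsed = parse_cpc(code)
        print(code, parsed.subclass, parsed.section_title)
```

Every code listed for this patent begins with G, so all five resolve to the same section.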

Patent family

Related publications grouped by family.

Family ID: 84389292

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11929063B2 cover?: A supervised discriminator for detecting bio-markers in an audio sample dataset is trained and a denoising autoencoder is trained to learn a latent space that is used to reconstruct an output audio sample with a same fidelity as an input audio sample of the audio sample dataset. A conditional auxiliary generative adversarial network (GAN) trained to generate the output audio sample with the sam…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G10L15/16. Mapped technology areas include Physics.
When was this patent published?: Published Mar 12, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

System and method for continuous privacy-preserved audio collection

Processing speech signals in voice-based profiling

Comprehensive and context-sensitive neonatal pain assessment system and methods using multiple modalities
