Training method and device for audio separation network, audio separation method and device, and medium

US12223969B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12223969-B2
Application number	US-202217682399-A
Country	US
Kind code	B2
Filing date	Feb 28, 2022
Priority date	Feb 11, 2020
Publication date	Feb 11, 2025
Grant date	Feb 11, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of training an audio separation network is provided. The method includes obtaining a first separation sample set, the first separation sample set including at least two types of audio with dummy labels, obtaining a first sample set by performing interpolation on the first separation sample set based on perturbation data, obtaining a second separation sample set by separating the first sample set using an unsupervised network, determining losses of second separation samples in the second separation sample set, and adjusting network parameters of the unsupervised network based on the losses of the second separation samples, such that a first loss of a first separation result outputted by an adjusted unsupervised network meets a convergence condition.
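The interpolation step can be illustrated with a short Python sketch. It follows the wording of claim 2 (each separation sample is multiplied by its own perturbation value and the products are summed), but the function name `interpolate`, the toy signals, and the chosen weights are assumptions made for illustration and do not come from the patent.

```python
from typing import List, Sequence


def interpolate(samples: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Scale each separated sample by its own weight, then sum the results."""
    if not samples:
        raise ValueError("need at least one sample")
    if len(samples) != len(weights):
        raise ValueError("need exactly one weight per sample")
    length = len(samples[0])
    if any(len(sample) != length for sample in samples):
        raise ValueError("all samples must have the same length")

    mixed = [0.0] * length
    for sample, weight in zip(samples, weights):
        for i, value in enumerate(sample):
            mixed[i] += weight * value
    return mixed


# Hypothetical toy data: two separated signals of four values each.
speech = [1.0, 2.0, 3.0, 4.0]
noise = [0.5, 0.5, 0.5, 0.5]

# Hypothetical perturbation values, one per separated signal.
mixture = interpolate([speech, noise], weights=[1.0, 2.0])
print(mixture)  # prints [2.0, 3.0, 4.0, 5.0]
```

In this sketch the mixture plays the role of the "first sample set" that is passed to the network for separation; how the perturbation values are drawn is not stated in the text above and is left open here.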

First claim

Opening claim text (preview).

What is claimed is:

1. A method of training an audio separation network, the method comprising: obtaining a first separation sample set, the first separation sample set comprising at least two types of audio signals with dummy labels; obtaining a first sample set by performing interpolation on the first separation sample set based on perturbation data; obtaining a second separation sample set by separating the first sample set using an unsupervised network; determining losses of second separation samples in the second separation sample set; and adjusting network parameters of the unsupervised network based on the losses of the second separation samples, such that a first loss of a first separation result outputted by an adjusted unsupervised network meets a convergence condition, wherein the determining the losses of the second separation samples in the second separation sample set comprises obtaining a loss set by: determining a loss between each second separation sample and true value data of the first separation sample set; and obtaining a loss of each second separation sample, and wherein the adjusting the network parameters of the unsupervised network based on the losses of the second separation samples comprises obtaining updated network parameters by: determining a minimum loss from the loss set; and updating the network parameters of the unsupervised network based on the minimum loss.

2. The method of claim 1, wherein the performing interpolation on the first separation sample set based on the perturbation data comprises obtaining an adjusted data set by multiplying each first separation sample by different perturbation data in a one-to-one manner; and wherein obtaining the first sample set comprises performing summation on adjusted data in the adjusted data set.

3. The method of claim 1, wherein the obtaining the first separation sample set comprises: obtaining a sample audio signal comprising at least an unlabeled audio signal; separating the sample audio signal according to types of audio signals using a trained supervised network, and obtaining separation samples of each type, wherein network parameters of the trained supervised network are updated based on the network parameters of the unsupervised network.

4. The method of claim 3, wherein, before the separating the sample audio signal, and the obtaining separation samples of each type, the method further comprises: obtaining a labeled clean sample audio signal and a noise sample audio signal; obtaining a third sample set by mixing the labeled clean sample audio signal and the noise sample audio signal; obtaining a fifth separation sample set by separating the third sample set using a to-be-trained supervised network; determining losses of fifth separation samples in the fifth separation sample set; and obtaining the trained supervised network by adjusting network parameters of the to-be-trained supervised network based on the losses of the fifth separation samples, such that a third loss of a third separation result outputted by an adjusted to-be-trained supervised network meets the convergence condition.

5. The method of claim 1, wherein, after the updating the network parameters of the unsupervised network based on the minimum loss, the method further comprises: adjusting the network parameters of the trained supervised network by obtaining an updated supervised network by feeding back the updated network parameters to the trained supervised network.

6. The method of claim 5, wherein the feeding back the updated network parameters to the trained supervised network comprises determining moving average values of the updated network parameters, wherein the obtaining the updated supervised network further comprises adjusting the network parameters of the trained supervised network by feeding back the moving average values to the trained supervised network.

7. The method of claim 6, wherein, after the feeding back the updated network parameters to the trained supervised network, the method further comprises: obtaining a third separation sample set by separating the sample audio signal again by using the updated supervised network; obtaining a second sample set by performing interpolation on the third separation sample set by using the perturbation data; inputting the second sample set into an updated unsupervised network; obtaining a fourth separation sample set by performing prediction and separation on the second sample set using the updated unsupervised network; determining losses of fourth separation samples in the fourth separation sample set; and adjusting the network parameters of the updated unsupervised network and the network parameters of the updated supervised network using the losses of the fourth separation samples, such that a second loss of a second separation result outputted by an adjusted updated unsupervised network meets the convergence condition.

8. An apparatus for training an audio separation network, the apparatus comprising: at least one memory configured to store computer program code; and at least one processor configured to access said computer program code and operate as instructed by said computer program code, said computer program code comprising: first obtaining code configured to cause the at least one processor to obtain a first separation sample set, the first separation sample set comprising at least two types of audio signals with dummy labels; second obtaining code configured to cause the at least one processor to obtain a first sample set by performing interpolation on the first separation sample set based on perturbation data; third obtaining code configured to cause the at least one processor to obtain a second separation sample set by separating the first sample set using an unsupervised network; first determining code configured to cause the at least one processor to determine losses of second separation samples in the second separation sample set; and first adjusting code configured to cause the at least one processor to adjust network parameters of the unsupervised network based on the losses of the second separation samples, such that a first loss of a first separation result outputted by an adjusted unsupervised network meets a convergence condition, wherein the first determining code further causes the at least one processor to obtain a loss set by: determining a loss between each second separation sample and true value data of the first separation sample set; and obtaining a loss of each second separation sample, and wherein the first adjusting code is further configured to cause the at least one processor to obtain updated network parameters by: determining a minimum loss from the loss set; and updating the network parameters of the unsupervised network based on the minimum loss.

9. The apparatus of claim 8, wherein the performing interpolation on the first separation sample set based on the perturbation data comprises obtaining an adjusted data set by multiplying each first separation sample by different perturbation data in a one-to-one manner; and wherein the second obtaining code is further configured to cause the at least one processor to obtain the first sample set by performing summation on adjusted data in the adjusted data set.

10. The apparatus of claim 8, wherein the first obtaining code further causes the at least one processor to: obtain a sample audio signal comprising at least an unlabeled audio signal; separate the sample audio signal according to types of audio signals using a trained supervised network, and obtain separation samples of each type, wherein network parameters of the trained supervised network a …
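Claim 1 selects a minimum from a set of per-sample losses, and claim 6 feeds moving averages of the updated parameters back to the supervised network. A minimal Python sketch of both steps follows, under stated assumptions: mean squared error stands in for the loss, which the claims do not specify; an exponential moving average with a hypothetical `decay` value stands in for the unspecified moving average; and network parameters are modeled as a plain dictionary of floats. All function and variable names are illustrative and do not come from the patent.

```python
from typing import Dict, List, Sequence, Tuple


def mse(estimate: Sequence[float], reference: Sequence[float]) -> float:
    """Mean squared error (an assumed loss; the claims do not name one)."""
    if len(estimate) != len(reference) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference)


def minimum_loss(estimates: Sequence[Sequence[float]],
                 reference: Sequence[float]) -> Tuple[int, float]:
    """Build the loss set and return (index, value) of its smallest entry."""
    losses: List[float] = [mse(estimate, reference) for estimate in estimates]
    best = min(range(len(losses)), key=losses.__getitem__)
    return best, losses[best]


def moving_average_update(supervised_params: Dict[str, float],
                          unsupervised_params: Dict[str, float],
                          decay: float = 0.5) -> Dict[str, float]:
    """Blend updated parameters into the supervised network's parameters.

    Assumes an exponential moving average; `decay` is a hypothetical value.
    """
    return {
        name: decay * supervised_params[name]
        + (1.0 - decay) * unsupervised_params[name]
        for name in supervised_params
    }


# Hypothetical toy data: one reference signal and two separation outputs.
reference = [1.0, 2.0]
estimates = [[1.0, 4.0], [1.0, 2.5]]
best_index, best_loss = minimum_loss(estimates, reference)
print(best_index, best_loss)  # prints 1 0.125

# Hypothetical single-weight networks.
updated = moving_average_update({"w": 1.0}, {"w": 3.0}, decay=0.5)
print(updated)  # prints {'w': 2.0}
```

In a real training loop the minimum loss would drive a gradient step on the unsupervised network before the averaged parameters are fed back; that optimizer step is omitted here because the claims shown do not describe it.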

Assignees

Tencent Tech Shenzhen Co Ltd

Inventors

Classifications

G06N3/0464: Convolutional networks [CNN, ConvNet]
G06N3/0895: Weakly supervised learning, e.g. semi-supervised or self-supervised learning
G06N3/09: Supervised learning
G06N3/045: Combinations of networks
G10L25/30: using neural networks

Patent family

Related publications grouped by family.

Patent family 71183362

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12223969B2 cover?: A method of training an audio separation network is provided. The method includes obtaining a first separation sample set, the first separation sample set including at least two types of audio with dummy labels, obtaining a first sample set by performing interpolation on the first separation sample set based on perturbation data, obtaining a second separation sample set by separating the first …
Who is the assignee on this patent?: Tencent Tech Shenzhen Co Ltd
What technology area does this patent fall under?: Primary CPC classification G10L19/06. Mapped technology areas include Physics.
When was this patent published?: Publication date: Feb 11, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

System and Method for Unsupervised Domain Adaptation with Mixup Training

Robust neural network acoustic model with side task prediction of reference signals

Source separation using nonnegative matrix factorization with an automatically determined number of bases