Method and electronic device
US-2023274758-A1 · Aug 31, 2023 · US
US12548586B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12548586-B2 |
| Application number | US-202318097062-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 13, 2023 |
| Priority date | Feb 22, 2022 |
| Publication date | Feb 10, 2026 |
| Grant date | Feb 10, 2026 |
Official abstract text for this publication.
A generative adversarial network-based audio signal generation model for generating a high quality audio signal may comprise: a generator generating an audio signal with an external input; a harmonic-percussive separation model separating the generated audio signal into a harmonic component signal and a percussive component signal; and at least one discriminator evaluating whether each of the harmonic component signal and the percussive component signal is real or fake.
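The masking step of a harmonic-percussive separation can be illustrated with a minimal sketch. It assumes median-filter soft masking, a common textbook technique that the abstract does not name, and it works directly on a toy magnitude spectrogram, so the short-time Fourier transform and its inverse are omitted. The names `hp_soft_masks`, `apply_mask`, and the `kernel` parameter are hypothetical, and the separation model described in the publication may work differently.

```python
from statistics import median


def hp_soft_masks(spec, kernel=3):
    """Return (harmonic_mask, percussive_mask) for a magnitude spectrogram.

    spec[f][t] is the magnitude at frequency bin f and time frame t.
    Median-filter masking is an assumption made for illustration only.
    """
    n_freq, n_time = len(spec), len(spec[0])
    half = kernel // 2
    harm_mask = [[0.0] * n_time for _ in range(n_freq)]
    perc_mask = [[0.0] * n_time for _ in range(n_freq)]
    for f in range(n_freq):
        for t in range(n_time):
            # Median along time keeps energy that persists across frames.
            h = median(spec[f][max(0, t - half):t + half + 1])
            # Median along frequency keeps energy that spans adjacent bins.
            p = median(
                spec[g][t]
                for g in range(max(0, f - half), min(n_freq, f + half + 1))
            )
            denom = h * h + p * p
            if denom == 0:
                # No evidence either way: split the bin evenly.
                harm_mask[f][t] = perc_mask[f][t] = 0.5
            else:
                harm_mask[f][t] = h * h / denom
                perc_mask[f][t] = p * p / denom
    return harm_mask, perc_mask


def apply_mask(spec, mask):
    """Multiply a spectrogram by a mask, element by element."""
    return [[s * m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(spec, mask)]


# Toy spectrogram: a sustained tone in bin 3 and a click in frame 5.
N = 7
spec = [[0.0] * N for _ in range(N)]
for t in range(N):
    spec[3][t] = 1.0
for f in range(N):
    spec[f][5] = 1.0

harm_mask, perc_mask = hp_soft_masks(spec)
harmonic = apply_mask(spec, harm_mask)
percussive = apply_mask(spec, perc_mask)
```

In this toy case the sustained tone is assigned to the harmonic component and the click to the percussive component. Because the two masks sum to one in every bin, the two components add back up to the input spectrogram.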
Opening claim text (preview).
What is claimed is:

1. A generative adversarial network-based audio signal generation model executed by a processor to generate a high quality audio signal, the audio signal generation model comprising: a generator generating an audio signal with an external input; a harmonic-percussive separation model separating the generated audio signal into a harmonic component signal and a percussive component signal; a first discriminator evaluating whether the harmonic component signal is real or fake; and a second discriminator evaluating whether the percussive component signal is real or fake, wherein the first discriminator has a first kernel dilation factor greater than a second kernel dilation factor of the second discriminator, and the first discriminator has a first receptive field greater than a second receptive field of the second discriminator, wherein the generator is trained to minimize errors between samples of real signals and audio signals generated by the generator, using a restoration loss function applied to the generator, in a first phase training, and wherein the generator, the harmonic-percussive separation model, the first discriminator, and the second discriminator are adversarial trained through end-to-end learning, after the first phase training, in a second phase training.

2. The signal generation model of claim 1, wherein the generator and the at least one discriminator allow error backpropagation of a loss function.

3. The signal generation model of claim 1, wherein the harmonic-percussive separation model comprises: a short-time Fourier transform model converting the generated audio signal into a spectrogram; a harmonic masking model and a percussive masking model masking a harmonic component and a percussive component, respectively; and an inverse short-time Fourier transform module converting the masked spectrogram into the audio signal.

4. A learning method of a generative adversarial network-based audio signal generation model executed by a processor, wherein the method comprising: (a) generating, by a generator, an audio signal; (b) separating the generated audio signal into a harmonic component signal and a percussive component signal using a harmonic-percussive separation model; (c) evaluating, by a first discriminator, whether the harmonic component signal is real or fake, and (d) evaluating, by a second discriminator, whether the percussive component signal is real or fake, wherein the first discriminator has a first kernel dilation factor greater than a second kernel dilation factor of the second discriminator, and the first discriminator has a first receptive field greater than a second receptive field of the second discriminator, wherein the generator is trained to minimize errors between samples of real signals and audio signals generated by the generator, using a restoration loss function applied to the generator, in a first phase training, and wherein (a) to (d) are performed repeatedly for the generator, the harmonic-percussive separation model, the first discriminator, and the second discriminator to learn in a backward propagation manner for adversarial training through end-to-end learning after the first phase training, as a second phase training.

5. An apparatus for generating an audio signal using a generative adversarial network, the apparatus comprising: a memory configured to store at least one instruction; a processor configured to execute the at least one instruction stored in the memory, a generator generating an audio signal with an external input; a harmonic-percussive separation model separating the generated audio signal into a harmonic component signal and a percussive component signal; a first discriminator evaluating whether the harmonic component signal is real or fake; and a second discriminator evaluating whether the percussive component signal is real or fake, wherein the first discriminator has a first kernel dilation factor greater than a second kernel dilation factor of the second discriminator, and the first discriminator has a first receptive field greater than a second receptive field of the second discriminator, wherein the processor is configured to: train the generator to minimize errors between samples of real signals and audio signals generated by the generator, using a restoration loss function applied to the generator, in a first phase training, and adversarial train the generator, the harmonic-percussive separation model, the first discriminator, and the second discriminator, through end-to-end learning, after the first phase training, in a second phase training.

6. The apparatus of claim 5, wherein the generator and the at least one discriminator allow error backpropagation of a loss function.
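The claims tie a larger kernel dilation factor in the first discriminator to a larger receptive field. A minimal sketch of that relationship follows, using the standard receptive-field formula for stacked one-dimensional convolutions. The kernel size, layer counts, and dilation schedules are hypothetical values chosen for illustration; the claims do not specify them.

```python
def receptive_field(kernel_size, dilations, strides=None):
    """Receptive field, in input samples, of stacked 1-D convolutions.

    Each layer widens the field by (kernel_size - 1) * dilation, scaled
    by the product of the strides of all earlier layers.
    """
    strides = strides or [1] * len(dilations)
    rf, jump = 1, 1
    for dilation, stride in zip(dilations, strides):
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical layer configurations, not taken from the claims.
harmonic_dilations = [1, 2, 4, 8]    # first discriminator: growing dilation
percussive_dilations = [1, 1, 1, 1]  # second discriminator: no dilation

rf_harmonic = receptive_field(3, harmonic_dilations)      # 31 samples
rf_percussive = receptive_field(3, percussive_dilations)  # 9 samples
```

With the same kernel size and depth, the dilated stack covers 31 input samples and the undilated stack covers 9. This is consistent with the claimed ordering, in which the discriminator for the harmonic component sees a longer context than the one for the percussive component.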
using neural networks · CPC title
Non-supervised learning, e.g. competitive learning · CPC title
Backpropagation, e.g. using gradient descent · CPC title
Activation functions · CPC title
Generative networks · CPC title