Who is the assignee on this patent?

Vivo Mobile Communication Co Ltd

What technology area does this patent fall under?

Primary CPC classification G10L21/0232. Mapped technology areas include Physics.

When was this patent published?

Publication date Apr 7, 2026 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Speech signal enhancement method and apparatus, and electronic device

US12597433B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12597433-B2
Application number	US-202318484927-A
Country	US
Kind code	B2
Filing date	Oct 11, 2023
Priority date	Apr 16, 2021
Publication date	Apr 7, 2026
Grant date	Apr 7, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A speech signal enhancement method includes: performing noise reduction processing on a first speech signal according to a first time-frequency spectrum and a first power spectrum to obtain a second speech signal, where the first time-frequency spectrum is used to indicate a time domain feature and a frequency domain feature of the first speech signal, and the first power spectrum is a power spectrum of a noise signal in the first speech signal; determining a voiced signal in the second speech signal, and performing gain compensation on the voiced signal; and determining a damage compensation gain of the second speech signal according to the voiced signal on which the gain compensation has been performed, and performing gain compensation on the second speech signal based on the damage compensation gain.
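The noise reduction step summarized above is narrowed in the dependent claims to a minimum-tracked, recursively smoothed noise power estimate followed by an SNR-based gain. The following is a minimal sketch of that idea for a single frequency bin, not the patented implementation. The function names, the window length, the smoothing constants `alpha` and `beta`, the use of `max(posterior - 1, 0)` as the smoothed quantity, and the Wiener-style gain formula are all assumptions made for illustration; the claim text on this page does not give formulas.

```python
def estimate_noise_power(power, window=5, alpha=0.9):
    """Estimate noise power per frame for one frequency bin.

    Takes the smallest power within a sliding window of recent frames
    (the "target power spectrum") and smooths it recursively.
    `window` and `alpha` are hypothetical tuning values.
    """
    noise = []
    smoothed = None
    for t in range(len(power)):
        start = max(0, t - window + 1)
        local_min = min(power[start:t + 1])
        if smoothed is None:
            smoothed = local_min
        else:
            smoothed = alpha * smoothed + (1 - alpha) * local_min
        noise.append(smoothed)
    return noise


def noise_reduction_gain(power, noise, beta=0.8, floor=1e-12):
    """Derive a per-frame gain from posterior and prior SNR estimates.

    Posterior SNR is signal power over noise power. The prior SNR is a
    recursively smoothed version of it (assumed form). The Wiener-style
    gain prior / (1 + prior) is an assumed choice, not taken from the patent.
    """
    gains = []
    prior = None
    for p, n in zip(power, noise):
        posterior = p / max(n, floor)
        instantaneous = max(posterior - 1.0, 0.0)
        if prior is None:
            prior = instantaneous
        else:
            prior = beta * prior + (1 - beta) * instantaneous
        gains.append(prior / (1.0 + prior))
    return gains


# Illustrative input: a quiet background with a short burst of louder frames.
frame_power = [1.0] * 5 + [9.0] * 3 + [1.0] * 3
noise_power = estimate_noise_power(frame_power)
gains = noise_reduction_gain(frame_power, noise_power)
```

In a full system each gain would multiply the corresponding short-time Fourier transform coefficient, with one such track per frequency bin.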

First claim

Opening claim text (preview).

What is claimed is:

1. A speech signal enhancement method, comprising: performing noise reduction processing on a first speech signal according to a first time-frequency spectrum and a first power spectrum to obtain a second speech signal, wherein the first time-frequency spectrum is used to indicate a time domain feature and a frequency domain feature of the first speech signal, and the first power spectrum is a power spectrum of a noise signal in the first speech signal; determining a voiced signal in the second speech signal, and performing gain compensation on the voiced signal, wherein the voiced signal is a signal with a cepstral coefficient greater than or equal to a preset threshold in the second speech signal; and determining a damage compensation gain of the second speech signal according to the voiced signal on which the gain compensation has been performed, and performing gain compensation on the second speech signal based on the damage compensation gain; wherein the determining a voiced signal in the second speech signal, and performing gain compensation on the voiced signal comprises: performing homomorphic positive analysis processing on the second speech signal to obtain a target cepstral coefficient of the second speech signal; determining a maximum cepstral coefficient in the target cepstral coefficient, and determining a signal corresponding to the maximum cepstral coefficient in the second speech signal as the voiced signal; and performing gain amplification processing on the maximum cepstral coefficient, to perform gain compensation on the voiced signal.

2. The method according to claim 1, wherein before the performing noise reduction processing on a first speech signal according to a first time-frequency spectrum and a first power spectrum, the method further comprises: performing a short-time Fourier transform on the first speech signal to obtain the first time-frequency spectrum; determining a power spectrum of the first speech signal according to the first time-frequency spectrum, and determining a target power spectrum in the power spectrum of the first speech signal, wherein the target power spectrum is a power spectrum of a signal with a smallest power spectrum in signals within a preset time window; and performing recursive smoothing processing on the target power spectrum to obtain the first power spectrum.

3. The method according to claim 1, wherein the performing noise reduction processing on a first speech signal according to a first time-frequency spectrum and a first power spectrum comprises: determining a posterior signal-to-noise ratio corresponding to the first speech signal according to the first power spectrum and the power spectrum of the first speech signal, and performing recursive smoothing processing on the posterior signal-to-noise ratio to obtain a prior signal-to-noise ratio corresponding to the first speech signal; determining a target noise reduction gain according to the posterior signal-to-noise ratio and the prior signal-to-noise ratio; and performing noise reduction processing on the first speech signal according to the first time-frequency spectrum and the target noise reduction gain.

4. The method according to claim 1, wherein the determining a damage compensation gain of the second speech signal according to the voiced signal on which the gain compensation has been performed comprises: performing homomorphic inverse analysis processing on a first cepstral coefficient and the maximum cepstral coefficient on which the gain amplification processing has been performed, to obtain a first logarithmic time-frequency spectrum, wherein the first cepstral coefficient is a cepstral coefficient in the target cepstral coefficient other than the maximum cepstral coefficient; and determining a logarithmic time-frequency spectrum of the second speech signal according to a time-frequency spectrum of the second speech signal, and determining the damage compensation gain according to a difference between the first logarithmic time-frequency spectrum and the logarithmic time-frequency spectrum of the second speech signal.

5. The method according to claim 1, wherein the second speech signal is a signal obtained by performing noise reduction processing on a target frequency domain signal, and the target frequency domain signal is a signal obtained by performing a short-time Fourier transform on the first speech signal; and after the performing gain compensation on the second speech signal based on the damage compensation gain, the method further comprises: performing time-frequency inverse transform processing on the second speech signal on which the gain compensation has been performed, to obtain a target time domain signal, and outputting the target time domain signal.

6. A chip, comprising a processor and a communication interface, wherein the communication interface is coupled to the processor, and the processor is configured to run a program or an instruction to implement the speech signal enhancement method according to claim 1.

7. An electronic device, comprising a processor, a memory, and a program or an instruction stored in the memory and runnable on the processor, wherein the program or the instruction is executed by the processor to implement: performing noise reduction processing on a first speech signal according to a first time-frequency spectrum and a first power spectrum to obtain a second speech signal, wherein the first time-frequency spectrum is used to indicate a time domain feature and a frequency domain feature of the first speech signal, and the first power spectrum is a power spectrum of a noise signal in the first speech signal; determining a voiced signal in the second speech signal, and performing gain compensation on the voiced signal, wherein the voiced signal is a signal with a cepstral coefficient greater than or equal to a preset threshold in the second speech signal; and determining a damage compensation gain of the second speech signal according to the voiced signal on which the gain compensation has been performed, and performing gain compensation on the second speech signal based on the damage compensation gain; wherein the determining a voiced signal in the second speech signal, and performing gain compensation on the voiced signal comprises: performing homomorphic positive analysis processing on the second speech signal to obtain a target cepstral coefficient of the second speech signal; determining a maximum cepstral coefficient in the target cepstral coefficient, and determining a signal corresponding to the maximum cepstral coefficient in the second speech signal as the voiced signal; and performing gain amplification processing on the maximum cepstral coefficient, to perform gain compensation on the voiced signal.

8. The electronic device according to claim 7, wherein before the performing noise reduction processing on a first speech signal according to a first time-frequency spectrum and a first power spectrum, the method further comprises: performing a short-time Fourier transform on the first speech signal to obtain the first time-frequency spectrum; determining a power spectrum of the first speech signal according to the first time-frequency spectrum, and determining a target power spectrum in the power spectrum of the first speech signal, wherein the target power spectrum is a power spectrum of a signal with a smallest power spectrum in signals within a preset time window; and performing recursive smoothing processing on the target power spectrum to obtain the first power spectrum.

9. The electronic device according to claim 7, wherein the performing noise reduction processing on a first speech s
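The cepstral steps in claims 1 and 4 (forward homomorphic analysis, amplifying the largest cepstral coefficient, inverse analysis, and a gain taken from the difference of log spectra) can be sketched for one frame as follows. This is a rough illustration under stated assumptions, not the patented implementation: the function names, the naive DFT, the quefrency search range `q_min` to `q_max`, the `boost` factor, and the `threshold` value are all hypothetical, and the sketch works on a full symmetric magnitude spectrum rather than a real device's framing.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for a small demo."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def damage_compensation_gain(mag, q_min, q_max, boost=2.0, threshold=0.1):
    """Return (peak quefrency index, per-bin gain) for one frame.

    mag: full-length symmetric magnitude spectrum of the denoised frame.
    The frame is treated as voiced only if the largest cepstral
    coefficient in [q_min, q_max] reaches `threshold` (assumed values).
    """
    n = len(mag)
    log_spec = [math.log(max(m, 1e-12)) for m in mag]
    # Forward homomorphic analysis: cepstrum = inverse DFT of the log spectrum.
    ceps = [c.real for c in dft(log_spec, inverse=True)]
    peak = max(range(q_min, q_max + 1), key=lambda q: ceps[q])
    if ceps[peak] < threshold:
        return peak, [1.0] * n  # not voiced: leave the frame unchanged
    boosted = list(ceps)
    boosted[peak] *= boost
    if n - peak != peak:
        boosted[n - peak] *= boost  # mirrored coefficient keeps the spectrum real
    # Inverse homomorphic analysis: back to a log spectrum.
    new_log = [v.real for v in dft(boosted)]
    # Gain from the difference between the boosted and original log spectra.
    return peak, [math.exp(a - b) for a, b in zip(new_log, log_spec)]


# Illustrative frame: a log spectrum with a harmonic ripple at quefrency 4.
N = 32
voiced_mag = [math.exp(1.0 + 0.5 * math.cos(2 * math.pi * 4 * k / N)) for k in range(N)]
peak, gain = damage_compensation_gain(voiced_mag, q_min=2, q_max=16)
```

Multiplying each bin of the denoised spectrum by `gain` deepens the harmonic ripple, which is one way to read "gain compensation on the second speech signal based on the damage compensation gain".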

Assignees

Vivo Mobile Communication Co Ltd

Inventors

Yang Hongbo

Classifications

G10L25/21
the extracted parameters being power information · CPC title
G10L25/93
Discriminating between voiced and unvoiced parts of speech signals (G10L25/90 takes precedence) · CPC title
G10L25/24
the extracted parameters being the cepstrum · CPC title
G10L21/0232 · Primary
Processing in the frequency domain · CPC title
G10L21/0224 · Primary
Processing in the time domain · CPC title

Patent family

Related publications grouped by family.

View patent family 77128304

Related patents

Audio processing device and a method for estimating a signal-to-noise-ratio of a sound signal

Speech Enhancement Method and Apparatus

Acoustic signal processing device, acoustic signal processing method, and hands-free communication device

Noise reduction system and method for audio device with multiple microphones

Method and display device for recognizing voice

Speech Intelligibility

Formant Dependent Speech Signal Enhancement
