Method for processing voice in interior environment of vehicle and electronic device using noise data based on input signal to noise ratio

US11017799B2 · US · B2

Patent metadata
Publication number: US-11017799-B2
Application number: US-201816160277-A
Country: US
Kind code: B2
Filing date: Oct 15, 2018
Priority date: Dec 30, 2017
Publication date: May 25, 2021
Grant date: May 25, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure discloses a method for processing a voice in interior environment of a vehicle, an electronic device and a storage medium. The method includes the following. A reference audio is acquired, and the reference audio is recorded to obtain a recorded audio. A pure voice is acquired. Noise data for each part or period of the recorded audio satisfying a target signal-to-noise ratio condition pertaining to that part is selected from the recorded audio, and the noise data is superimposed to the pure data to obtain a noisy voice. The noisy voice and the reference audio are inputted to an acoustic echo canceller (AEC) module as inputted data. The AEC module is configured to perform an echo cancellation operation on the inputted data to obtain training data having AEC residual noise.
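
The superimposing step in the abstract can be illustrated with a minimal Python sketch. It scales a noise segment so that the mix reaches a chosen signal-to-noise ratio, then adds it to the pure voice. The names `rms` and `mix_at_snr`, the RMS-based level measure, and the gain-scaling approach are assumptions made for illustration; the patent selects noise segments from the recorded audio and does not prescribe this formula. The AEC step is not shown.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(pure_voice, noise, target_snr_db):
    """Add `noise` to `pure_voice`, scaled so the mix has the target SNR.

    Hypothetical helper: SNR is taken as 20*log10(rms_voice / rms_noise).
    """
    if len(noise) < len(pure_voice):
        raise ValueError("noise segment must be at least as long as the voice")
    noise = noise[:len(pure_voice)]
    noise_rms = rms(noise)
    if noise_rms == 0:
        raise ValueError("noise segment is silent")
    # Solve the SNR definition for the noise level we want, then derive a gain.
    desired_noise_rms = rms(pure_voice) / (10 ** (target_snr_db / 20))
    gain = desired_noise_rms / noise_rms
    return [v + gain * n for v, n in zip(pure_voice, noise)]


# Synthetic stand-ins for a pure voice and a recorded noise segment.
voice = [math.sin(0.1 * i) for i in range(1000)]
noise = [math.cos(0.37 * i) for i in range(1000)]
noisy_voice = mix_at_snr(voice, noise, target_snr_db=10.0)
```

In the pipeline the abstract describes, a signal like `noisy_voice` and the matching reference audio would then be passed to an AEC module to produce training data containing AEC residual noise.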

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing a voice in interior environment of a vehicle, comprising: acquiring a piece of reference audio, wherein the reference audio comprises music, radio broadcast or text to speech broadcast, obtaining a piece of recorded reference audio by recording the piece of reference audio, wherein the piece of recorded reference audio at least comprises a first part satisfying a first signal-to-noise ratio condition and a second part satisfying a second signal-to-noise ratio condition; acquiring a piece of pure voice; obtaining noise data by determining a first decibel number for the first part and a second decibel number of the second part based on a signal-to-noise ratio distribution of the recorded reference audio, wherein the signal-to-noise ratio distribution of the recorded reference audio is obtained by: obtaining pieces of recorded sample audio by recording a piece of sample reference audio in different in-vehicle scenarios, and obtaining a signal-to-noise ratio distribution of the recorded reference audio based on signal-to-noise ratios of the pieces of recorded sample audios; superimposing the noise data to the pure voice as a noisy voice; and inputting the noisy voice and the reference audio to an acoustic echo canceller (AEC) module as inputted data, wherein the AEC module is configured to perform an echo cancellation operation on the inputted data to obtain training data having AEC residual noise.

2. The method according to claim 1, wherein the signal-to-noise ratio distribution is obtained by: acquiring noise decibels when obtaining the pieces of recorded sample audio by recording the sample reference audio in different in-vehicle scenarios; obtaining a volume range of normally speaking by a sample user in different in-vehicle scenarios by performing a statistic; and calculating the signal-to-noise ratio distribution according to the noise decibels and the volume range.

3. The method according to claim 2, wherein calculating the signal-to-noise ratio distribution according to the noise decibels and the volume range comprises: for each in vehicle scenario, calculating a difference between the noise decibel when obtaining a piece of recorded sample audios by recording the sample reference audio and a volume value of normally speaking by the sample user; determining the difference as a signal-to-noise ratio in the in-vehicle scenario; and performing a statistic on signal-to-noise ratios obtained in the in-vehicle scenarios to obtain the signal-to-noise ratio distribution.

4. The method according to claim 1, before superimposing the noise data to the pure voice as the noisy voice, further comprising: superimposing an in-vehicle impulse response and vehicle's noise to the pure voice in turn to obtain a first voice; wherein superimposing the noise data to the pure voice as the noisy voice comprises: superimposing the noise data to the first voice to obtain the noisy voice.

5. The method according to claim 1, before inputting the noisy voice and the reference audio to the AEC module as the inputted data, further comprising: acquiring a target time corresponding to a starting time of the noise data from time information of the recorded reference audio; and selecting a partial reference audio having the same time duration with the noise data from the reference audio according to the target time, wherein inputting the noisy voice and the reference audio to the AEC module as the inputted data comprises: inputting the noisy voice and the partial reference audio to the AEC module as the inputted data.

6. The method according to claim 1, further comprising: updating in real time a training model of voice recognition in the interior environment of the vehicle according to the training data having the AEC residual noise; and recognizing voices presented in the vehicle according to the real-time updated training model of voice recognition in the interior environment of the vehicle.

7. An electronic device, comprising a memory, a processor and computer programs stored on the memory and executable by the processor, wherein when the computer programs are executed by the processor, the processor is configured to: acquire a piece of reference audio, wherein the reference audio comprises music, radio broadcast and text to speech broadcast, obtain a piece of recorded reference audio by recording the piece of reference audio, wherein the piece of recorded reference audio at least comprises a first part satisfying a first signal-to-noise ratio conditions and a second part satisfying a second signal-to-noise ratio condition; acquire a piece of pure voice; obtain noise data by determining a first decibel number for the first part and a second decibel number of the second part based on a signal-to-noise ratio distribution of the recorded reference audio, wherein the signal-to-noise ratio distribution of the recorded reference audio is obtained by: obtaining pieces of recorded sample audio by recording a piece of sample reference audio in different in-vehicle scenarios, and obtaining a signal-to-noise ratio distribution of the recorded reference audio based on signal-to-noise ratios of the pieces of recorded sample audios; superimpose the noise data to the pure voice as a noisy voice; and input the noisy voice and the reference audio to an acoustic echo canceller (AEC) module as inputted data, wherein the AEC module is configured to perform an echo cancellation operation on the inputted data to obtain training data having AEC residual noise.

8. The electronic device according to claim 7, wherein the signal-to-noise ratio distribution is obtained by: acquiring noise decibels when obtaining the pieces of recorded sample audio by recording the sample reference audio in different in-vehicle scenarios; obtaining a volume range of normally speaking by a sample user in different in-vehicle scenarios by performing a statistic; and calculating the signal-to-noise ratio distribution according to the noise decibels and the volume range.

9. The electronic device according to claim 8, wherein the processor is configured to calculate the signal-to-noise ratio distribution according to the noise decibels and the volume range by acts of: for each in-vehicle scenario, calculating a difference between the noise decibel when obtaining a piece of recorded sample audios by recording the sample reference audio and a volume value of normally speaking by the sample user; determining the difference as a signal-to-noise ratio in the in-vehicle scenario; and performing a statistic on signal-to-noise ratios obtained in the in-vehicle scenarios to obtain the signal-to-noise ratio distribution.

10. The electronic device according to claim 7, wherein the processor is further configured to, before the noise data is superimposed to the pure voice as the noisy voice: superimpose an in-vehicle impulse response and vehicle's noise to the pure voice in turn to obtain a first voice; wherein the processor is configured to superimpose the noise data to the pure voice as the noisy voice by acts of: superimposing the noise data to the first voice to obtain the noisy voice.

11. The electronic device according to claim 7, wherein the processor is further configured to, before the noisy voice and the reference audio are inputted to the AEC module as the inputted data, acquire a target time corresponding to a starting time of the noise data from time information of the recorded reference audio; and select a partial reference audio having the same time duration with the noise data from the reference audio according to the target time, wherein the processor is configure
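
As an illustration of the calculation recited in claims 2 and 3, the following Python sketch derives one signal-to-noise ratio per in-vehicle scenario as a decibel difference and then summarizes the values as a binned distribution. The scenario names, decibel values, function names, and 5 dB bin width are hypothetical. The sign convention (speaking volume minus noise level) is also an assumption, since the claim only recites "a difference".

```python
from collections import Counter


def snr_per_scenario(noise_db, speech_db):
    """Per-scenario SNR in dB, taken as speaking volume minus noise level."""
    return {name: speech_db[name] - noise_db[name] for name in noise_db}


def snr_distribution(snrs, bin_width_db=5):
    """Fraction of scenarios per SNR bin, keyed by the bin's lower edge in dB."""
    counts = Counter(
        int(snr // bin_width_db) * bin_width_db for snr in snrs.values()
    )
    total = sum(counts.values())
    return {edge: counts[edge] / total for edge in sorted(counts)}


# Hypothetical measurements; none of these numbers come from the patent.
noise_db = {"parked": 45.0, "city": 60.0, "highway": 70.0, "windows_open": 72.0}
speech_db = {"parked": 65.0, "city": 68.0, "highway": 72.0, "windows_open": 73.0}

snrs = snr_per_scenario(noise_db, speech_db)
distribution = snr_distribution(snrs)
print(distribution)  # {0: 0.5, 5: 0.25, 20: 0.25}
```

A distribution of this kind could then guide how many noise segments are drawn at each SNR level when building training data.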

Assignees

Beijing Baidu Netcom Sci & Tec

Inventors

Classifications

  • G10L15/063 (Primary)

    Training · CPC title

  • Noise filtering · CPC title

  • the noise being echo, reverberation of the speech · CPC title

  • for discriminating voice from noise · CPC title

  • G10L25/03 (Primary)

    characterised by the type of extracted parameters · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11017799B2 cover?
The present disclosure discloses a method for processing a voice in interior environment of a vehicle, an electronic device and a storage medium. The method includes the following. A reference audio is acquired, and the reference audio is recorded to obtain a recorded audio. A pure voice is acquired. Noise data for each part or period of the recorded audio satisfying a target signal-to-noise ra…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tec
What technology area does this patent fall under?
Primary CPC classification G10L15/063. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 25, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).