Target sound enhancement device, noise estimation parameter learning device, target sound enhancement method, noise estimation parameter learning method, and program
US-2020388298-A1 · Dec 10, 2020 · US
US2022052751A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022052751-A1 |
| Application number | US-202117496566-A |
| Country | US |
| Kind code | A1 |
| Filing date | Oct 7, 2021 |
| Priority date | Apr 10, 2019 |
| Publication date | Feb 17, 2022 |
| Grant date | — |
Official abstract text for this publication.
The disclosure relates to an audio processing apparatus for localizing an audio source. The audio processing apparatus comprises a plurality of audio sensors, including a primary audio sensor and at least two secondary audio sensors, configured to detect an audio signal from a target audio source, wherein the primary audio sensor defines at least two pairs of audio sensors with the at least two secondary audio sensors; and processing circuitry configured to: determine for each pair of audio sensors a first set of likelihoods of spatial directions of the target audio source using a first localization scheme; determine a second set of likelihoods of spatial directions of the target audio source using a second localization scheme; and determine a third set of likelihoods of spatial directions of the target audio source on the basis of the first sets of likelihoods and the second set of likelihoods.
Opening claim text (preview).
What is claimed is:

1. An audio processing apparatus, comprising: a plurality of audio sensors, including a primary audio sensor and at least two secondary audio sensors, configured to detect an audio signal from a target audio source, wherein the primary audio sensor defines at least two pairs of audio sensors with the at least two secondary audio sensors; and processing circuitry configured to: determine for each of the at least two pairs of audio sensors a first set of likelihoods of spatial directions of the target audio source using a first localization scheme; determine a second set of likelihoods of spatial directions of the target audio source using a second localization scheme; and determine a third set of likelihoods of spatial directions of the target audio source on the basis of the first sets of likelihoods of spatial directions and the second set of likelihoods of spatial directions.

2. The audio processing apparatus of claim 1, wherein the processing circuitry is further configured to determine a current spatial direction of the target audio source on the basis of the third set of likelihoods by determining the most likely spatial direction defined by the third set of likelihoods of spatial directions of the target audio source.

3. The audio processing apparatus of claim 1, wherein the plurality of audio sensors are further configured to detect a further audio signal from at least one further audio source and wherein the processing circuitry is configured to separate the audio signal of the target audio source from the further audio signal of the further audio source using a blind source separation scheme.

4. The audio processing apparatus of claim 3, wherein the processing circuitry is configured to separate the audio signal of the target audio source from the further audio signal of the further audio source using a geometrically constrained triple-n independent component analysis for convolutive mixtures, GC-TRINICON, scheme based on a geometric constraint, wherein the processing circuitry is configured to determine the geometric constraint on the basis of the first sets of likelihoods and the second set of likelihoods and/or the current spatial direction of the target audio source.

5. The audio processing apparatus of claim 3, wherein the processing circuitry is further configured to apply a post filter to the audio signal of the target audio source separated from the further audio signal of the further audio source, wherein the post filter is a coherent-to-diffuse power ratio based post filter based on a target coherence model and/or a noise coherence model wherein the processing circuitry is configured to determine the target coherence model and/or the noise coherence model on the basis of the first sets of likelihoods and the second set of likelihoods and/or the current spatial direction of the target audio source.

6. The audio processing apparatus of claim 1, wherein the first localization scheme is a localization scheme based on a geometrically constrained triple-n independent component analysis for convolutive mixtures, GC-TRINICON, scheme.

7. The audio processing apparatus of claim 1, wherein the second localization scheme is a steered-response power phase transform, SRP-PHAT, scheme.

8. The audio processing apparatus of claim 1, wherein for determining the third set of likelihoods the processing circuitry is configured to determine for each of the at least two pairs of audio sensors a set of similarity weights on the basis of the first set of likelihoods of the respective pair of audio sensors and the second set of likelihoods, wherein each similarity weight represents a similarity measure value between the respective first set of likelihoods and the second set of likelihoods in a respective spatial direction and neighbouring spatial directions thereof.

9. The audio processing apparatus of claim 8, wherein the processing circuitry is configured to determine for a respective pair of audio sensors the respective similarity measure value between the respective first set of likelihoods and the second set of likelihoods in a respective spatial direction and neighbouring spatial directions thereof using a spatial filter centered on the respective spatial direction.

10. The audio processing apparatus of claim 8, wherein for determining the third set of likelihoods the processing circuitry is further configured for each of the at least two pairs of audio sensors to weight the likelihoods of the respective first set of likelihoods with the respective set of similarity weights for obtaining a respective first set of weighted likelihoods.

11. The audio processing apparatus of claim 10, wherein for determining the third set of likelihoods the processing circuitry is further configured to combine the first sets of weighted likelihoods of all of the at least two pairs of audio sensors.

12. The audio processing apparatus of claim 11, wherein the processing circuitry is configured to combine the first sets of weighted likelihoods of all of the at least two pairs of audio sensors by determining a sum of the first sets of weighted likelihoods of all of the at least two pairs of audio sensors or a product of the first sets of weighted likelihoods of all of the at least two pairs of audio sensors.

13. The audio processing apparatus of claim 1, wherein the processing circuitry is configured to determine for each of the at least two pairs of audio sensors the first set of likelihoods as a first direction-of-arrival, DOA, likelihood vector having a plurality of components and the second set of likelihoods as a second DOA likelihood vector having a plurality of components, wherein the components of the first DOA likelihood vector are defined by the respective value of an averaged directivity pattern, ADP, localization function at a plurality of sampled directions and wherein the components of the second DOA likelihood vector are defined by the respective value of a further localization function at the plurality of sampled directions.

14. An audio processing method, comprising: detecting an audio signal from a target audio source by a plurality of audio sensors, including a primary audio sensor and at least two secondary audio sensors, wherein the primary audio sensor defines at least two pairs of audio sensors with the at least two secondary audio sensors; determining for each of the at least two pairs of audio sensors a first set of likelihoods of spatial directions of the target audio source using a first localization scheme; determining a second set of likelihoods of spatial directions of the target audio source using a second localization scheme; and determining a third set of likelihoods of spatial directions of the target audio source on the basis of the first sets of likelihoods and the second set of likelihoods.

15. A non-transitory computer-readable storage medium storing program code which causes a computer or a processor to perform the method of claim 14 when the program code is executed by the computer or the processor.
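The fusion recited in claims 2 and 8 to 12 can be pictured with a short NumPy sketch. It is an illustration under stated assumptions, not the method of the specification: the claims only require "a similarity measure" and "a spatial filter centered on the respective spatial direction", so the triangular window, the normalized-correlation similarity, the edge handling, the function names `similarity_weights` and `fuse`, and the synthetic likelihood vectors are all hypothetical choices made here.

```python
import numpy as np


def similarity_weights(first, second, half_width=2):
    """Similarity between two DOA likelihood vectors around each direction.

    Assumption: similarity is a normalized correlation of the two vectors
    after applying a triangular spatial filter centered on the direction.
    """
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    n = first.size
    offsets = np.arange(-half_width, half_width + 1)
    taps = 1.0 - np.abs(offsets) / (half_width + 1)  # triangular window
    weights = np.empty(n)
    for k in range(n):
        # Neighbouring directions; end samples are repeated at the edges.
        idx = np.clip(k + offsets, 0, n - 1)
        a = taps * first[idx]
        b = taps * second[idx]
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        weights[k] = (a @ b) / denom if denom > 0 else 0.0
    return weights


def fuse(first_sets, second, combine="sum", half_width=2):
    """Weight each pair's likelihoods by its similarity weights, then combine."""
    weighted = np.stack([
        similarity_weights(f, second, half_width) * np.asarray(f, dtype=float)
        for f in first_sets
    ])
    if combine == "sum":
        return weighted.sum(axis=0)
    if combine == "product":
        return weighted.prod(axis=0)
    raise ValueError("combine must be 'sum' or 'product'")


# Synthetic example: 37 sampled directions from 0 to 180 degrees.
directions = np.linspace(0.0, 180.0, 37)


def bump(center, width=10.0):
    """Hypothetical likelihood lobe centered on `center` degrees."""
    return np.exp(-0.5 * ((directions - center) / width) ** 2)


# Each pair sees the true source at 60 degrees plus a spurious lobe.
pair_a = bump(60) + 0.8 * bump(140)
pair_b = bump(60) + 0.8 * bump(20)
second = bump(60)  # second scheme: a single lobe at 60 degrees

third = fuse([pair_a, pair_b], second, combine="sum")
best = directions[np.argmax(third)]  # most likely direction (claim 2)
```

In this toy setup the spurious lobes at 20 and 140 degrees receive small similarity weights because the second likelihood vector does not support them, so the combined vector keeps its maximum at the direction both schemes agree on. Passing `combine="product"` gives the alternative combination named in claim 12.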
using weights depending on external parameters, e.g. direction of arrival [DOA], predetermined weights or beamforming · CPC title
using best eigenmode, e.g. beam forming or beam steering · CPC title
using multiple eigenmodes · CPC title
Direction finding using differential microphone array [DMA] · CPC title
Circuits for combining signals of a plurality of transducers · CPC title