Audio processing apparatus and method for localizing an audio source

US2022052751A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022052751-A1
Application number  US-202117496566-A
Country             US
Kind code           A1
Filing date         Oct 7, 2021
Priority date       Apr 10, 2019
Publication date    Feb 17, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure relates to an audio processing apparatus for localizing an audio source. The audio processing apparatus comprises a plurality of audio sensors, including a primary audio sensor and at least two secondary audio sensors, configured to detect an audio signal from a target audio source, wherein the primary audio sensor defines at least two pairs of audio sensors with the at least two secondary audio sensors; and processing circuitry configured to: determine for each pair of audio sensors a first set of likelihoods of spatial directions of the target audio source using a first localization scheme; determine a second set of likelihoods of spatial directions of the target audio source using a second localization scheme; and determine a third set of likelihoods of spatial directions of the target audio source on the basis of the first sets of likelihoods and the second set of likelihoods.
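
The pairing rule in the abstract can be made concrete with a small sketch. The sensor labels, the 5-degree direction grid, and the function names below are illustrative assumptions and do not come from the publication. The sketch only shows which sensor pairs exist and how many likelihood sets each localization scheme contributes.

```python
from typing import Dict, List, Tuple


def sensor_pairs(primary: str, secondaries: List[str]) -> List[Tuple[str, str]]:
    """Pair the primary sensor with every secondary sensor."""
    if len(secondaries) < 2:
        raise ValueError("at least two secondary sensors are required")
    return [(primary, secondary) for secondary in secondaries]


def direction_grid(step_deg: int = 5) -> List[int]:
    """Sampled candidate directions in degrees (assumed grid: 0..180 inclusive)."""
    return list(range(0, 181, step_deg))


# Hypothetical four-microphone array: one primary, three secondary sensors.
pairs = sensor_pairs("mic0", ["mic1", "mic2", "mic3"])
grid = direction_grid()

# One first set of likelihoods per pair, a single second set, and a single
# fused third set; each set holds one value per sampled direction.
first_sets: Dict[Tuple[str, str], List[float]] = {
    pair: [0.0] * len(grid) for pair in pairs
}
second_set: List[float] = [0.0] * len(grid)
third_set: List[float] = [0.0] * len(grid)
```

With N secondary sensors this layout yields N first sets but only one second set, which matches the plural "first sets" and singular "second set" wording of the abstract.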

First claim

Opening claim text (preview).

What is claimed is:

1. An audio processing apparatus, comprising: a plurality of audio sensors, including a primary audio sensor and at least two secondary audio sensors, configured to detect an audio signal from a target audio source, wherein the primary audio sensor defines at least two pairs of audio sensors with the at least two secondary audio sensors; and processing circuitry configured to: determine for each of the at least two pairs of audio sensors a first set of likelihoods of spatial directions of the target audio source using a first localization scheme; determine a second set of likelihoods of spatial directions of the target audio source using a second localization scheme; and determine a third set of likelihoods of spatial directions of the target audio source on the basis of the first sets of likelihoods of spatial directions and the second set of likelihoods of spatial directions.

2. The audio processing apparatus of claim 1, wherein the processing circuitry is further configured to determine a current spatial direction of the target audio source on the basis of the third set of likelihoods by determining the most likely spatial direction defined by the third set of likelihoods of spatial directions of the target audio source.

3. The audio processing apparatus of claim 1, wherein the plurality of audio sensors are further configured to detect a further audio signal from at least one further audio source and wherein the processing circuitry is configured to separate the audio signal of the target audio source from the further audio signal of the further audio source using a blind source separation scheme.

4. The audio processing apparatus of claim 3, wherein the processing circuitry is configured to separate the audio signal of the target audio source from the further audio signal of the further audio source using a geometrically constrained triple-n independent component analysis for convolutive mixtures, GC-TRINICON, scheme based on a geometric constraint, wherein the processing circuitry is configured to determine the geometric constraint on the basis of the first sets of likelihoods and the second set of likelihoods and/or the current spatial direction of the target audio source.

5. The audio processing apparatus of claim 3, wherein the processing circuitry is further configured to apply a post filter to the audio signal of the target audio source separated from the further audio signal of the further audio source, wherein the post filter is a coherent-to-diffuse power ratio based post filter based on a target coherence model and/or a noise coherence model wherein the processing circuitry is configured to determine the target coherence model and/or the noise coherence model on the basis of the first sets of likelihoods and the second set of likelihoods and/or the current spatial direction of the target audio source.

6. The audio processing apparatus of claim 1, wherein the first localization scheme is a localization scheme based on a geometrically constrained triple-n independent component analysis for convolutive mixtures, GC-TRINICON, scheme.

7. The audio processing apparatus of claim 1, wherein the second localization scheme is a steered-response power phase transform, SRP-PHAT, scheme.

8. The audio processing apparatus of claim 1, wherein for determining the third set of likelihoods the processing circuitry is configured to determine for each of the at least two pairs of audio sensors a set of similarity weights on the basis of the first set of likelihoods of the respective pair of audio sensors and the second set of likelihoods, wherein each similarity weight represents a similarity measure value between the respective first set of likelihoods and the second set of likelihoods in a respective spatial direction and neighbouring spatial directions thereof.

9. The audio processing apparatus of claim 8, wherein the processing circuitry is configured to determine for a respective pair of audio sensors the respective similarity measure value between the respective first set of likelihoods and the second set of likelihoods in a respective spatial direction and neighbouring spatial directions thereof using a spatial filter centered on the respective spatial direction.

10. The audio processing apparatus of claim 8, wherein for determining the third set of likelihoods the processing circuitry is further configured for each of the at least two pairs of audio sensors to weight the likelihoods of the respective first set of likelihoods with the respective set of similarity weights for obtaining a respective first set of weighted likelihoods.

11. The audio processing apparatus of claim 10, wherein for determining the third set of likelihoods the processing circuitry is further configured to combine the first sets of weighted likelihoods of all of the at least two pairs of audio sensors.

12. The audio processing apparatus of claim 11, wherein the processing circuitry is configured to combine the first sets of weighted likelihoods of all of the at least two pairs of audio sensors by determining a sum of the first sets of weighted likelihoods of all of the at least two pairs of audio sensors or a product of the first sets of weighted likelihoods of all of the at least two pairs of audio sensors.

13. The audio processing apparatus of claim 1, wherein the processing circuitry is configured to determine for each of the at least two pairs of audio sensors the first set of likelihoods as a first direction-of-arrival, DOA, likelihood vector having a plurality of components and the second set of likelihoods as a second DOA likelihood vector having a plurality of components, wherein the components of the first DOA likelihood vector are defined by the respective value of an averaged directivity pattern, ADP, localization function at a plurality of sampled directions and wherein the components of the second DOA likelihood vector are defined by the respective value of a further localization function at the plurality of sampled directions.

14. An audio processing method, comprising: detecting an audio signal from a target audio source by a plurality of audio sensors, including a primary audio sensor and at least two secondary audio sensors, wherein the primary audio sensor defines at least two pairs of audio sensors with the at least two secondary audio sensors; determining for each of the at least two pairs of audio sensors a first set of likelihoods of spatial directions of the target audio source using a first localization scheme; determining a second set of likelihoods of spatial directions of the target audio source using a second localization scheme; and determining a third set of likelihoods of spatial directions of the target audio source on the basis of the first sets of likelihoods and the second set of likelihoods.

15. A non-transitory computer-readable storage medium storing program code which causes a computer or a processor to perform the method of claim 14 when the program code is executed by the computer or the processor.
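
The fusion steps in claims 2 and 8 to 12 can be sketched as follows. The claims do not fix the similarity measure or the shape of the spatial filter, so the windowed product and the three-tap window `(0.25, 0.5, 0.25)` used here are assumptions chosen for illustration, as are all function names and the toy likelihood values. This is one possible reading of the claimed steps, not the applicant's implementation.

```python
from math import prod
from typing import List, Sequence


def similarity_weights(first: Sequence[float], second: Sequence[float],
                       window: Sequence[float] = (0.25, 0.5, 0.25)) -> List[float]:
    """Assumed similarity measure: product of both likelihood vectors,
    smoothed by a spatial filter centered on each direction (claims 8, 9)."""
    if len(first) != len(second):
        raise ValueError("likelihood vectors must share one direction grid")
    half = len(window) // 2
    n = len(first)
    weights = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(window):
            j = i + k - half
            if 0 <= j < n:  # neighbours outside the grid contribute nothing
                acc += w * first[j] * second[j]
        weights.append(acc)
    return weights


def fuse(first_sets: Sequence[Sequence[float]], second: Sequence[float],
         combine: str = "sum") -> List[float]:
    """Weight each pair's first set (claim 10), then combine the weighted
    sets across pairs by sum or product (claims 11, 12)."""
    weighted = []
    for first in first_sets:
        weights = similarity_weights(first, second)
        weighted.append([w * p for w, p in zip(weights, first)])
    if combine == "sum":
        return [sum(column) for column in zip(*weighted)]
    if combine == "product":
        return [prod(column) for column in zip(*weighted)]
    raise ValueError("combine must be 'sum' or 'product'")


def most_likely_direction(third: Sequence[float], grid: Sequence[float]) -> float:
    """Direction with the largest fused likelihood (claim 2)."""
    return grid[max(range(len(third)), key=third.__getitem__)]


# Toy data on a coarse grid; the second pair has a spurious peak at 0 degrees.
grid = [0, 45, 90, 135, 180]
first_a = [0.1, 0.2, 0.9, 0.3, 0.1]
first_b = [0.8, 0.1, 0.7, 0.2, 0.1]
second = [0.1, 0.2, 0.8, 0.2, 0.1]

third = fuse([first_a, first_b], second, combine="sum")
print(most_likely_direction(third, grid))  # prints 90
```

In this toy case the second pair on its own would point to 0 degrees, but the second set of likelihoods is small there, so the similarity weight suppresses that peak and the fused result settles on 90 degrees. Choosing `combine="product"` gives the same direction for this data while penalizing more strongly any direction that a single pair rates low.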

Assignees

Inventors

Classifications

  • H04B7/086 (Primary)

    using weights depending on external parameters, e.g. direction of arrival [DOA], predetermined weights or beamforming · CPC title

  • using best eigenmode, e.g. beam forming or beam steering · CPC title

  • using multiple eigenmodes · CPC title

  • Direction finding using differential microphone array [DMA] · CPC title

  • Circuits for combining signals of a plurality of transducers · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022052751A1 cover?
The disclosure relates to an audio processing apparatus for localizing an audio source. The audio processing apparatus comprises a plurality of audio sensors, including a primary audio sensor and at least two secondary audio sensors, configured to detect an audio signal from a target audio source, wherein the primary audio sensor defines at least two pairs of audio sensors with the at least two…
Who is the assignee on this patent?
Huawei Tech Co Ltd, Univ Friedrich Alexander Er
What technology area does this patent fall under?
Primary CPC classification H04B7/086. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Feb 17, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).