Coding of a sound field signal
US-9502046-B2 · Nov 22, 2016 · US
US2016302005A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016302005-A1 |
| Application number | US-201615091315-A |
| Country | US |
| Kind code | A1 |
| Filing date | Apr 5, 2016 |
| Priority date | Apr 10, 2015 |
| Publication date | Oct 13, 2016 |
| Grant date | — |
Official abstract text for this publication.
A method and apparatus are provided for processing data for estimating mixing parameters of at least one audio spot signal captured by a sound recording device, called a spot microphone, arranged in the vicinity of a source among a plurality of acoustic sources constituting a sound scene, and a primary audio signal captured by an ambisonic sound recording device, arranged to capture said plurality of acoustic sources of the sound scene.
Opening claim text (preview).
1. A method comprising: processing data for the estimation of mixing parameters of at least one spot audio signal captured by a sound recording device, a so-called spot microphone, arranged in the vicinity of a source among a plurality of acoustic sources constituting a sound scene, and a primary audio signal captured by a sound recording device arranged to capture said plurality of acoustic sources of the sound scene, said primary audio signal being encoded in a format called "ambisonic", comprising at least one omnidirectional component (W) and three bidirectional components (X, Y, Z) projected along orthogonal axes of a referential of the primary microphone, wherein said processing comprises the following acts, implemented for a frame of the primary audio signal and a frame of said spot signal, a frame comprising at least one block of N samples:

estimating (E2) a delay between the omnidirectional component of the frame of the primary audio signal and the frame of said spot signal, from at least one block of N samples of one of the two frames, the so-called block of reference (BRef_I), associated with a predetermined moment of acquisition (T_I), and an area of the other frame, the so-called observation area (ZObs_i), including at least one block of N samples and formed in proximity of the moment of acquisition, by maximizing a measurement of similarity between the block of reference and a block of the observation area, the so-called block of observation (BObs_i), temporally offset by the delay (τ) in relation to the block of reference; and

estimating (E3) at least one angular position of the source captured by said spot microphone in a referential of the primary microphone by calculation of a ratio between a first scalar product of a block of the audio spot signal associated with the predetermined moment of acquisition and a first component of the block of the primary audio signal temporally offset by the estimated delay (τ), and a second scalar product of the same block of the audio spot signal and the block of a second component of the primary audio signal temporally offset by the estimated delay (τ).

2. The method according to claim 1, wherein, the block of reference (BRef_i) being chosen in the audio spot signal, the stage of estimating the delay comprises a calculation of a similarity measurement at least for the block of reference (BRef_I), from a normalized cross-correlation function (C_i) which is expressed in the following way:

$$C_i(\tau) = \frac{\langle a_n \mid W \rangle_{-\tau}}{\|a_n\| \cdot \|W\|_{-\tau}}$$

with $W(t)$ the omnidirectional component of the ambisonic signal, $a_n(t)$ the spot signal, $\langle x \mid y \rangle_{-\tau} = {}_{0}\langle x \mid y \rangle_{-\tau}$ the scalar product between the two finite-support signals temporally offset by $-\tau$, in the observation area associated with the block of reference (BRef_I), and $\|x\|_{\tau} = \sqrt{{}_{\tau}\langle x \mid x \rangle_{\tau}}$ the norm of a discrete finite-support signal; and in that the delay (τ) is estimated from the maximum value of the similarity measurement calculated:

$$\tilde{\tau} = \operatorname{Argmax}_{\tau} \, C_i(\tau)$$

3. The method according to claim 2, wherein the stage of estimating the delay also comprises a temporal smoothing of the similarity measurement calculated for the current block of reference (BRef_i), taking into account the similarity measurement calculated for at least one previous block of reference (BRef_{I−1}).

4. The method according to claim 1, wherein the estimation of an angular position of the captured source comprises the estimation of an azimuth angle ($\tilde{\theta}_n$) from a ratio between the scalar product of the signal of the block of reference associated with the predetermined moment of acquisition with the block of component Y of the primary audio signal offset by the estimated delay, and the scalar product of the signal of the block of reference associated with the predetermined moment of acquisition with the block of component X of the primary audio signal offset by the estimated delay.

5. The method according to claim 4, wherein the azimuth angle is estimated from the following equation:

$$\tilde{\theta}_n = \operatorname{atan2}\left(\langle a_n \mid Y \rangle_{-\tau},\ \langle a_n \mid X \rangle_{-\tau}\right)$$

6. The method according to claim 1, wherein the estimation of an angular position comprises the estimation of an elevation angle from a ratio between the scalar product of the block of reference of the audio spot signal associated with the moment of acquisition with the block of component Z of the primary audio signal offset by the estimated delay (τ), and the scalar product of the block of the audio spot signal associated with the moment of acquisition with the block of the omnidirectional component of the primary signal offset by the estimated delay (τ).

7. The method according to claim 6, wherein the angle of elevation ($\tilde{\varphi}_n$) is estimated from the following equation:

$$\tilde{\varphi}_n = \arcsin\left(\frac{\langle a_n \mid Z \rangle_{-\tilde{\tau}}}{\eta \cdot \langle a_n \mid W \rangle_{-\tilde{\tau}}}\right)$$
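The estimation steps recited in claims 2 to 7 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The function names, the non-negative lag search range, the exponential form of the temporal smoothing, and the treatment of η as a plain gain factor between the omnidirectional and directional components are all assumptions made for illustration; the claims only state that a normalized cross-correlation is maximized, that a previous block's measurement is taken into account, and that the angles follow from ratios of scalar products.

```python
import numpy as np

def similarity_curve(spot, w, start, n, max_lag):
    """Normalized cross-correlation between the reference block
    spot[start:start+n] and each candidate block w[start+lag:start+lag+n].
    Assumption: only non-negative lags are searched (primary lags the spot)."""
    ref = spot[start:start + n]
    ref_norm = np.linalg.norm(ref)
    curve = np.zeros(max_lag + 1)
    for lag in range(max_lag + 1):
        obs = w[start + lag:start + lag + n]
        if len(obs) < n:  # candidate block runs past the end of the signal
            break
        denom = ref_norm * np.linalg.norm(obs)
        curve[lag] = np.dot(ref, obs) / denom if denom > 0 else 0.0
    return curve

def estimate_delay(curve, previous=None, alpha=0.5):
    """Return the lag maximizing the similarity curve.
    If a previous curve is given, blend the two first (hypothetical
    exponential smoothing; the claims do not fix the smoothing form)."""
    if previous is not None:
        curve = alpha * curve + (1.0 - alpha) * previous
    return int(np.argmax(curve)), curve

def estimate_angles(spot, w, x, y, z, start, n, lag, eta=1.0):
    """Azimuth via atan2 and elevation via arcsin of scalar-product ratios.
    Assumes the block is not silent, so <spot|W> is non-zero.
    eta is a hypothetical normalization gain between W and X/Y/Z."""
    ref = spot[start:start + n]
    blk = slice(start + lag, start + lag + n)  # primary block offset by the delay
    azimuth = np.arctan2(np.dot(ref, y[blk]), np.dot(ref, x[blk]))
    ratio = np.dot(ref, z[blk]) / (eta * np.dot(ref, w[blk]))
    elevation = np.arcsin(np.clip(ratio, -1.0, 1.0))  # clip guards rounding error
    return float(azimuth), float(elevation)
```

A typical call sequence would compute `similarity_curve` for each reference block, pass the result (and optionally the previous block's curve) to `estimate_delay`, then feed the resulting lag to `estimate_angles`. Clipping the arcsin argument is a defensive choice for noisy signals, not something the claims require.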
in which the audio signals are in digital form, i.e. employing more than two discrete digital channels (data reduction aspects thereof based on psychoacoustics G10L19/02) · CPC title
Microphone arrays · CPC title
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
Correlation function computation {including computation of convolution operations (arithmetic circuits for sum of products per se, e.g. multiply-accumulators G06F7/5443; digital filters, e.g. FIR, IIR, adaptive filters H03H17/00)} · CPC title
Automatic calibration of stereophonic sound system, e.g. with test microphone · CPC title