Automatic loudspeaker directivity adaptation
US-2024236597-A1 · Jul 11, 2024 · US
US2020366994A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020366994-A1 |
| Application number | US-202016987197-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 6, 2020 |
| Priority date | Sep 29, 2016 |
| Publication date | Nov 19, 2020 |
| Grant date | — |
Official abstract text for this publication.
Embodiments are described for a method of simultaneously localizing a set of speakers and microphones given only the times of arrival between each of the speakers and microphones. An autodiscovery process uses an external input to set a global translation (3 continuous parameters), a global rotation (3 continuous parameters), and discrete symmetries, i.e., an exchange of any pair of axes and/or the reversal of any axis. Different time-of-arrival acquisition techniques may be used, such as ultrasonic sweeps or generic multitrack audio content. The autodiscovery algorithm is based on minimizing a cost function, and the process allows for latencies in the recordings, possibly linked to the latencies in the emission.
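The abstract does not spell out the cost function, so the following is only a minimal sketch under stated assumptions: each time of arrival is modeled as distance divided by the speed of sound plus a playback latency and a recording latency, and the cost is the sum of squared residuals. The names `predicted_toa`, `toa_cost`, and `rotate_z_and_shift`, the room geometry, and the latency values are all hypothetical. The sketch illustrates why an external input is needed: rotating, translating, or mirroring the whole layout leaves the cost unchanged, so the times of arrival alone cannot fix those degrees of freedom.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the patent


def predicted_toa(spk, mic, play_lat, rec_lat):
    """Assumed model: propagation delay plus playback and recording latency."""
    return math.dist(spk, mic) / SPEED_OF_SOUND + play_lat + rec_lat


def toa_cost(toa, spk_pos, mic_pos, play_lat, rec_lat):
    """Sum of squared residuals between measured and predicted TOAs."""
    cost = 0.0
    for i, spk in enumerate(spk_pos):
        for j, mic in enumerate(mic_pos):
            r = toa[i][j] - predicted_toa(spk, mic, play_lat[i], rec_lat[j])
            cost += r * r
    return cost


def rotate_z_and_shift(points, angle, shift):
    """Rotate points about the z axis, then translate them."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]


# Hypothetical layout: four speakers, each with a microphone 5 cm above it.
speakers = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (4.0, 3.0, 1.2), (0.0, 3.0, 2.0)]
mics = [(x, y, z + 0.05) for x, y, z in speakers]
play_lat = [0.010, 0.012, 0.011, 0.009]  # seconds, illustrative
rec_lat = [0.002, 0.003, 0.002, 0.004]   # seconds, illustrative

# Synthetic "measured" TOA matrix generated from the true layout.
toa = [[predicted_toa(s, m, play_lat[i], rec_lat[j])
        for j, m in enumerate(mics)] for i, s in enumerate(speakers)]

cost_true = toa_cost(toa, speakers, mics, play_lat, rec_lat)

# Global rotation plus translation: the cost does not change.
shift = (2.0, -1.0, 0.5)
cost_moved_rigidly = toa_cost(toa,
                              rotate_z_and_shift(speakers, 0.7, shift),
                              rotate_z_and_shift(mics, 0.7, shift),
                              play_lat, rec_lat)

# Discrete symmetry: reversing the x axis also leaves the cost unchanged.
cost_mirrored = toa_cost(toa,
                         [(-x, y, z) for x, y, z in speakers],
                         [(-x, y, z) for x, y, z in mics],
                         play_lat, rec_lat)

# Moving a single speaker breaks the internal geometry and raises the cost.
wrong_speakers = [(0.5, 0.0, 1.0)] + speakers[1:]
cost_wrong = toa_cost(toa, wrong_speakers, mics, play_lat, rec_lat)
```

In a real solver the positions and latencies would be the unknowns passed to a nonlinear minimizer; here they are fixed so that the invariance can be checked directly.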
Opening claim text (preview).
1. A method for localizing speakers in a listening environment having a plurality of speakers and microphones, comprising: receiving one or more respective times of arrival (TOA) for each speaker of the plurality of speakers to each microphone of the plurality of microphones to generate multiple TOA candidates, wherein each microphone is proximate a single respective speaker; receiving configuration parameters of the listening environment; minimizing a cost function using each of the one or more respective TOA values of each speaker to estimate a position and latency of a respective speaker and microphone; iterating the cost function minimization over each TOA candidate of the multiple TOA candidates; and using the configuration parameters and minimized cost function to provide speaker location information to one or more post-processing or audio rendering components.
2. The method of claim 1 wherein each microphone is placed inside, on top of, or attached to a speaker cabinet of the single respective speaker, and further wherein the received TOA include multiple TOA candidates for at least one of the speakers to at least one of the microphones.
3. The method of claim 1, comprising: estimating an impulse response (IR) of the listening environment based on a reference audio sequence played back by one or more of the speakers and a recording of the reference audio sequence obtained from one or more of the microphones; and using the IR to search for direct sound candidate peaks, wherein the multiple TOA candidates correspond to respective candidate peaks identified in the search, wherein the speaker location information provided to one or more post-processing or audio rendering components is based on a selection among the TOA candidates for which a residual of the minimizing step is below a certain threshold value.
4. The method of claim 1, comprising: estimating an impulse response (IR) of the listening environment by one of: cross-correlating a known reference audio sequence to a recording of the sequence obtained from the microphones to derive a pseudo-impulse response, or deconvolving a calibration audio sequence and a recording of the calibration audio sequence obtained from the microphones; using the IR to search for direct sound candidate peaks by evaluating a reference peak and using noise levels around the reference peak, wherein the multiple TOA candidates correspond to respective candidate peaks identified in the search; and performing a multiple peak evaluation by selecting an initial TOA matrix, evaluating the initial matrix with residuals of the minimizing step, and changing TOA matrix elements until the residuals are below a defined threshold value.
5. The method of claim 4, wherein using the IR to search for direct sound candidate peaks includes: searching for alternative peaks at least in a portion of the IR located before the reference peak.
6. The method of claim 1 wherein the latency comprises a playback latency for at least one speaker.
7. The method of claim 1, wherein the latency comprises a recording latency for at least one microphone.
8. The method of claim 1, wherein the configuration parameters comprise at least one of: the number of speakers and microphones; a size of the listening environment; bounds on the playback and recording latencies; a specification of two-dimensional or three-dimensional speaker location; constraints on speaker and microphone relative positioning; constraints on speaker and microphone relative latencies; and references to disambiguate rotation, translation and axes inversion symmetries.
9. The method of claim 1 further comprising providing a seed layout to the cost function, the seed layout specifying the correct number of speakers and microphones in defined initial positions relative to a defined speaker layout standard.
10. The method of claim 9 further comprising transforming the estimated location information into a canonical format based on a configuration of the speakers in the listening environment.
11. The method of claim 1 wherein the speakers in the listening environment are placed in a surround-sound configuration having a plurality of front, rear and surround speakers and one or more low frequency effect speakers, and wherein at least some speakers are height speakers providing playback of height cues present in an input audio signal comprising immersive audio content.
12. The method of claim 1 wherein obtaining the one or more respective TOA values may be performed using at least one of: a room calibration audio sequence emitted sequentially by each of the speakers and recorded simultaneously by the microphones; a calibration audio sequence band-limited to the close ultrasonic range, such as 18 to 24 kHz; an arbitrary multichannel audio sequence; and a specifically defined multichannel audio sequence, to recover a room impulse response from a multichannel audio sequence.
13. The method of claim 12 further comprising using the estimated speaker location information to modify a rendering process transmitting speaker feeds to each speaker, and wherein the listening environment comprises one of a large venue playing cinema content, or a home theater, and wherein at least some of the speakers comprise wireless speakers coupled to a renderer executing the rendering process over a wireless data network.
14. The method of claim 1 further comprising: estimating an impulse response (IR) of the listening environment by one of: cross-correlating a known reference audio sequence to a recording of the sequence obtained from the microphones to derive a pseudo-impulse response, or deconvolving a calibration audio sequence and a recorded audio program; and estimating one or more best TOA candidates from at least one of the estimated IR or pseudo-IR using an iterative peak-searching algorithm.
15. The method of claim 1 further comprising: using residual values of the minimizing step to provide an estimate of the internal coherence of the original TOA values; and generating an error estimate to allow for iterating over the cost function minimization process to improve the estimated location.
16. The method of claim 1 wherein the TOA values are formatted into a matrix of dimension n by n, where n is the number of the speakers and co-located microphones.
17. The method of claim 1 wherein the step of receiving the TOA values for each speaker to each microphone using the multiple TOA candidates comprises: deconvolving a calibration audio sequence sent to each speaker to obtain a room impulse response (IR); using the IR to search for direct sound candidate peaks by evaluating a reference peak and using noise levels around the reference peak; and performing a multiple peak evaluation by selecting an initial TOA matrix, evaluating the initial matrix with residuals of the minimizing step, and changing TOA matrix elements until the residuals are below a defined threshold value.
18. The method of claim 17 wherein the minimizing step is performed using a nonlinear minimization algorithm using an Interior Point Optimizer software library in an executable software program.
19. The method of claim 17 further comprising explicitly providing first derivatives (Jacobian) and second derivatives (Hessian) of the cost functions and constraints with respect to unknowns of the cost function.
20. A system for determining locations of a plurality of speakers in a room, comprising: a microphone placed proximate each
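The claims above describe deriving a pseudo-impulse response by cross-correlating a known reference sequence with a microphone recording, then searching for direct-sound candidate peaks at or before a reference peak. The following is a minimal sketch of that idea, not the patented procedure: the function names `cross_correlate` and `candidate_peaks`, the Barker-13 reference sequence, the 48 kHz sample rate, and the fixed relative threshold (used here in place of the noise-level estimate the claims mention) are all assumptions chosen for illustration.

```python
SAMPLE_RATE = 48000  # Hz; assumed, not specified in the claims

# Barker-13 code: a short sequence whose autocorrelation sidelobes are at most 1.
BARKER13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def cross_correlate(recording, reference):
    """Pseudo-impulse response: correlation of the recording with the reference."""
    n_lags = len(recording) - len(reference) + 1
    return [sum(reference[k] * recording[lag + k] for k in range(len(reference)))
            for lag in range(n_lags)]


def candidate_peaks(pseudo_ir, rel_threshold=0.3):
    """Return the strongest (reference) peak and all candidates at or before it.

    A lag is a candidate if it is a local maximum in magnitude and reaches
    rel_threshold times the reference peak (a simplification of a noise-based test).
    """
    mags = [abs(v) for v in pseudo_ir]
    ref_lag = max(range(len(mags)), key=mags.__getitem__)
    floor = rel_threshold * mags[ref_lag]
    found = []
    for lag in range(ref_lag + 1):
        left = mags[lag - 1] if lag > 0 else 0.0
        right = mags[lag + 1] if lag + 1 < len(mags) else 0.0
        if mags[lag] >= floor and mags[lag] >= left and mags[lag] >= right:
            found.append(lag)
    return ref_lag, found


# Synthetic recording: a weak direct sound at lag 40 and a stronger
# reflection at lag 95 (e.g. the direct path is partially occluded).
recording = [0.0] * 140
for k, v in enumerate(BARKER13):
    recording[40 + k] += 0.6 * v
    recording[95 + k] += 1.0 * v

pseudo_ir = cross_correlate(recording, BARKER13)
ref_lag, candidates = candidate_peaks(pseudo_ir)
toa_candidates = [lag / SAMPLE_RATE for lag in candidates]  # seconds
```

In this synthetic case the strongest peak is the reflection, so keeping the earlier candidate is what preserves the true direct-sound arrival. Choosing among the candidates would then fall to the residual check of the cost minimization, which this sketch does not implement.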
Spatial or constructional arrangements of loudspeakers · CPC title
Electronic adaptation of stereophonic audio signals to reverberation of the listening space (H04S7/301 takes precedence) · CPC title
Circuit arrangements, {e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments (combinations of amplifiers H03F3/68; stereophonic systems H04S)} · CPC title
for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title
Application of parametric coding in stereophonic audio systems · CPC title