US10650840B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10650840-B1 |
| Application number | US-201816032494-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jul 11, 2018 |
| Priority date | Jul 11, 2018 |
| Publication date | May 12, 2020 |
| Grant date | May 12, 2020 |
Official abstract text for this publication.
A device that determines an echo latency estimate by subsampling reference audio data. The device may determine the echo latency corresponding to an amount of time between sending reference audio data to loudspeaker(s) and microphone audio data corresponding to the reference audio data being received. The device may generate subsampled reference audio data by selecting only portions of the reference audio data that have a magnitude above a desired percentile. For example, the device may compare a magnitude of an individual reference audio sample to a percentile estimate value and sample only the reference audio samples that exceed the percentile estimate value. The device may generate cross-correlation data between the subsampled reference audio data and the microphone audio data and may estimate the echo latency based on an earliest significant peak represented in the cross-correlation data.
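The pipeline in the abstract (percentile subsampling, cross-correlation, earliest significant peak) can be illustrated with a minimal sketch. This is not the patented implementation: it assumes NumPy, computes the percentile over the whole buffer in one batch rather than with a running estimate, zeroes the discarded samples so the time axis is preserved, and treats a lag as "significant" when its correlation magnitude reaches a fraction of the global maximum. The function names and the `peak_ratio` parameter are hypothetical.

```python
import numpy as np


def subsample_reference(reference, percentile=99.0):
    """Keep only samples whose magnitude is at or above the given percentile.

    Samples below the threshold are zeroed so the array keeps its time axis.
    (Assumption: batch percentile over the whole buffer.)
    """
    magnitude = np.abs(reference)
    threshold = np.percentile(magnitude, percentile)
    return np.where(magnitude >= threshold, reference, 0.0)


def estimate_echo_latency(reference, mic, percentile=99.0, peak_ratio=0.5):
    """Return the lag, in samples, of the earliest significant correlation value.

    peak_ratio is a hypothetical tuning knob: a lag counts as significant
    when its correlation magnitude reaches this fraction of the global maximum.
    """
    sparse_ref = subsample_reference(reference, percentile)
    # In "full" mode, index len(sparse_ref) - 1 corresponds to zero lag.
    xcorr = np.correlate(mic, sparse_ref, mode="full")
    zero_lag = len(sparse_ref) - 1
    # Only non-negative lags are physical: the echo cannot precede playback.
    strength = np.abs(xcorr[zero_lag:])
    significant = np.nonzero(strength >= peak_ratio * strength.max())[0]
    return int(significant[0])
```

As a usage example, feeding in a noise reference and a microphone signal built from two delayed, attenuated copies of it should return the shorter of the two delays, since that is the earliest lag that clears the significance threshold. Picking the earliest qualifying lag rather than the largest one is what lets the direct path win over a later reflection.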
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, the method comprising: sending reference audio data to a loudspeaker to generate output audio, the reference audio data including a first reference sample and a second reference sample; capturing microphone audio data using a microphone, the microphone audio data including a first representation of at least a portion of the output audio; calculating a first value of a threshold, the first value corresponding to a 99th percentile of the reference audio data during a first time period; determining a first magnitude value corresponding to the first reference sample; determining that the first magnitude value is below the first value, indicating that the first reference sample is below the 99th percentile; calculating a second value of the threshold that is lower than the first value, the second value indicating the 99th percentile of the reference audio data during a second time period; determining a second magnitude value corresponding to the second reference sample; determining that the second magnitude value exceeds the second value, indicating that the second reference sample is at or above the 99th percentile; generating subsampled reference audio data including the second reference sample and corresponding to portions of the reference audio data at or above the 99th percentile; determining cross-correlation data corresponding to a cross-correlation between the subsampled reference audio data and the microphone audio data; determining a first peak value in the cross-correlation data, the first peak value indicating a beginning of the first representation; determining, using the first peak value, an echo delay estimate value corresponding to a delay between sending the reference audio data to the loudspeaker and the microphone capturing the first representation in the microphone audio data; determining second reference audio data using the reference audio data and the echo delay estimate value, the second reference audio data synchronized with the microphone audio data; and subtracting the second reference audio data from the microphone audio data to generate output audio data.
2. The computer-implemented method of claim 1, wherein: determining the echo delay estimate value further comprises: determining a third time period associated with the first peak value, and determining the echo delay estimate value based on a difference between the third time period and a fourth time period at which the reference audio data was sent to the loudspeaker, the echo delay estimate value corresponding to a first echo path; and the method further comprises: determining a second peak value represented in the cross-correlation data after the first peak value; determining a fifth time period associated with the second peak value, the fifth time period after the third time period; determining a second echo delay estimate value based on a difference between the fifth time period and the fourth time period, the second echo delay estimate value corresponding to a second echo path; and determining the second reference audio data further comprises determining the second reference audio data based on the reference audio data, the echo delay estimate value, and the second echo delay estimate value.
3. The computer-implemented method of claim 1, further comprising: calculating the second value of the threshold by subtracting a first amount from the first value; and calculating, in response to the second magnitude value exceeding the second value, a third value of the threshold by adding a second amount to the second value, the third value indicating the 99th percentile of the reference audio data during a third time period after the second time period, wherein: the 99th percentile corresponds to a first number having a value of 0.99; a complement of the 99th percentile corresponds to a second number having a value of 0.01; the second amount corresponds to a first product of the first number and a coefficient value; and the first amount corresponds to a second product of the second number and the coefficient value.
4. The computer-implemented method of claim 1, further comprising: calculating the second value of the threshold by subtracting a first amount from the first value; calculating, in response to the second magnitude value exceeding the second value, a third value of the threshold by adding a second amount to the second value, the third value indicating the 99th percentile of the reference audio data during a third time period after the second time period; calculating a fourth value of the threshold during a fourth time period, the fourth time period corresponding to a steady state condition; determining a third magnitude value corresponding to a third reference sample; determining that the third magnitude value exceeds the fourth value, indicating that the third reference sample is at or above the 99th percentile; and calculating a fifth value of the threshold by adding a third amount to the fourth value, the fifth value indicating the 99th percentile of the reference audio data during a fifth time period after the fourth time period.
5. A computer-implemented method, the method comprising: receiving reference audio data corresponding to output audio generated by at least one loudspeaker, the reference audio data including a first sample and a second sample; receiving microphone audio data from at least one microphone, the microphone audio data including a representation of the output audio; determining a first magnitude value based on the first sample; determining that the first magnitude value is below a desired percentile associated with the reference audio data; determining a second magnitude value based on the second sample; determining that the second magnitude value is at or above the desired percentile associated with the reference audio data; generating subsampled reference audio data including the second sample and corresponding to portions of the reference audio data that are at or above the desired percentile; and determining an echo delay estimate value based on the subsampled reference audio data and the microphone audio data.
6. The computer-implemented method of claim 5, further comprising: determining second reference audio data based on the reference audio data and the echo delay estimate value, the second reference audio data synchronized with the microphone audio data; and generating output audio data by subtracting at least a portion of the second reference audio data from the microphone audio data.
7. The computer-implemented method of claim 5, wherein: determining that the first magnitude value is below the desired percentile further comprises: determining a first estimate value of the desired percentile during a first time period, and determining that the first magnitude value is below the first estimate value; and the method further comprises determining a second estimate value by subtracting a first amount from the first estimate value, the second estimate value corresponding to the desired percentile during a second time period after the first time period.
8. The computer-implemented method of claim 7, wherein: determining that the second magnitude value is at or above the desired percentile further comprises determining that the second magnitude value exceeds the second estimate value; the method further comprises determining a third estimate value by adding a second amount to the second magnitude value, the third estimate value corresponding to the desired percentile during a third time period after the second time period; and generating the subsampled reference audio data further comprises adding the second sa
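Claim 3 describes a running threshold that is lowered by 0.01 times a coefficient when a sample falls below it and raised by 0.99 times the same coefficient when a sample exceeds it. A minimal sketch of that update rule follows, assuming plain Python; the function name, the `coeff` and `initial` values, and the choice to return the indices of kept samples are illustrative assumptions, not details taken from the patent. The two step sizes balance only when about 1% of samples exceed the threshold, which is why the estimate tends to settle near the 99th percentile.

```python
def track_percentile(magnitudes, p=0.99, coeff=0.01, initial=0.0):
    """Running percentile estimate using the asymmetric steps of claim 3.

    Returns the final threshold and the indices of samples that exceeded the
    threshold at the moment they were seen (the subsampled set).
    coeff and initial are hypothetical tuning values.
    """
    threshold = initial
    kept = []
    for index, magnitude in enumerate(magnitudes):
        if magnitude > threshold:
            # Sample is at or above the tracked percentile: keep it and
            # raise the threshold by p * coeff (0.99 * coeff for p = 0.99).
            kept.append(index)
            threshold += p * coeff
        else:
            # Sample is below the tracked percentile: lower the threshold
            # by (1 - p) * coeff (0.01 * coeff for p = 0.99).
            threshold -= (1.0 - p) * coeff
    return threshold, kept
```

For example, running this over the absolute values of a long stream of Gaussian noise should, after a warm-up period, keep roughly one sample in a hundred. A smaller `coeff` gives a steadier estimate at the cost of slower adaptation when the signal level changes.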
the noise being echo, reverberation of the speech · CPC title
the noise being separate speech, e.g. cocktail party · CPC title
Microphone arrays; Beamforming · CPC title
Only one microphone · CPC title
characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques · CPC title