Echo latency estimation

US10650840B1 · US · B1

Patent metadata
Field: Value
Publication number: US-10650840-B1
Application number: US-201816032494-A
Country: US
Kind code: B1
Filing date: Jul 11, 2018
Priority date: Jul 11, 2018
Publication date: May 12, 2020
Grant date: May 12, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A device that determines an echo latency estimate by subsampling reference audio data. The device may determine the echo latency corresponding to an amount of time between sending reference audio data to loudspeaker(s) and microphone audio data corresponding to the reference audio data being received. The device may generate subsampled reference audio data by selecting only portions of the reference audio data that have a magnitude above a desired percentile. For example, the device may compare a magnitude of an individual reference audio sample to a percentile estimate value and sample only the reference audio samples that exceed the percentile estimate value. The device may generate cross-correlation data between the subsampled reference audio data and the microphone audio data and may estimate the echo latency based on an earliest significant peak represented in the cross-correlation data.
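
The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the batch percentile computed over the whole buffer, the choice to zero out rejected samples, and the "at least half of the global maximum" rule for a significant peak are all hypothetical choices made for the example.

```python
import random


def subsample_reference(reference, percentile=0.99):
    """Keep samples whose magnitude reaches the percentile; zero out the rest."""
    magnitudes = sorted(abs(x) for x in reference)
    # Assumption: a batch percentile over the whole (non-empty) buffer.
    index = min(len(magnitudes) - 1, int(percentile * len(magnitudes)))
    threshold = magnitudes[index]
    return [x if abs(x) >= threshold else 0.0 for x in reference]


def cross_correlate(sparse_reference, microphone, max_lag):
    """Correlate the sparse reference with the microphone for lags 0..max_lag."""
    # Only the retained samples contribute, which is where the savings come from.
    kept = [(i, x) for i, x in enumerate(sparse_reference) if x != 0.0]
    correlation = []
    for lag in range(max_lag + 1):
        total = 0.0
        for i, x in kept:
            if i + lag < len(microphone):
                total += x * microphone[i + lag]
        correlation.append(total)
    return correlation


def earliest_significant_peak(correlation, fraction=0.5):
    """Return the first lag that is a local maximum above fraction * global max."""
    mags = [abs(c) for c in correlation]
    floor = fraction * max(mags)  # assumption: "significant" means >= half the max
    for lag, m in enumerate(mags):
        left = mags[lag - 1] if lag > 0 else 0.0
        right = mags[lag + 1] if lag + 1 < len(mags) else 0.0
        if m >= floor and m >= left and m >= right:
            return lag
    return None


def estimate_echo_latency(reference, microphone, max_lag, percentile=0.99):
    """Estimate the echo latency, in samples, from the two signals."""
    sparse = subsample_reference(reference, percentile)
    correlation = cross_correlate(sparse, microphone, max_lag)
    return earliest_significant_peak(correlation)


if __name__ == "__main__":
    rng = random.Random(0)
    reference = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    # Simulated microphone: a direct echo at 37 samples and a weaker one at 60.
    microphone = [0.0] * len(reference)
    for i in range(len(reference)):
        if i >= 37:
            microphone[i] += 0.8 * reference[i - 37]
        if i >= 60:
            microphone[i] += 0.3 * reference[i - 60]
    print("estimated latency (samples):", estimate_echo_latency(reference, microphone, 100))
```

Dividing the returned lag by the sample rate would give the latency in seconds. Taking the earliest qualifying peak, rather than the largest, matters when a later reflection is stronger than the direct path.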

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, the method comprising: sending reference audio data to a loudspeaker to generate output audio, the reference audio data including a first reference sample and a second reference sample; capturing microphone audio data using a microphone, the microphone audio data including a first representation of at least a portion of the output audio; calculating a first value of a threshold, the first value corresponding to a 99th percentile of the reference audio data during a first time period; determining a first magnitude value corresponding to the first reference sample; determining that the first magnitude value is below the first value, indicating that the first reference sample is below the 99th percentile; calculating a second value of the threshold that is lower than the first value, the second value indicating the 99th percentile of the reference audio data during a second time period; determining a second magnitude value corresponding to the second reference sample; determining that the second magnitude value exceeds the second value, indicating that the second reference sample is at or above the 99th percentile; generating subsampled reference audio data including the second reference sample and corresponding to portions of the reference audio data at or above the 99th percentile; determining cross-correlation data corresponding to a cross-correlation between the subsampled reference audio data and the microphone audio data; determining a first peak value in the cross-correlation data, the first peak value indicating a beginning of the first representation; determining, using the first peak value, an echo delay estimate value corresponding to a delay between sending the reference audio data to the loudspeaker and the microphone capturing the first representation in the microphone audio data; determining second reference audio data using the reference audio data and the echo delay estimate value, the second reference audio data synchronized with the microphone audio data; and subtracting the second reference audio data from the microphone audio data to generate output audio data.

2. The computer-implemented method of claim 1, wherein: determining the echo delay estimate value further comprises: determining a third time period associated with the first peak value, and determining the echo delay estimate value based on a difference between the third time period and a fourth time period at which the reference audio data was sent to the loudspeaker, the echo delay estimate value corresponding to a first echo path; and the method further comprises: determining a second peak value represented in the cross-correlation data after the first peak value; determining a fifth time period associated with the second peak value, the fifth time period after the third time period; determining a second echo delay estimate value based on a difference between the fifth time period and the fourth time period, the second echo delay estimate value corresponding to a second echo path; and determining the second reference audio data further comprises determining the second reference audio data based on the reference audio data, the echo delay estimate value, and the second echo delay estimate value.

3. The computer-implemented method of claim 1, further comprising: calculating the second value of the threshold by subtracting a first amount from the first value; and calculating, in response to the second magnitude value exceeding the second value, a third value of the threshold by adding a second amount to the second value, the third value indicating the 99th percentile of the reference audio data during a third time period after the second time period, wherein: the 99th percentile corresponds to a first number having a value of 0.99; a complement of the 99th percentile corresponds to a second number having a value of 0.01; the second amount corresponds to a first product of the first number and a coefficient value; and the first amount corresponds to a second product of the second number and the coefficient value.

4. The computer-implemented method of claim 1, further comprising: calculating the second value of the threshold by subtracting a first amount from the first value; calculating, in response to the second magnitude value exceeding the second value, a third value of the threshold by adding a second amount to the second value, the third value indicating the 99th percentile of the reference audio data during a third time period after the second time period; calculating a fourth value of the threshold during a fourth time period, the fourth time period corresponding to a steady state condition; determining a third magnitude value corresponding to a third reference sample; determining that the third magnitude value exceeds the fourth value, indicating that the third reference sample is at or above the 99th percentile; and calculating a fifth value of the threshold by adding a third amount to the fourth value, the fifth value indicating the 99th percentile of the reference audio data during a fifth time period after the fourth time period.

5. A computer-implemented method, the method comprising: receiving reference audio data corresponding to output audio generated by at least one loudspeaker, the reference audio data including a first sample and a second sample; receiving microphone audio data from at least one microphone, the microphone audio data including a representation of the output audio; determining a first magnitude value based on the first sample; determining that the first magnitude value is below a desired percentile associated with the reference audio data; determining a second magnitude value based on the second sample; determining that the second magnitude value is at or above the desired percentile associated with the reference audio data; generating subsampled reference audio data including the second sample and corresponding to portions of the reference audio data that are at or above the desired percentile; and determining an echo delay estimate value based on the subsampled reference audio data and the microphone audio data.

6. The computer-implemented method of claim 5, further comprising: determining second reference audio data based on the reference audio data and the echo delay estimate value, the second reference audio data synchronized with the microphone audio data; and generating output audio data by subtracting at least a portion of the second reference audio data from the microphone audio data.

7. The computer-implemented method of claim 5, wherein: determining that the first magnitude value is below the desired percentile further comprises: determining a first estimate value of the desired percentile during a first time period, and determining that the first magnitude value is below the first estimate value; and the method further comprises determining a second estimate value by subtracting a first amount from the first estimate value, the second estimate value corresponding to the desired percentile during a second time period after the first time period.

8. The computer-implemented method of claim 7, wherein: determining that the second magnitude value is at or above the desired percentile further comprises determining that the second magnitude value exceeds the second estimate value; the method further comprises determining a third estimate value by adding a second amount to the second magnitude value, the third estimate value corresponding to the desired percentile during a third time period after the second time period; and generating the subsampled reference audio data further comprises adding the second sa
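
The threshold update recited in claim 3 can be read as a running percentile estimate: the threshold drops by 0.01 times a coefficient when a sample falls below it and rises by 0.99 times the coefficient when a sample exceeds it. The sketch below illustrates that reading only. The function name `track_percentile`, the default coefficient of 0.01, and the initial threshold of 0.0 are hypothetical values chosen for the example, not values taken from the patent.

```python
import random


def track_percentile(samples, percentile=0.99, coefficient=0.01, initial=0.0):
    """Subsample a stream while tracking a running percentile of its magnitude.

    Returns the retained (index, sample) pairs and the final threshold.
    """
    threshold = initial  # assumption: start from a caller-supplied estimate
    kept = []
    for index, sample in enumerate(samples):
        magnitude = abs(sample)
        if magnitude > threshold:
            # Sample exceeds the estimate: keep it and raise the threshold
            # by percentile * coefficient (0.99 * coefficient in claim 3).
            kept.append((index, sample))
            threshold += percentile * coefficient
        else:
            # Sample is below the estimate: lower the threshold by the
            # complement, (1 - percentile) * coefficient (0.01 * coefficient).
            threshold -= (1.0 - percentile) * coefficient
    return kept, threshold


if __name__ == "__main__":
    rng = random.Random(0)
    stream = [rng.gauss(0.0, 1.0) for _ in range(50000)]
    kept, threshold = track_percentile(stream)
    print("retained fraction:", len(kept) / len(stream))
    print("final threshold:", threshold)
```

The two step sizes balance when about 1% of samples exceed the threshold, so the estimate settles near the 99th percentile of the magnitude without sorting or storing the signal. The retained fraction over a finite run is somewhat above 1% because the threshold has to climb from its initial value first.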

Assignees

Amazon Tech Inc

Inventors

Classifications

  • the noise being echo, reverberation of the speech · CPC title

  • the noise being separate speech, e.g. cocktail party · CPC title

  • Microphone arrays; Beamforming · CPC title

  • Only one microphone · CPC title

  • characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10650840B1 cover?
A device that determines an echo latency estimate by subsampling reference audio data. The device may determine the echo latency corresponding to an amount of time between sending reference audio data to loudspeaker(s) and microphone audio data corresponding to the reference audio data being received. The device may generate subsampled reference audio data by selecting only portions of the refe…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G10L21/0264. Mapped technology areas include Physics.
When was this patent published?
Publication date May 12, 2020 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).