Beam rejection in multi-beam microphone systems
US-9689960-B1 · Jun 27, 2017 · US
US11425495B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11425495-B1 |
| Application number | US-202117234233-A |
| Country | US |
| Kind code | B1 |
| Filing date | Apr 19, 2021 |
| Priority date | Apr 19, 2021 |
| Publication date | Aug 23, 2022 |
| Grant date | Aug 23, 2022 |
Official abstract text for this publication.
A system that performs sound source localization (SSL) using acoustic wave decomposition (AWD) or an approximation. When a device detects a wakeword represented in audio data, the device performs SSL processing in order to determine a position of the user relative to the device (e.g., estimate angle of the user). The device calculates noise statistics based on first audio data representing the wakeword and second audio data preceding the wakeword. Thus, upon detecting the wakeword, the device calculates the noise statistics and a signal quality metric corresponding to the wakeword. In addition, the device uses Multi-Channel Linear Prediction Coding (MCLPC) coefficients to average out the room impulse response. Using the noise statistics, the MCLPC coefficients, and the audio data, the device performs AWD processing to decompose the sound field to disjoint acoustic plane waves, enabling the device to identify the most likely direction for the line-of-sight component of speech.
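The abstract says the device derives noise statistics from the audio preceding the wakeword and a signal quality metric for the wakeword itself. A minimal sketch of one way such a metric could be computed follows. The SNR-style formula, the function names `mean_energy` and `signal_quality_db`, and the `floor` parameter are illustrative assumptions, not taken from the patent text.

```python
import math


def mean_energy(samples):
    """Average squared amplitude of a sequence of audio samples."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(s * s for s in samples) / len(samples)


def signal_quality_db(pre_wakeword, wakeword, floor=1e-12):
    """Hypothetical SNR-style signal quality metric, in dB.

    pre_wakeword: samples captured before the wakeword (treated as noise only).
    wakeword:     samples captured during the wakeword (speech plus noise).
    floor:        small positive value that keeps the ratio and the log finite.
    """
    noise = max(mean_energy(pre_wakeword), floor)
    total = mean_energy(wakeword)
    # Assumption: speech and noise are uncorrelated, so their energies add.
    speech = max(total - noise, floor)
    return 10.0 * math.log10(speech / noise)
```

A higher value suggests the wakeword stands out more clearly from the background, which is the kind of information a localizer could use to decide how much to trust a direction estimate.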
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, the method comprising: receiving first audio data, a first portion of the first audio data corresponding to a first microphone of a device and a second portion of the first audio data corresponding to a second microphone of the device; determining first coefficient data associated with the first audio data, the first coefficient data corresponding to the first microphone and the second microphone; detecting speech represented during a first period of time within the first audio data, the speech generated by a user; determining first energy data associated with a second period of time within the first audio data, the second period of time preceding the first period of time; determining, using the first audio data, first weight data; determining, using the first coefficient data, second weight data; and determining, using the first weight data, the second weight data, and the first energy data, that the user is in a first direction relative to the device.

2. The computer-implemented method of claim 1, wherein determining that the user is in the first direction further comprises: determining first signal quality metric data using the first energy data and second energy data, the second energy data associated with a first portion of the first period of time; and generating, using the first weight data and the first signal quality metric data, first data, the first data indicating that the first direction corresponds to a first local maxima of a first function.

3. The computer-implemented method of claim 2, wherein determining that the user is in the first direction further comprises: determining second signal quality metric data using the first energy data and third energy data, the third energy data associated with a second portion of the first period of time; generating, using the first weight data and the second signal quality metric data, second data, the second data indicating that a second direction corresponds to a second local maxima of a second function; and determining, based on the first data and the second data, that the user is in the first direction.

4. The computer-implemented method of claim 1, wherein determining that the user is in the first direction further comprises: determining first signal quality metric data using the first energy data and second energy data, the second energy data associated with the first period of time; generating, using the first weight data and the first signal quality metric data, first data, the first data indicating that the first direction corresponds to a first local maxima of a first function; determining first variance data corresponding to the first data; and determining, based on the first data and the first variance data, that the user is in the first direction.

5. The computer-implemented method of claim 4, wherein determining that the user is in the first direction further comprises: generating, using the second weight data and the first signal quality metric data, second data, the second data indicating that a second direction corresponds to a second local maxima of a second function; determining second variance data corresponding to the second data; and determining, using the first data, the first variance data, the second data, and the second variance data, that the user is in the first direction.

6. The computer-implemented method of claim 1, further comprises: determining that a beginning of the first period of time corresponds to a beginning of the speech; determining second energy data associated with the first period of time; and determining signal quality metric data using the first energy data and the second energy data.

7. The computer-implemented method of claim 1, further comprising: determining first signal quality metric data using the first energy data and second energy data, the second energy data associated with the first period of time; generating, using the second weight data and the first signal quality metric data, first data, the first data including a first mean value and a first variance value; determining a first signal quality metric value using the first signal quality metric data; determining that the first signal quality metric value is below a threshold value; determining a second variance value by multiplying the first variance value by a first value; and determining, based on the first mean value and the second variance value, that the user is in the first direction.

8. The computer-implemented method of claim 1, further comprising: receiving image data from a camera associated with the device; detecting an object represented in the image data, the object being in a second direction relative to the device; generating a weighting vector that associates the second direction with a first value and remaining directions with a second value; and determining, based on the first weight data, the second weight data, the first energy data, and the weighting vector, that the user is in the first direction relative to the device.

9. The computer-implemented method of claim 1, further comprising: receiving first sensor data indicating that the device is in a first orientation; determining first acoustic characteristics data corresponding to the first orientation; determining the first weight data using the first acoustic characteristics data and the first audio data, the first weight data associated with a first portion of the first period of time; receiving second sensor data indicating that the device is in a second orientation; determining second acoustic characteristics data corresponding to the second orientation; determining third weight data using the second acoustic characteristics data and the first audio data, the third weight data associated with a second portion of the first period of time; and determining, using the third weight data, that the user is in a second direction relative to the device during the second portion of the first period of time.

10. A system comprising: at least one processor; and memory including instructions operable to be executed by the at least one processor to cause the system to: receive first audio data, a first portion of the first audio data corresponding to a first microphone of a device and a second portion of the first audio data corresponding to a second microphone of the device; determine first coefficient data associated with the first audio data, the first coefficient data corresponding to the first microphone and the second microphone; detect speech represented during a first period of time within the first audio data, the speech generated by a user; determine first energy data associated with a second period of time within the first audio data, the second period of time preceding the first period of time; determine, using the first audio data, first weight data; determine, using the first coefficient data, second weight data; and determine, using the first weight data, the second weight data, and the first energy data, that the user is in a first direction relative to the device.

11. The system of claim 10, wherein the memory further comprises instructions that, when executed by the at least one processor, further cause the system to: determine first signal quality metric data using the first energy data and second energy data, the second energy data associated with a first portion of the first period of time; and generate, using the first weight data and the first signal quality metric data, first data, the first data indicating that the first direction corresponds to a first local maxima of a first function. 1
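Two steps in the claims above are concrete enough to sketch: claim 7 multiplies a variance value by a factor when the signal quality metric falls below a threshold, and claim 8 builds a weighting vector that gives a camera-detected direction one value and every other direction a second value. The Python below is a hedged illustration only. The function names, the default values, and the way the weighting vector is combined with per-direction scores (elementwise multiplication followed by an argmax) are assumptions, not details stated in the claims.

```python
def inflate_variance(variance, quality, threshold, factor):
    """Scale the variance by `factor` when the quality metric is below
    `threshold` (the claim 7 step); otherwise return it unchanged."""
    return variance * factor if quality < threshold else variance


def camera_weighting(num_directions, detected_index,
                     detected_value=1.0, other_value=0.5):
    """Weighting vector in the style of claim 8: one value for the direction
    where the camera detected an object, a second value for the rest.
    The default values are arbitrary placeholders."""
    if not 0 <= detected_index < num_directions:
        raise ValueError("detected_index out of range")
    return [detected_value if i == detected_index else other_value
            for i in range(num_directions)]


def pick_direction(scores, weights):
    """Assumed combination rule: weight each per-direction score and
    return the index of the largest weighted score."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    weighted = [s * w for s, w in zip(scores, weights)]
    return max(range(len(weighted)), key=weighted.__getitem__)
```

With scores `[0.6, 0.2, 0.5, 0.1]` and a camera detection at index 2, the weighted scores favor index 2 even though index 0 has the highest raw score, which shows how a visual prior could break a close acoustic call.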
Hearing devices using active noise cancellation · CPC title
Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic (H04R2203/12 takes precedence) · CPC title
Communication between hearing aids and external devices via a network for data exchange · CPC title
microphones · CPC title
Use of position data from wide-area or local-area positioning systems in hearing devices, e.g. program or information selection · CPC title