What technology area does this patent fall under?

Primary CPC classification H04R3/005. Mapped technology areas include Electricity.

When was this patent published?

Publication date Mar 2, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Robust estimation of sound source localization

US10939201B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10939201-B2
Application number	US-201313775073-A
Country	US
Kind code	B2
Filing date	Feb 22, 2013
Priority date	Feb 22, 2013
Publication date	Mar 2, 2021
Grant date	Mar 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for sound source localization in a digital system having at least two audio capture devices is provided that includes receiving audio signals from the two audio capture devices, computing a signal-to-noise ratio (SNR) for each frequency band of a plurality of frequency bands in a processing frame of the audio signals, determining a frequency band weight for each frequency band of the plurality of frequency bands based on the SNR computed for the frequency band, computing an estimated time delay of arrival (TDOA) of sound for the processing frame using the frequency band weights, and converting the estimated TDOA to an angle representing sound direction.
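The steps named in the abstract can be sketched for a single processing frame as follows. This is a minimal illustration, not the patented implementation: the function name, the equal-width band layout, the 6 dB threshold, the externally supplied noise-floor estimate (noise_power), the 343 m/s speed of sound, and the far-field arcsine conversion are all assumptions introduced here. The binary one-or-zero band weight follows the wording of the first claim.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def estimate_direction(x1, x2, fs, mic_spacing, noise_power,
                       n_bands=8, snr_threshold_db=6.0):
    """Estimate TDOA (s) and angle (deg) for one frame of two-channel audio.

    noise_power is a hypothetical per-bin noise-floor estimate with
    len(x1) // 2 + 1 entries; how it is obtained is outside this sketch.
    """
    n = len(x1)
    X1 = np.fft.rfft(x1)
    X2 = np.fft.rfft(x2)

    # GCC-PHAT: normalise the cross-spectrum so only phase remains.
    cross = X2 * np.conj(X1)
    phat = cross / np.maximum(np.abs(cross), 1e-12)

    # Per-band SNR, then a binary weight per band (1 = band contributes).
    signal_power = 0.5 * (np.abs(X1) ** 2 + np.abs(X2) ** 2)
    edges = np.linspace(0, X1.size, n_bands + 1, dtype=int)
    band_weights = np.zeros(n_bands)
    bin_weights = np.zeros(X1.size)
    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        snr = signal_power[lo:hi].sum() / max(noise_power[lo:hi].sum(), 1e-12)
        snr_db = 10.0 * np.log10(max(snr, 1e-12))
        band_weights[b] = 1.0 if snr_db >= snr_threshold_db else 0.0
        bin_weights[lo:hi] = band_weights[b]

    if not band_weights.any():
        # No band is reliable enough to produce an estimate for this frame.
        return None, None, band_weights

    # Weighted GCC-PHAT back in the time domain (circular correlation).
    cc = np.fft.irfft(bin_weights * phat, n=n)

    # Search only lags that are physically possible for this spacing.
    max_lag = int(np.floor(mic_spacing / SPEED_OF_SOUND * fs))
    lags = np.arange(-max_lag, max_lag + 1)
    best_lag = lags[np.argmax(cc[lags % n])]
    tdoa = best_lag / fs  # positive: sound reaches device 1 first

    # Far-field conversion of delay to angle relative to broadside.
    sin_theta = np.clip(SPEED_OF_SOUND * tdoa / mic_spacing, -1.0, 1.0)
    angle_deg = float(np.degrees(np.arcsin(sin_theta)))
    return tdoa, angle_deg, band_weights
```

For example, calling estimate_direction(x1, np.roll(x1, 3), 16000, 0.1, noise_power) with a very small noise_power should report a delay of three samples. Gating whole bands rather than individual bins keeps a few noisy bins from dominating the phase-only correlation.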

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, with one or more processors, audio signals from two audio capture devices; converting, with the one or more processors, the audio signals into a processing frame by: splitting the audio signals into overlapping blocks; applying a windowing function to the overlapping blocks; and storing the offset between the windowed blocks as the processing frame; dividing the audio signals, in the processing frame, into multiple specified continuous frequency bands, each of the frequency bands including multiple frequency components; computing, with the one or more processors, a generalized cross-correlation with phase transform (GCC-PHAT) and a signal-to-noise ratio (SNR) for each of the frequency bands; setting, with the one or more processors, a value of a frequency band weight for a corresponding one of the frequency bands to one when the SNR computed for the corresponding frequency band indicates sufficient signal power in the corresponding frequency band to meet a threshold for contribution to a sound direction estimate; setting, with the one or more processors, the value of the frequency band weight for the corresponding frequency band to zero when the SNR computed for the corresponding frequency band does not indicate sufficient signal power in the corresponding frequency band to meet a threshold for contribution to a sound direction estimate; determining, with the one or more processors, a weighted GCC-PHAT value for each of the frequency bands based on the GCC-PHAT for the respective frequency band and the frequency band weight for the respective frequency band; up-sampling, with the one or more processors, the weighted GCC-PHAT value for each of the frequency bands by inserting zeroes in a spectral representation of the weighted GCC-PHAT value for each of the frequency bands; converting, with the one or more processors, the up-sampled weighted GCC-PHAT value for each of the frequency bands into a time domain; computing, with the one or more processors, an estimated time delay of arrival (TDOA) of sound for the processing frame using the time domain up-sampled weighted GCC-PHAT value for each of the frequency bands; and converting, with the one or more processors, the estimated TDOA to an angle representing sound direction.

2. A method comprising: receiving, with one or more processors, audio signals from two audio capture devices; converting, with the one or more processors, the audio signals into a processing frame by: splitting the audio signals into overlapping blocks; applying a windowing function to the overlapping blocks; and storing the offset between the windowed blocks as the processing frame; dividing the audio signals, in the processing frame, into multiple specified continuous frequency bands, each of the frequency bands including multiple frequency components; computing, with the one or more processors, a generalized cross-correlation with phase transform (GCC-PHAT) and a signal-to-noise ratio (SNR) for each of the frequency bands in the processing frame of the audio signals; determining, with the one or more processors, a frequency band weight for each of the frequency bands based on the SNR computed for the frequency band; determining, with the one or more processors, a weighted GCC-PHAT value for each of the frequency bands based on the GCC-PHAT for the respective frequency band and the frequency band weight for the respective frequency band; up-sampling, with the one or more processors, the weighted GCC-PHAT value for each of the frequency bands by inserting zeroes in a spectral representation of the weighted GCC-PHAT value for each of the frequency bands; converting, with the one or more processors, the up-sampled weighted GCC-PHAT value for each of the frequency bands into a time domain; obtaining, with the one or more processors, an estimated time delay of arrival (TDOA) objective function based on the time domain up-sampled weighted GCC-PHAT value for each of the frequency bands; applying, with the one or more processors, an adaptive inter-frame filter to the TDOA objective function to obtain a filtered TDOA objective function; computing, with the one or more processors, an estimated TDOA based on the filtered TDOA objective function; and converting, with the one or more processors, the estimated TDOA to an angle representing sound direction, wherein coefficients of the adaptive inter-frame filter are respective signal powers of a plurality of processing frames preceding the processing frame.

3. A method comprising: receiving, with one or more processors, audio signals from two audio capture devices; converting, with the one or more processors, the audio signals into a processing frame by: splitting the audio signals into overlapping blocks; applying a windowing function to the overlapping blocks; and storing the offset between the windowed blocks as the processing frame; dividing the audio signals, in the processing frame, into multiple specified continuous frequency bands, each of the frequency bands including multiple frequency components; computing, with the one or more processors, a generalized cross-correlation with phase transform (GCC-PHAT) and a signal-to-noise ratio (SNR) for each of the frequency bands; determining, with the one or more processors, a frequency band weight for each of the frequency bands based on the SNR computed for the frequency band; determining, with the one or more processors, a weighted GCC-PHAT value for each of the frequency bands based on the GCC-PHAT for the respective frequency band and the frequency band weight for the respective frequency band; up-sampling, with the one or more processors, the weighted GCC-PHAT value for each of the frequency bands by inserting zeroes in a spectral representation of the weighted GCC-PHAT value for each of the frequency bands; converting, with the one or more processors, the up-sampled weighted GCC-PHAT value for each of the frequency bands into a time domain; determining, with the one or more processors, a time delay of arrival TDOA objective function for the processing frame of the audio signals based on the time domain up-sampled weighted GCC-PHAT value for each of the frequency bands; applying, with the one or more processors, an adaptive inter-frame filter to the TDOA objective function to obtain a filtered TDOA objective function, wherein coefficients of the adaptive inter-frame filter are respective signal powers of a plurality of processing frames preceding the processing frame; computing, with the one or more processors, an estimated TDOA based on the filtered TDOA objective function; and converting, with the one or more processors, the estimated TDOA to an angle representing sound direction.

4. A digital system comprising: two audio capture devices for capturing audio signals; means for converting, with the one or more processors, the audio signals into a processing frame by: splitting the audio signals into overlapping blocks; applying a windowing function to the overlapping blocks; and storing the offset between the windowed blocks as the processing frame; means for dividing the audio signals, in the processing frame, into multiple specified continuous frequency bands, each of the frequency bands including multiple frequency components; means for computing a generalized cross-correlation with phase transform (GCC-PHAT) and a signal-to-noise ratio (SNR) for each of the frequency bands; means for determining a frequency band weight for each of the frequency bands based on the SNR computed for the frequency band; means for determining a weighted GCC-PHAT value for each of the frequency bands based on the GCC-PHAT for the respective frequency band and the frequency band weight for the respective frequency band;
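Two steps recited in the claims, up-sampling by inserting zeroes in the spectral representation and the adaptive inter-frame filter, can be illustrated with the sketch below. It is one possible reading, not the claimed implementation: the function names are hypothetical, and normalising the filter by the total power and leaving the choice of frames in the window to the caller are assumptions made here. The claims only state that the filter coefficients are the signal powers of preceding frames.

```python
import numpy as np


def upsample_correlation(spectrum, n, factor):
    """Zero-pad a one-sided spectrum of an n-sample frame and return the
    time-domain correlation sampled at factor times the original rate."""
    spectrum = np.asarray(spectrum, dtype=complex)
    padded = np.zeros((n * factor) // 2 + 1, dtype=complex)
    padded[: spectrum.size] = spectrum  # inserted high-frequency bins stay zero
    return np.fft.irfft(padded, n=n * factor)


def interframe_filter(objectives, powers):
    """Combine per-frame TDOA objective functions (one per row) using
    per-frame signal powers as coefficients (assumed normalisation)."""
    objectives = np.asarray(objectives, dtype=float)
    powers = np.asarray(powers, dtype=float)
    total = powers.sum()
    if total <= 0.0:
        # Degenerate case: fall back to a plain average.
        return objectives.mean(axis=0)
    return (powers[:, None] * objectives).sum(axis=0) / total


def peak_lag(objective, factor):
    """Location of the objective's maximum, in original-rate samples."""
    m = objective.size
    k = int(np.argmax(objective))
    if k > m // 2:
        k -= m  # upper half of a circular correlation holds negative lags
    return k / factor
```

With an up-sampling factor of 4, a delay of 2.5 samples that falls between lags at the original rate lands on lag 10 of the up-sampled correlation, so peak_lag can report it directly. Weighting frames by power lets louder, presumably more reliable, frames dominate the smoothed objective function.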

Assignees

Texas Instruments Inc

Inventors

Classifications

H04R3/005 · Primary
for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title
H04M3/568
audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants (echo suppression in two-way loud-speaking telephone systems H04M9/02; sound field processing per se H04S7/30) · CPC title
H04N7/15
Conference systems · CPC title
H04M2242/30
Determination of the location of a subscriber · CPC title
H04R2430/03
Synergistic effects of band splitting and sub-band processing · CPC title

Patent family

Related publications grouped by family.

View patent family 51388187

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10939201B2 cover?: A method for sound source localization in a digital system having at least two audio capture devices is provided that includes receiving audio signals from the two audio capture devices, computing a signal-to-noise ratio (SNR) for each frequency band of a plurality of frequency bands in a processing frame of the audio signals, determining a frequency band weight for each frequency band of the p…
Who is the assignee on this patent?: Texas Instruments Inc