Location estimation of active speaker

US10219098B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10219098-B2
Application number  US-201715707299-A
Country             US
Kind code           B2
Filing date         Sep 18, 2017
Priority date       Mar 3, 2017
Publication date    Feb 26, 2019
Grant date          Feb 26, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system and method to perform an estimation of a location of an active speaker in real time includes designating a microphone of an array of microphones as a reference microphone. The method includes storing a relative transfer function (RTF) for each microphone of the array of microphones other than the reference microphone associated with each potential location among potential locations as a set of stored RTFs, and obtaining a voice sample of the active speaker and obtaining a speaker RTF for each microphone of the array of microphones other than the reference microphone. The method also includes performing an RTF projection of the speaker RTF for each microphone on the set of stored RTFs, and determining one of the potential locations as the location of the active speaker based on the performing the RTF projection.
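The projection and decision steps described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the array layout (one row per non-reference microphone, one column per frequency bin), the seat labels, and the choice to average the per-microphone scores are all assumptions made for the example. The claims call the measure a "cosine distance" and select its maximum, which matches what is usually called cosine similarity, so that is what the sketch computes.

```python
import numpy as np

def cosine_scores(speaker_rtf, stored_rtf):
    """Per-microphone cosine measure between two (M-1, F) complex RTF arrays.

    Returns an array of length M-1 with values in [0, 1]; 1 means the two
    RTF vectors for that microphone point in the same direction.
    """
    num = np.abs(np.sum(np.conj(stored_rtf) * speaker_rtf, axis=1))
    den = np.linalg.norm(speaker_rtf, axis=1) * np.linalg.norm(stored_rtf, axis=1)
    return num / np.maximum(den, 1e-12)  # guard against all-zero rows

def estimate_location(speaker_rtf, stored_rtfs):
    """Pick the stored location whose RTFs best match the speaker RTF.

    stored_rtfs maps a location label to an (M-1, F) complex array.
    Averaging over microphones is an assumption of this sketch.
    """
    scores = {
        label: float(np.mean(cosine_scores(speaker_rtf, rtf)))
        for label, rtf in stored_rtfs.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Synthetic example: 4 microphones (3 non-reference), 64 frequency bins.
rng = np.random.default_rng(0)

def random_rtf():
    return rng.standard_normal((3, 64)) + 1j * rng.standard_normal((3, 64))

seats = ["driver", "front_passenger", "rear_left", "rear_right"]  # hypothetical labels
stored = {seat: random_rtf() for seat in seats}

# A "live" measurement: the rear-left calibration RTF plus a small perturbation.
measured = stored["rear_left"] + 0.1 * random_rtf()
best, scores = estimate_location(measured, stored)
print(best, scores)
```

Because the stored set is fixed after calibration, the run-time cost is one inner product per location and microphone, which is consistent with the real-time framing of the abstract.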

First claim

Opening claim text (preview).

What is claimed is:

1. A method of performing an estimation of a location of an active speaker in real time, the method comprising: designating any one microphone of an array of microphones as a reference microphone; storing a relative transfer function (RTF) for each microphone of the array of microphones other than the reference microphone associated with each potential location among potential locations as a set of stored RTFs; obtaining a voice sample of the active speaker and obtaining a speaker RTF for each microphone of the array of microphones other than the reference microphone; performing an RTF projection of the speaker RTF for each microphone on the set of stored RTFs; and determining, using a processor, one of the potential locations as the location of the active speaker based on the performing the RTF projection, wherein the obtaining the speaker RTF for each microphone of the array of microphones other than the reference microphone includes computing, for each of the potential locations, a ratio of an acoustic transfer function of the voice sample at the microphone to an acoustic transfer function of the voice sample at the reference microphone.

2. The method according to claim 1, wherein the obtaining the voice sample is performed in real time.

3. The method according to claim 1, further comprising sampling a sound from each of the potential locations to obtain the set of stored RTFs.

4. The method according to claim 1, further comprising obtaining the set of stored RTFs as the RTF for each microphone of the array of microphones other than the reference microphone based on computing, for each of the potential locations, a ratio of an acoustic transfer function from one potential location among the potential locations to the microphone to an acoustic transfer function from the one potential location among the potential locations to the reference microphone.

5. The method according to claim 1, wherein the performing the RTF projection includes calculating a cosine distance between each speaker RTF and each RTF of the set of stored RTFs.

6. The method according to claim 5, wherein the determining the location of the active speaker is based on the maximum of the cosine distances.

7. The method according to claim 1, wherein the storing the set of stored RTFs for the potential locations includes storing the set of stored RTFs for each seat in an automobile.

8. The method according to claim 7, wherein the storing the set of stored RTFs is part of a calibration process performed for the automobile.

9. The method according to claim 7, wherein the storing the set of stored RTFs is part of a calibration process performed for a calibration automobile of a same model as the automobile.

10. A system to estimate a location of an active speaker, the system comprising: a memory device configured to store a relative transfer function (RTF) for each microphone of an array of microphones other than a reference microphone associated with each potential location among potential locations as a set of stored RTFs, wherein the reference microphone is any one of the array of microphones; and a processor configured to obtain a voice sample of the active speaker and obtain a speaker RTF for each microphone of the array of microphones other than the reference microphone, perform an RTF projection of the speaker RTF for each microphone on the set of stored RTFs, and determine one of the potential locations as the location of the active speaker based on the RTF projection, wherein the processor obtains the speaker RTF for each microphone of the array of microphones other than the reference microphone based on computing, for each of the potential locations, a ratio of an acoustic transfer function of the voice sample at the microphone to an acoustic transfer function of the voice sample at the reference microphone.

11. The system according to claim 10, wherein the processor obtains the voice sample in real time.

12. The system according to claim 10, wherein the processor samples a sound from each of the potential locations to obtain the set of stored RTFs.

13. The system according to claim 10, wherein the processor obtains the set of stored RTFs as the RTF for each microphone of the array of microphones other than the reference microphone based on computing, for each of the potential locations, a ratio of an acoustic transfer function from one potential location among the potential locations to the microphone to an acoustic transfer function from the one potential location among the potential locations to the reference microphone.

14. The system according to claim 10, wherein the processor performs the RTF projection by calculating a cosine distance between each speaker RTF and each RTF of the set of stored RTFs.

15. The system according to claim 14, wherein the processor determines the location of the active speaker based on the maximum of the cosine distances.

16. The system according to claim 10, wherein the memory device stores the set of stored RTFs for each seat in an automobile.

17. The system according to claim 16, wherein the memory device stores the set of stored RTFs as part of a calibration process performed for the automobile.

18. The system according to claim 16, wherein the memory device stores the set of stored RTFs as part of a calibration process performed for a calibration automobile of a same model as the automobile.
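Claims 1, 4, 10 and 13 define each RTF as the ratio of the acoustic transfer function at a microphone to the acoustic transfer function at the reference microphone. The page does not say how that ratio is estimated from recorded audio, so the sketch below swaps in a common estimator as an assumption: the cross-spectrum between each microphone and the reference, averaged over short frames and divided by the averaged power spectrum of the reference. The function name `estimate_rtfs` and the parameters `frame_len` and `hop` are hypothetical.

```python
import numpy as np

def estimate_rtfs(signals, ref_index=0, frame_len=256, hop=128):
    """Estimate RTFs of every non-reference microphone relative to the reference.

    signals: (M, N) real array, one row of time samples per microphone.
    Returns an (M-1, frame_len // 2 + 1) complex array (reference row removed).
    """
    signals = np.asarray(signals, dtype=float)
    n_mics, n_samples = signals.shape
    if n_samples < frame_len:
        raise ValueError("need at least one full frame of samples")

    window = np.hanning(frame_len)
    starts = range(0, n_samples - frame_len + 1, hop)

    # Short-time spectra with shape (M, n_frames, F).
    spectra = np.stack(
        [np.fft.rfft(signals[:, s:s + frame_len] * window, axis=1) for s in starts],
        axis=1,
    )

    ref = spectra[ref_index]                          # (n_frames, F)
    cross = np.mean(spectra * np.conj(ref), axis=1)   # E[X_m * conj(X_ref)], (M, F)
    auto = np.mean(np.abs(ref) ** 2, axis=0)          # E[|X_ref|^2], (F,)

    rtf = cross / np.maximum(auto, 1e-12)             # approximates A_m / A_ref
    return np.delete(rtf, ref_index, axis=0)

# Sanity check on synthetic data: microphones that are scaled copies of the
# reference should yield constant RTFs equal to the scale factors.
rng = np.random.default_rng(1)
reference = rng.standard_normal(4096)
mics = np.vstack([reference, 0.5 * reference, -2.0 * reference])
rtfs = estimate_rtfs(mics, ref_index=0)
print(rtfs.shape)
```

The same routine could serve both the calibration recordings (to build the set of stored RTFs) and the live voice sample, so that both sides of the comparison are computed the same way; that pairing is a design assumption of this sketch rather than something the claims require.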

Assignees

Gm Global Tech Operations Llc, Univ Bar Ilan

Inventors

Classifications

  • H04S7/305 (Primary)

    Electronic adaptation of stereophonic audio signals to reverberation of the listening space (H04S7/301 takes precedence) · CPC title

  • H04R3/005 (Primary)

    for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title

  • microphones · CPC title

  • 2D or 3D arrays of transducers · CPC title

  • G01S5/20 (Primary)

    Position of source determined by a plurality of spaced direction-finders · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10219098B2 cover?
A system and method to perform an estimation of a location of an active speaker in real time includes designating a microphone of an array of microphones as a reference microphone. The method includes storing a relative transfer function (RTF) for each microphone of the array of microphones other than the reference microphone associated with each potential location among potential locations as …
Who is the assignee on this patent?
Gm Global Tech Operations Llc, Univ Bar Ilan
What technology area does this patent fall under?
Primary CPC classification H04S7/305. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Feb 26, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).