Auto-calibration of relative positions of multiple speaker tracking systems

US9986360B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9986360-B1
Application number: US-201715631327-A
Country: US
Kind code: B1
Filing date: Jun 23, 2017
Priority date: Jun 23, 2017
Publication date: May 29, 2018
Grant date: May 29, 2018

Abstract

Official abstract text for this publication.

A system that automatically calibrates multiple speaker tracking systems with respect to one another based on detection of an active speaker at a collaboration endpoint is presented herein. The system collects a first data point set of an active speaker at the collaboration endpoint using at least a first camera and a first microphone array. The system then receives a plurality of second data point sets from one or more secondary speaker tracking systems located at the collaboration endpoint. Once enough data points have been collected, a reference coordinate system is determined using the first data point set and the one or more second data point sets. Finally, after a reference coordinate system has been determined, the system generates the locations of the one or more secondary speaker tracking systems with respect to the first speaker tracking system.
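The calibration step summarized in the abstract can be illustrated with a minimal sketch. It assumes a flat 2D room, noise-free measurements, and that each tracker reports the same active speaker as a (distance, angle from the tracker's normal) pair at matching times. The function names `to_xy` and `locate_secondary` are hypothetical, and the closed-form least-squares rigid alignment used here is a swapped-in standard technique, not necessarily the method the patent discloses. The first tracker's own frame is taken as the reference coordinate system.

```python
import math

def to_xy(distance, angle):
    """Convert (distance, angle from the tracker's normal) to x/y in that tracker's frame.

    Assumed convention: the normal is the +y axis and positive angles turn toward +x.
    """
    return (distance * math.sin(angle), distance * math.cos(angle))

def locate_secondary(primary_obs, secondary_obs):
    """Estimate the secondary tracker's pose (x, y, heading) in the primary tracker's frame.

    Both arguments are lists of (distance, angle) pairs for the same speaker
    positions, in the same order. At least two distinct positions are needed.
    """
    if len(primary_obs) != len(secondary_obs) or len(primary_obs) < 2:
        raise ValueError("need at least two matched observations")

    q = [to_xy(d, a) for d, a in primary_obs]    # speaker seen in the primary frame
    p = [to_xy(d, a) for d, a in secondary_obs]  # speaker seen in the secondary frame
    n = len(p)

    # Centroids of both point sets.
    pcx = sum(x for x, _ in p) / n
    pcy = sum(y for _, y in p) / n
    qcx = sum(x for x, _ in q) / n
    qcy = sum(y for _, y in q) / n

    # Accumulate dot and cross products of the centered points; their ratio
    # gives the least-squares rotation angle from secondary to primary frame.
    s_cos = 0.0
    s_sin = 0.0
    for (px, py), (qx, qy) in zip(p, q):
        ax, ay = px - pcx, py - pcy
        bx, by = qx - qcx, qy - qcy
        s_cos += ax * bx + ay * by
        s_sin += ax * by - ay * bx
    heading = math.atan2(s_sin, s_cos)

    # Translation t such that q = R(heading) * p + t. The secondary tracker
    # sits at the origin of its own frame, so t is its location in the primary frame.
    c, s = math.cos(heading), math.sin(heading)
    tx = qcx - (c * pcx - s * pcy)
    ty = qcy - (s * pcx + c * pcy)
    return tx, ty, heading
```

Calling `locate_secondary(primary_obs, secondary_obs)` with matched observation lists returns the secondary tracker's position and the rotation of its frame relative to the primary tracker. With real, noisy measurements more observations spread across the room would generally give a more stable estimate.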

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: collecting a first data point set of an active speaker at a collaboration endpoint using at least a first camera and a first microphone array of a first speaker tracking system located at the collaboration endpoint; receiving a plurality of second data point sets from one or more secondary speaker tracking systems located at the collaboration endpoint, each secondary speaker tracking system including at least a secondary camera and a secondary microphone array; determining a reference coordinate system using the first data point set and one or more of the plurality of second data point sets; and generating locations with respect to the reference coordinate system of the one or more secondary speaker tracking systems.

2. The method of claim 1, wherein collecting the first data point set is performed by the first speaker tracking system.

3. The method of claim 2, wherein receiving a plurality of second data points, determining a reference coordinate system, and generating locations of the one or more secondary speaker tracking systems is performed by the first speaker tracking system.

4. The method of claim 2, wherein receiving a plurality of second data points, determining a reference coordinate system, and generating locations of the one or more secondary speaker tracking systems is performed by a server coupled to the first speaker tracking system and the one or more secondary speaker tracking systems.

5. The method of claim 4, further comprising: receiving, by the server, the first data point set of the active speaker at the collaboration endpoint from the first speaker tracking system.

6. The method of claim 1, wherein generating the locations of the one or more secondary speaker tracking systems is based on the reference coordinate system, the first data point set, and the plurality of second data point sets.

7. The method of claim 1, wherein the first data point set includes an indication of a distance of the active speaker from the first speaker tracking system and one or more angles of the active speaker with respect to a normal of the first speaker tracking system at a first point in time.

8. The method of claim 7, wherein each of the plurality of second data point sets includes an indication of a distance of the active speaker from a respective secondary speaker tracking system and an angle of the active speaker with respect to a normal of the respective secondary speaker tracking system at the first point in time.

9. An apparatus comprising: a network interface unit configured to enable communications over a network; and a processor coupled to the network interface unit, the processor configured to: receive a first data point set associated with an active speaker detected at a collaboration endpoint with at least a first camera and a first microphone array of a first speaker tracking system located at the collaboration endpoint; receive a plurality of second data point sets from one or more secondary speaker tracking systems located at the collaboration endpoint, each secondary speaker tracking system including at least a secondary camera and a secondary microphone array; determine a reference coordinate system using the first data point set and one or more of the plurality of second data point sets; and generate locations with respect to the reference coordinate system of the one or more secondary speaker tracking systems.

10. The apparatus of claim 9, wherein the processor, when receiving the first data point set, causes the first speaker tracking system to collect the first data point set.

11. The apparatus of claim 10, wherein the processor, when receiving the plurality of second data points, determining a reference coordinate system, and generating locations of the one or more secondary speaker tracking systems, causes the first speaker tracking system to receive the plurality of second data point sets from the one or more secondary speaker tracking systems located at the collaboration endpoint, determine the reference coordinate system using the first data point set and the one or more second data point sets, and generate the locations of the one or more secondary speaker tracking systems with respect to the first speaker tracking system.

12. The apparatus of claim 9, wherein the processor is further configured to: receive the first data point set of the active speaker at the collaboration endpoint from the first speaker tracking system.

13. The apparatus of claim 9, wherein the processor is configured to generate the locations of the one or more secondary speaker tracking systems based on the reference coordinate system, the first data point set, and the plurality of second data point sets.

14. The apparatus of claim 9, wherein the first data point set includes an indication of a distance of the active speaker from the first speaker tracking system and one or more angles of the active speaker with respect to a normal of the first speaker tracking system at a first point in time.

15. The apparatus of claim 14, wherein each of the plurality of second data point sets includes an indication of a distance of the active speaker from a respective secondary speaker tracking system and an angle of the active speaker with respect to a normal of the respective secondary speaker tracking system at the first point in time.

16. One or more non-transitory computer readable storage media, the computer readable storage media being encoded with software comprising computer executable instructions, and when the software is executed, operable to: receive a first data point set associated with an active speaker detected at a collaboration endpoint with at least a first camera and a first microphone array of a first speaker tracking system located at the collaboration endpoint; receive a plurality of second data point sets from one or more secondary speaker tracking systems located at the collaboration endpoint, each secondary speaker tracking system including at least a secondary camera and a secondary microphone array; determine a reference coordinate system using the first data point set and one or more of the plurality of second data point sets; and generate locations with respect to the reference coordinate system of the one or more secondary speaker tracking systems.

17. The non-transitory computer readable storage media of claim 16, wherein the instructions are further operable to: receive the first data point set of the active speaker at the collaboration endpoint from the first speaker tracking system.

18. The non-transitory computer readable storage media of claim 16, wherein the instructions are configured to generate the locations of the one or more secondary speaker tracking systems based on the reference coordinate system, the first data point set, and the plurality of second data point sets.

19. The non-transitory computer readable storage media of claim 16, wherein the first data point set includes an indication of a distance of the active speaker from the first speaker tracking system and one or more angles of the active speaker with respect to a normal of the first speaker tracking system at a first point in time.

20. The non-transitory computer readable storage media of claim 19, wherein each of the plurality of second data point sets includes an indication of a distance of the active speaker from a respective secondary speaker tracking system and an angle of the active speaker with respect to a normal of the respective secondary s
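Claims 7 and 8 describe data point sets that carry a distance and an angle taken at the same point in time by different trackers. A minimal sketch of how such records might be represented and matched is shown here. The `DataPoint` fields, the `pair_by_time` helper, and the 50 ms tolerance are all hypothetical choices for illustration; the claims do not specify a record layout or a matching rule.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataPoint:
    """One observation of the active speaker by one tracker (hypothetical layout)."""
    tracker_id: str
    timestamp: float   # seconds on a clock assumed to be shared by all trackers
    distance_m: float  # distance of the speaker from the tracker
    angle_rad: float   # angle of the speaker relative to the tracker's normal

def pair_by_time(primary, secondary, tolerance=0.05):
    """Pair each primary data point with the secondary one closest in time.

    A pair is kept only when the two timestamps differ by at most `tolerance`
    seconds, so both trackers can be assumed to have seen the same speaker position.
    """
    pairs = []
    for p in primary:
        best = min(secondary, key=lambda s: abs(s.timestamp - p.timestamp), default=None)
        if best is not None and abs(best.timestamp - p.timestamp) <= tolerance:
            pairs.append((p, best))
    return pairs
```

Only the matched pairs would then feed a calibration step. Unmatched observations are dropped because they cannot be assumed to describe the same speaker position.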

Assignees

Inventors

Classifications

  • H04M3/569 (Primary)

    using the instant speaker's algorithm (speech detection per se G10L25/78)

  • Focus control based on electronic image sensor signals

  • audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants (echo suppression in two-way loud-speaking telephone systems H04M9/02; sound field processing per se H04S7/30)

  • H04S7/301 (Primary)

    Automatic calibration of stereophonic sound system, e.g. with test microphone

  • for loudspeakers (H04R29/007 takes precedence)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9986360B1 cover?
A system that automatically calibrates multiple speaker tracking systems with respect to one another based on detection of an active speaker at a collaboration endpoint is presented herein. The system collects a first data point set of an active speaker at the collaboration endpoint using at least a first camera and a first microphone array. The system then receives a plurality of second data p…
Who is the assignee on this patent?
Cisco Tech Inc
What technology area does this patent fall under?
Primary CPC classification H04M3/569. Mapped technology areas include Electricity.
When was this patent published?
Publication date: May 29, 2018 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).