Listen to people you recognize

US2016157013A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016157013-A1
Application number: US-201615014952-A
Country: US
Kind code: A1
Filing date: Feb 3, 2016
Priority date: Feb 26, 2014
Publication date: Jun 2, 2016
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, devices, and methods are described for recognizing and focusing on at least one source of an audio communication as part of a communication including a video image and an audio communication derived from two or more microphones when a relative position between the microphones is known. In certain embodiments, linked audio and video focus areas providing location information for one or more sound sources may each be associated with different user inputs, and an input to adjust a focus in either the audio or video domain may automatically adjust the focus in the other domain.
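The linked audio and video focus described in the abstract can be illustrated with a small sketch. It is not taken from the patent: it assumes a pinhole camera whose optical axis coincides with the broadside direction of the microphone array, and the class `LinkedFocus` and its fields `image_width` and `horizontal_fov_deg` are hypothetical names introduced only for this example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LinkedFocus:
    """Keeps a video focus position and an audio focus angle in sync.

    Assumption (not from the patent): the camera and the microphone
    array share one origin and point in the same direction.
    """
    image_width: int            # frame width in pixels (hypothetical parameter)
    horizontal_fov_deg: float   # camera horizontal field of view (hypothetical)
    pixel_x: float = field(init=False)
    angle_deg: float = field(init=False, default=0.0)

    def __post_init__(self) -> None:
        # Start with both focus areas centred.
        self.pixel_x = self.image_width / 2

    def _focal_px(self) -> float:
        # Pinhole-camera focal length in pixels, derived from the field of view.
        half_fov = math.radians(self.horizontal_fov_deg / 2)
        return (self.image_width / 2) / math.tan(half_fov)

    def set_video_focus(self, pixel_x: float) -> None:
        """Input in the video domain: move the audio angle to match."""
        self.pixel_x = pixel_x
        offset = pixel_x - self.image_width / 2
        self.angle_deg = math.degrees(math.atan2(offset, self._focal_px()))

    def set_audio_focus(self, angle_deg: float) -> None:
        """Input in the audio domain: move the video focus to match."""
        self.angle_deg = angle_deg
        self.pixel_x = (self.image_width / 2
                        + self._focal_px() * math.tan(math.radians(angle_deg)))


focus = LinkedFocus(image_width=1280, horizontal_fov_deg=90.0)
focus.set_video_focus(1280)   # drag the video focus to the right edge
right_edge_angle = focus.angle_deg
focus.set_audio_focus(0.0)    # steer the audio focus straight ahead
centre_pixel = focus.pixel_x
```

With a 90 degree field of view, the right edge of the frame maps to roughly 45 degrees, and steering the audio focus back to 0 degrees returns the video focus to the centre column. A real device would also need calibration between the camera and the microphone positions, which this sketch leaves out.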

First claim

Opening claim text (preview).

What is claimed is:

1. A system for visual and audio identification of a sound source comprising: a far-side device comprising: a far-side processor; at least two separate microphones, wherein the at least two separate microphones are coupled to the far-side processor; and a memory coupled to the far-side processor, wherein the memory comprises far-side instructions that, when executed by the far-side processor, cause the far-side processor to: capture a far-side video image and a far-side audio communication, wherein the far-side audio communication comprises at least two raw electronic audio signals created from at least two separate microphones integrated as part of the far-side device, and wherein a relative position of the at least two separate microphones is known; communicate the far-side video image and the far-side audio communication from the far-side device to a near-side device via a network; a near-side device comprising: a near-side processor; and a near-side memory coupled to the processor, wherein the near-side memory comprises near-side instructions that, when executed by the near-side processor, cause the near-side processor to: process the far-side video image and the far-side audio communication to identify at least one source of the far-side audio communication as part of a visual identification of the at least one source of the far-side audio communication; determine, based on the identifying of the at least one source of the far-side audio communication, at least one angle from the far-side device to the at least one source of the far-side audio communication; process the at least two raw electronic audio signals to (a) filter sounds received from outside the at least one angle from the far-side device to the at least one source of the far-side audio communication and/or (b) to emphasize sounds received from the at least one angle from the far-side device to the at least one source of the far-side audio communication; and create an output comprising (1) first far-side location information associated with the visual identification of the at least one source of the far-side audio communication overlaid on the far-side video image and (2) second far-side location information comprising the at least one angle from the far-side device to the at least one source of the far-side audio communication.

2. The system of claim 1 wherein the far-side instructions are further configured to cause the far-side processor to determine the at least one angle from the far-side device to the at least one source of the far-side audio communication, and wherein the at least one angle from the far-side device to the at least one source of the far-side audio communication is communicated from the far-side device to the near-side device with the far-side video image and the far-side audio communication.

3. The system of claim 1 wherein the near-side instructions are further configured to cause the near-side processor to process the far-side video image and the far-side audio communication to identify at least one source of the far-side audio communication as part of a visual identification of the at least one source of the far-side audio communication after the near-side device receives the far-side video image and the far-side audio communication.

4. The system of claim 1 wherein the near-side instructions are further configured to cause the near-side processor to receive the relative position of the at least two separate microphones with the far-side audio communication.

5. The system of claim 4, wherein the first far-side location information and the second far-side location information each comprise part of a user interface presented on a display output of the near-side device.

6. The system of claim 5, wherein the near-side instructions are further configured to cause the near-side processor to: receive a first near-side user input adjusting the first far-side location information using a first portion of the user interface associated with the first far-side location information.

7. The system of claim 6, wherein the near-side instructions are further configured to cause the near-side processor to: automatically adjust the second far-side location information and a second portion of the user interface associated with the second far-side location information in response to the adjusting the first portion of the user interface; determine an updated at least one angle from the far-side device to the at least one source of the far-side audio communication; and automatically adjust processing the at least two raw electronic audio signals based on the updated at least one angle from the far-side device to the at least one source of the far-side audio communication.

8. The system of claim 1, wherein the near-side instructions are further configured to cause the near-side processor to: capture a near-side video image and a near-side audio communication, wherein the near-side audio communication comprises an additional at least two raw electronic audio signals created from an additional at least two separate microphones integrated as part of the near-side device, and wherein a second relative position of the additional at least two separate microphones is known; process the near-side video image and the near-side audio communication to identify at least one source of the near-side audio communication as part of a visual identification of the at least one source of the near-side audio communication; determine, based on the identifying of the at least one source of the near-side audio communication, the at least one angle from the near-side device to the at least one source of the near-side audio communication; and create a second output for the near-side device comprising (1) first near-side location information associated with the visual identification of the at least one source of the near-side audio communication overlaid on the near-side video image and (2) second near-side location information comprising the at least one angle from the near-side device to the at least one source of the near-side audio communication.

9. The system of claim 8, wherein the near-side instructions are further configured to cause the near-side processor to: display the first near-side location information, the second near-side location information, the first far-side location information, and the second far-side location information in a display output of the near-side device as part of a user interface of the near-side device, wherein the at least one source of the far-side audio communication comprises a user of the far-side device and wherein the at least one source of the near-side audio communication comprises a user of the near-side device.

10. The system of claim 1, wherein the far-side instructions are further configured to cause the far-side processor to: process the at least two raw electronic audio signals prior to communicating the far-side audio communication from the far-side device to the near-side device; receive a first far-side user input adjusting the first far-side location information using a first portion of a user interface associated with the first far-side location information; and adjust the processing of the at least two raw electronic audio signals based on the first far-side user input.

11. A system for visual and audio identification of a sound source comprising: a far-side device comprising: means for capturing a far-side video image and a far-side audio communication, wherein the far-side audio communication comprises at least two raw electronic audio signals created from at least two separate microphones integrated as
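Claim 1 recites processing two raw microphone signals to emphasize sounds arriving from a chosen angle, given the known relative position of the microphones. The claim does not name an algorithm; one common way to get that effect is delay-and-sum beamforming, sketched below under stated assumptions: a far-field source, an angle measured from broadside (0 degrees is straight ahead, positive angles lie on the mic 1 side so sound reaches mic 1 first), and whole-sample delays. All function names here are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def steering_delay_samples(mic_spacing_m, angle_deg, sample_rate_hz):
    """Whole-sample delay of mic 2 relative to mic 1 for a far-field
    source at angle_deg from broadside."""
    delay_s = mic_spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return round(delay_s * sample_rate_hz)


def shift(signal, n):
    """Delay a signal by n samples (n may be negative), zero-padding
    so the output keeps the input length."""
    size = len(signal)
    if n >= 0:
        return ([0.0] * n + list(signal))[:size]
    return (list(signal[-n:]) + [0.0] * (-n))[:size]


def delay_and_sum(mic1, mic2, mic_spacing_m, angle_deg, sample_rate_hz):
    """Average the two channels after aligning them for the given angle."""
    d = steering_delay_samples(mic_spacing_m, angle_deg, sample_rate_hz)
    # Sound from the steering angle reaches mic 2 d samples after mic 1,
    # so mic 1 is delayed by d to line the channels up before averaging.
    aligned1 = shift(mic1, d)
    return [(a + b) / 2 for a, b in zip(aligned1, mic2)]


def energy(signal):
    return sum(x * x for x in signal)


# Simulated source at +30 degrees: mic 2 hears mic 1's signal 7 samples late
# (0.1 m spacing at 48 kHz).
rate, spacing = 48000, 0.1
mic1 = [math.sin(2 * math.pi * k / 28) for k in range(280)]
mic2 = shift(mic1, steering_delay_samples(spacing, 30.0, rate))

on_target = delay_and_sum(mic1, mic2, spacing, 30.0, rate)
off_target = delay_and_sum(mic1, mic2, spacing, -30.0, rate)
```

Steering at the true angle lines the channels up so the tone adds coherently, while steering at the mirrored angle leaves them misaligned and the tone largely cancels, so `energy(on_target)` comes out well above `energy(off_target)`. The test tone was chosen so that the misalignment is half a period; real speech is broadband, and a practical system would likely use fractional delays and frequency-dependent weighting.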

Assignees

Qualcomm Inc

Inventors

Classifications

  • Noise filtering · CPC title

  • H04R3/005 (Primary): for combining the signals of two or more microphones (specially adapted for hearing aids H04R25/407) · CPC title

  • microphones · CPC title

  • using position of the lips, movement of the lips or face analysis · CPC title

  • Microphone arrays; Beamforming · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016157013A1 cover?
Systems, devices, and methods are described for recognizing and focusing on at least one source of an audio communication as part of a communication including a video image and an audio communication derived from two or more microphones when a relative position between the microphones is known. In certain embodiments, linked audio and video focus areas providing location information for one or …
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification H04R3/005. Mapped technology areas include Electricity.
When was this patent published?
Published Jun 2, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).