US9668077B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9668077-B2 |
| Application number | US-200813056709-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 26, 2008 |
| Priority date | Jul 31, 2008 |
| Publication date | May 30, 2017 |
| Grant date | May 30, 2017 |
Official abstract text for this publication.
Disclosed herein is an apparatus. The apparatus includes a housing, electronic circuitry, and an audio-visual source tracking system. The electronic circuitry is in the housing. The audio-visual source tracking system includes a first video camera and an array of microphones. The first video camera and the array of microphones are attached to the housing. The audio-visual source tracking system is configured to receive video information from the first video camera. The audio-visual source tracking system is configured to capture audio information from the array of microphones at least partially in response to the video information. The audio-visual source tracking system might include a second video camera that is attached to the housing, wherein the first and second video cameras together estimate the beam orientation of the array of microphones.
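The abstract describes steering the microphone array in response to video, and the claims below mention two angles tied to the camera focal length and a depth estimate derived from face size. The following is a minimal sketch of how those quantities could be computed under a simple pinhole-camera assumption. The function names, the pixel-unit focal length, and the 0.15 m average face width are illustrative assumptions, not values taken from the patent.

```python
import math


def face_to_beam_angles(face_cx, face_cy, image_width, image_height, focal_length_px):
    """Map a face centre (pixels) to (azimuth, elevation) in radians.

    Assumes a pinhole camera whose optical axis passes through the image
    centre and whose focal length is expressed in pixels (an assumption
    made for this sketch).
    """
    dx = face_cx - image_width / 2.0
    dy = image_height / 2.0 - face_cy  # image rows grow downward
    azimuth = math.atan2(dx, focal_length_px)
    elevation = math.atan2(dy, focal_length_px)
    return azimuth, elevation


def depth_from_face_width(face_width_px, focal_length_px, real_face_width_m=0.15):
    """Estimate distance to a face from its apparent width in pixels.

    Uses similar triangles: depth = focal_length * real_width / pixel_width.
    The default real face width is a hypothetical average, not a patent value.
    """
    if face_width_px <= 0:
        raise ValueError("face_width_px must be positive")
    return focal_length_px * real_face_width_m / face_width_px


if __name__ == "__main__":
    az, el = face_to_beam_angles(900, 300, 1280, 720, focal_length_px=1000.0)
    print(f"azimuth={math.degrees(az):.1f} deg, elevation={math.degrees(el):.1f} deg")
    print(f"depth={depth_from_face_width(150, 1000.0):.2f} m")
```

A face that appears smaller in the frame yields a larger depth estimate, which is the relationship a single camera could exploit without a second view.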
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a housing; a processor in the housing; and an audio-visual source tracking system connected to the processor, wherein the audio-visual source tracking system comprises a first video camera and an array of microphones, wherein the first video camera and the array of microphones are attached to the housing, wherein at least a portion of the first video camera and at least a portion of the array of microphones are mounted inside the housing, wherein the audio-visual source tracking system is configured to receive video information from the first video camera, and wherein the audio-visual source tracking system is configured to capture audio information from the array of microphones at least partially in response to the video information, wherein the audio-visual source tracking system is configured to adjust and direct the sensitivity of the array of microphones during an active audio/visual speech call at least partially in response to the video information, wherein the audio-visual source tracking system is configured to estimate a depth of the video information with the first camera by analyzing a face size in the video information, and wherein the apparatus is a multi-function portable electronic device; wherein the array of microphones are proximate the first video camera; wherein the audio-visual source tracking system is configured to monitor positions of detected faces in a video; and wherein the audio-visual source tracking system is further configured to detect an active speaker from the detected faces by successively adjusting and directing the sensitivity of the array of microphones towards the active speaker's face such that if an audio level exceeds a threshold, a corresponding face is considered to be the active speaker's face.
2. An apparatus as in claim 1 wherein the array of microphones are proximate the first video camera.
3. An apparatus as in claim 1 wherein the array of microphones comprises at least three microphones.
4. An apparatus as in claim 1 wherein the audio-visual source tracking system is configured to receive video information corresponding to a user of the apparatus from the first video camera.
5. An apparatus as in claim 1 wherein the apparatus comprises a mobile handset.
6. An apparatus as in claim 1 wherein the audio-visual source tracking system is configured to determine a reference point of a user speaking into the device, and wherein the audio-visual source tracking system is configured to adjust and direct the sensitivity of the array of microphones towards the reference point of the user.
7. An apparatus as in claim 6 wherein the audio-visual source tracking system is configured to adjust and direct the sensitivity of the array of microphones toward the user's mouth.
8. An apparatus as in claim 1 wherein a direction of the array of microphones is determined based, at least partially, on a first angle and a second angle, wherein the first angle and the second angle correspond to a focal length of the first video camera.
9. An apparatus as in claim 1 wherein the audio-visual source tracking system is configured for selective enhancement of audio capturing sensitivity along a specific spatial direction towards a user's mouth.
10. An apparatus as in claim 1 wherein audio enhancement during silent portions of speech partials are configured to be provided by tracking a position of a user's face by directing a beam of the array of microphones towards the user.
11. An apparatus as in claim 1 wherein the audio-visual source tracking system is configured to monitor a position of all detected faces in a video stream and update the positions in a table.
12. An apparatus as in claim 1 wherein the depth of the video information comprises depth information, wherein the audio-visual source tracking system is configured to estimate a beam orientation of the array of microphones based, at least in part, on the depth information.
13. An apparatus as in claim 12 wherein the first video camera comprises a single 3D camera, and wherein the audio-visual source tracking system is configured obtain the depth information with only the single 3D camera.
14. A method comprising: capturing a first image with a camera of an audio-visual source tracking system of an apparatus; determining a direction of a portion of the first image with respect to an array of microphones of the apparatus; and controlling a predetermined characteristic of the array of microphones based at least partially on the direction of the portion of the first image, wherein the controlling of the predetermined characteristic of the array of microphones further comprises adjusting and directing the sensitivity of the array of microphones during an active audio/visual speech call at least partially in response to video information, wherein the audio-visual source tracking system is configured to estimate a depth of the video information with the first camera by analyzing a face size in the video information, wherein the apparatus is a multi-function portable electronic device, and wherein at least a portion of the camera and at least a portion of the array of microphones are mounted inside a housing of the multi-function portable electronic device; wherein the array of microphones are proximate the first video camera; wherein the audio-visual source tracking system is configured to monitor positions of detected faces in a video; and wherein the audio-visual source tracking system is further configured to detect an active speaker from the detected faces by successively adjusting and directing the sensitivity of the array of microphones towards the active speaker's face such that if an audio level exceeds a threshold, a corresponding face is considered to be the active speaker's face.
15. A method as in claim 14 wherein the determining of the direction of the portion of the first image further comprises detecting a face of a user of the apparatus in the first image.
16. A method as in claim 14 wherein the capturing of the first image further comprises capturing an image of a user of the apparatus, and wherein the determining of the direction of the portion of the image, further comprises determining a direction of a head of the user.
17. A method as in claim 14 wherein a direction of the array of microphones is determined based, at least partially, on a first angle and a second angle, wherein the first angle and the second angle correspond to a focal length of the first video camera.
18. A method as in claim 14 wherein the audio-visual source tracking system is configured for selective enhancement of audio capturing sensitivity along a specific spatial direction towards a user's mouth.
19. A method as in claim 14 wherein audio enhancement during silent portions of speech partials are configured to be provided by tracking a position of a user's face by directing a beam of the array of microphones towards the user.
20. A method as in claim 14 wherein the audio-visual source tracking system is configured to monitor a position of all detected faces in a video stream and update the positions in a table.
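Claims 1 and 14 describe detecting the active speaker by successively directing the array toward each detected face and checking whether the audio level exceeds a threshold. The sketch below illustrates that loop with a plain delay-and-sum beamformer on a linear array. The beamformer choice, the far-field model, integer-sample delays, the RMS level measure, and every name and constant here are assumptions made for illustration; the claims do not specify them.

```python
import math
import random


def arrival_delays(azimuth, mic_x, fs, c=343.0):
    """Integer sample delay at each mic for a far-field source at `azimuth`.

    mic_x holds microphone positions (metres) along one axis; delays are
    shifted so the earliest microphone has delay 0.
    """
    raw = [-x * math.sin(azimuth) / c * fs for x in mic_x]
    base = min(raw)
    return [int(round(r - base)) for r in raw]


def delay_and_sum(channels, delays):
    """Advance each channel by its delay and average the aligned samples."""
    n_out = len(channels[0]) - max(delays)
    return [
        sum(ch[n + d] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(n_out)
    ]


def rms(samples):
    """Root-mean-square level of a block of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_active_speaker(channels, face_azimuths, mic_x, fs, threshold):
    """Steer toward each face in turn; return the first face whose beam
    output level exceeds `threshold`, or None if no face qualifies."""
    for face_id, azimuth in face_azimuths.items():
        beam = delay_and_sum(channels, arrival_delays(azimuth, mic_x, fs))
        if rms(beam) > threshold:
            return face_id
    return None


def simulate_array(source, azimuth, mic_x, fs):
    """Test fixture: delay one source signal onto each mic (far field, no noise)."""
    n = len(source)
    return [[0.0] * d + source[: n - d] for d in arrival_delays(azimuth, mic_x, fs)]


if __name__ == "__main__":
    fs = 48000
    mic_x = [-0.05, 0.0, 0.05]  # three microphones, 5 cm apart (hypothetical)
    rng = random.Random(0)
    speech = [rng.uniform(-1.0, 1.0) for _ in range(2000)]  # noise stand-in for speech
    faces = {
        "left": math.radians(-30),
        "centre": 0.0,
        "right": math.radians(30),
    }
    channels = simulate_array(speech, faces["right"], mic_x, fs)
    print(detect_active_speaker(channels, faces, mic_x, fs, threshold=0.45))
```

When the beam points at the true source the three channels add coherently, so the output level is noticeably higher than when the beam points at another face. The threshold separates those two cases; in practice it would need tuning against background noise, which this sketch omits.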
including functional features of a camera · CPC title
Stereoscopic video; Stereoscopic image sequence · CPC title
Public address systems (circuits for preventing acoustic reaction H04R3/02; circuits for distributing signals to loudspeakers H04R3/12; {monitoring or testing arrangements for public address systems H04R29/007}; amplifiers H03F) · CPC title
Microphone arrays · CPC title
Constructional details of the terminal equipment, e.g. arrangements of the camera and the display · CPC title