What technology area does this patent fall under?

Primary CPC classification H04N21/42203. Mapped technology areas include Electricity.

When was this patent published?

Publication date: Jul 19, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Audio volume control device, control method and program

US9398247B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9398247-B2
Application number	US-201214127772-A
Country	US
Kind code	B2
Filing date	Jul 19, 2012
Priority date	Jul 26, 2011
Publication date	Jul 19, 2016
Grant date	Jul 19, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An information processing apparatus includes a processor that receives captured image data and captured sound data corresponding to an environment in which content is reproduced and detects a user based on the captured image data and analyzes a situation of the environment based on a result of the detection and the captured sound data and controls an audio volume corresponding to reproduced content based on a result of the analyzing.
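The abstract describes four stages in order: receive image and sound data, detect a user, analyze the situation, and control the volume. A minimal sketch of that data flow follows. The names `Frame`, `run_once`, `detect_faces`, and `analyze` are hypothetical, and the 0 to 100 volume scale is an assumption; the patent does not specify an API or a scale.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Face = Tuple[int, int]  # hypothetical: pixel position of a detected face


@dataclass
class Frame:
    """One capture from the viewing environment (camera image plus microphone audio)."""
    image: bytes
    sound: bytes


def run_once(
    frame: Frame,
    detect_faces: Callable[[bytes], List[Face]],
    analyze: Callable[[List[Face], bytes], int],
    volume: int,
) -> int:
    """Run the four stages once and return the new volume.

    detect_faces and analyze are supplied by the caller; analyze returns a
    signed volume change. The result is clamped to an assumed 0..100 scale.
    """
    faces = detect_faces(frame.image)    # detect a user from the image
    delta = analyze(faces, frame.sound)  # analyze the situation
    return max(0, min(100, volume + delta))  # control the volume


# Usage with stub stages: no face found, analysis asks for a 10-step reduction.
new_volume = run_once(Frame(b"", b""), lambda img: [], lambda faces, snd: -10, 50)
```

Passing the detection and analysis stages in as callables keeps the sketch independent of any particular face detector or audio classifier.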

First claim

Opening claim text (preview).

The invention claimed is:
1. An information processing apparatus comprising: an input circuit for reception of capture image data and captured sound data corresponding to an environment in which content is reproduced; a processor that: processes the captured image data and the captured sound data corresponding to the environment in which content is reproduced; detects a user based on the captured image data; analyzes a situation of the environment based on a result of the detection and the captured sound data; determines a direction in the captured image data to a source of the captured sound data; determines if the direction in the captured image data to the source of the captured sound data is coincident with a location of a face of a human detected in the captured image data; and controls an audio volume corresponding to reproduced content based on a result of the analyzing, wherein when a sound level corresponding to the captured sound data is greater than or equal to a predetermined threshold value, the processor controls the audio volume corresponding to the reproduced content to remain unchanged when it is determined that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is coincident with the location of the face detected in the captured image data, and the processor controls the audio volume corresponding to the reproduced content to increase when the processor determines that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data, when the processor increases the audio volume when the processor determined that the direction in the captured image data corresponding to the source of the captured sound data which is a human voice is not coincident with the location of the face detected in the captured image data, the processor determines a volume increase amount based on the captured image data of a distance between the location of the detected user and the source of the captured sound data, and in an event of a manual adjustment of a setting, the processor once an environmental situation is over automatically returns to a previous setting before the environmental situation occurred.
2. The information processing apparatus of claim 1, wherein the processor receives the captured image data from a camera positioned in the environment in which content is reproduced and detects the face based on the captured image data.
3. The information processing apparatus of claim 2, wherein the processor detects a position corresponding to the detected face based on the captured image data.
4. The information processing apparatus of claim 2, wherein the processor detects a plurality of faces based on the captured image data.
5. The information processing apparatus of claim 2 wherein the processor determines face information corresponding to the detected face, the face information including at least one of an individual, age and gender.
6. The information processing apparatus of claim 1, wherein the processor receives the sound data from a microphone positioned in the environment in which content is reproduced.
7. The information processing apparatus of claim 1, wherein the processor determines a sound level corresponding to the captured sound data.
8. The information processing apparatus of claim 1, wherein the processor determines whether the captured sound data is a human's voice or a sound other than a human's voice.
9. The information processing apparatus of claim 1, wherein the processor controls the audio volume corresponding to the reproduced content to remain unchanged when it is determined that the level is less than the predetermined threshold value.
10. The information processing apparatus of claim 1, wherein the processor determines whether the captured sound data is a human's voice or a sound other than a human's voice when it is determined that the level is greater than the predetermined threshold value.
11. The information processing apparatus of claim 10, wherein the processor controls the audio volume corresponding to the reproduced content to be lowered when it is determined that the captured sound data is a human's voice and a face is not detected based on the captured image data.
12. The information processing apparatus of claim 10, wherein the processor determines a direction corresponding to a source of the captured sound data when it is determined that the captured sound data is a human's voice and a face is detected based on the captured image data.
13. The information processing apparatus of claim 10, wherein the processor determines whether the captured sound data corresponds to an environmental sound registered in advance when it is determined that the captured sound data is determined to be a sound other than a human's voice.
14. The information processing apparatus of claim 13, wherein the processor controls the audio volume corresponding to the reproduced content to increase when it is determined that the captured sound data corresponds to an environmental sound that is registered in advance.
15. The information processing apparatus of claim 13, wherein the processor controls the audio volume corresponding to the reproduced content based on previously stored settings corresponding to the environmental sound when it is determined that the captured sound data corresponds to the environmental sound stored in advance.
16. The information processing apparatus of claim 1, wherein the processor determines an age of the detected user and, when the processor controls the audio volume to increase, the processor applies an increased gain to a predetermined audio frequency band.
17. A method performed by an information processing apparatus, the method comprising: receiving captured image data and captured sound data corresponding to an environment in which content is reproduced; detecting a user based on the captured image data; determining a direction in the captured image data to the source of the captured sound data; determining if the direction in the captured image data to the source of the captured sound data is coincident with a location of a face of a human detected in the captured image data; analyzing a situation of the environmental based on a result of the detection and the captured sound data; and controlling an audio volume corresponding to reproduced content based on a result of the analyzing, wherein when a sound level corresponding to the captured sound data is greater than or equal to a predetermined threshold value, the controlling includes controlling the audio volume corresponding to the reproduced content to remain unchanged when it is determined that the direction in the captured image data corresponding to the source of the captured sound data which is at human voice is coincident with the location of the face detected in the captured image data, and the controlling includes controlling the audio volume corresponding to the reproduced content to increase when the direction in the captured image data corresponding to the source of the captured sound data which is human voice is not coincident with the location of the face detected in the captured image data, when the controlling includes controlling the audio volume to increase when the direction in the captured image date corresponding to the source of the captured sound data which is a human voice is not co
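The branching described in claims 1, 9, 11, and 14 can be read as a small decision tree. The sketch below is one hedged reading of that claim language, not the patentee's implementation. The names `Action`, `Observation`, and `decide` are hypothetical. The claims shown here do not say what happens for a loud non-voice sound that is not registered in advance, so leaving the volume unchanged in that branch is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    KEEP = "keep"
    LOWER = "lower"
    RAISE = "raise"


@dataclass
class Observation:
    """Hypothetical summary of one analysis pass over image and sound data."""
    sound_level: float                  # measured level of the captured sound
    is_voice: bool                      # classified as a human voice
    face_detected: bool                 # at least one face found in the image
    source_on_face: bool = False        # sound direction coincides with a face
    registered_env_sound: bool = False  # matches a sound registered in advance


def decide(obs: Observation, threshold: float) -> Action:
    """Map an observation to a volume action, following the claim wording."""
    if obs.sound_level < threshold:
        return Action.KEEP       # claim 9: below threshold, unchanged
    if obs.is_voice:
        if not obs.face_detected:
            return Action.LOWER  # claim 11: voice but no face detected
        if obs.source_on_face:
            return Action.KEEP   # claim 1: voice comes from a detected face
        return Action.RAISE      # claim 1: voice comes from elsewhere
    if obs.registered_env_sound:
        return Action.RAISE      # claim 14: registered environmental sound
    return Action.KEEP           # assumption: case not covered by shown claims


# Usage: a loud voice whose direction does not match any detected face.
result = decide(Observation(sound_level=60.0, is_voice=True, face_detected=True), 50.0)
```

Claim 1 also ties the size of the increase to the distance between the detected user and the sound source, and claim 15 allows stored per-sound settings; both are left out of this sketch because the claims do not give a formula.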

Assignees

Inventors

Tateishi Kazuya

Classifications

Code	CPC title
H04N21/4396	by muting the audio signal
H04N21/4394	involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams (arrangements characterised by components specially adapted for monitoring, identification or recognition of audio in broadcast systems H04H60/58)
H04N21/44008	involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream (arrangements characterised by components specially adapted for monitoring, identification or recognition of video in broadcast systems H04H60/59)
H04N21/4532	involving end-user characteristics, e.g. viewer profile, preferences (monitoring of user activities for profile generation for accessing a video database G06F16/739; user profiles in network data switching protocols H04L67/306; processing of user preferences or user profiles in wireless networks H04W8/18)
H04N21/4223	Cameras (H04N23/00 takes precedence)

Patent family

Related publications grouped by family.

View patent family 47600761

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9398247B2 cover?: An information processing apparatus includes a processor that receives captured image data and captured sound data corresponding to an environment in which content is reproduced and detects a user based on the captured image data and analyzes a situation of the environment based on a result of the detection and the captured sound data and controls an audio volume corresponding to reproduced con…
Who is the assignee on this patent?: Tateishi Kazuya, Sony Corp