Information processing apparatus, information processing method, and program

US10332519B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10332519-B2
Application number: US-201615529580-A
Country: US
Kind code: B2
Filing date: Mar 9, 2016
Priority date: Apr 7, 2015
Publication date: Jun 25, 2019
Grant date: Jun 25, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus including circuitry configured to determine a position of a mouth of a user that is distinguishable among a plurality of people, and control an acquisition condition for collecting a sound based on the determined position of the user's mouth.

First claim

Opening claim text (preview).

The invention claimed is:

1. An apparatus, comprising: circuitry configured to control at least one imaging sensor to determine a position of a mouth of each user that is distinguishable among a plurality of people, determine a reliability of the determined position of each user's mouth based on information obtained by the at least one imaging sensor, control an acquisition condition for collecting a sound based on the determined position of each user's mouth and the determined reliability of the determined position of each user's mouth, and collect the sound using at least one sound sensor according to the controlled acquisition condition, wherein each sound sensor of the at least one sound sensor is located in a predetermined position, and wherein the acquisition condition comprises orientation and width of a sound collection region for each sound sensor of the at least one sound sensor.

2. The apparatus according to claim 1, wherein the circuitry is further configured to: detect a body part of each user performing a gesture; and determine a relative position or a relative orientation of at least one portion of each user's body part at a plurality of points during the gesture, wherein the position of each user's mouth is determined as an estimate based on the determined relative position or the determined relative orientation of the at least one portion of each user's body part.

3. The apparatus according to claim 2, wherein the detected body part comprises an arm of each user and the at least one portion of each user's body part comprises one or more of a hand, a forearm, an elbow, and a shoulder of the user.

4. The apparatus according to claim 3, wherein the relative position or the relative orientation of the at least one portion of each user's body part is determined based on the relative position or the relative orientation of another one of the at least one portion of each user's body part.

5. The apparatus according to claim 2, wherein the circuitry is further configured to determine whether the detected body part is on a left side or a right side of each user.

6. The apparatus according to claim 1, wherein the determined position of each user's mouth is set to be a target position of sound collection, such that the orientation of the at least one sound collection region is directed toward each target position.

7. The apparatus according to claim 1, wherein the circuitry is further configured to determine the position of the mouth of each user of a plurality of users distinguishable among the plurality of people.

8. The apparatus according to claim 7, wherein the determined position of each mouth of the plurality of users is set to be a target position of sound collection, such that the orientation of each sound collection region is directed toward one of the plurality of target positions.

9. The apparatus according to claim 8, wherein a number of sound sensors is equal to or greater than a number of the plurality of users.

10. The apparatus according to claim 8, wherein each sound sensor collects sound within the orientation and the width of the sound collection region directed toward one of the plurality of target positions.

11. The apparatus according to claim 10, wherein an estimate of the plurality of target positions is based on a determined relative position or a determined relative orientation of at least one portion of a body part of each user of the plurality of users.

12. The apparatus according to claim 11, wherein the relative position or the relative orientation of the at least one portion of each user's body part is determined using the at least one imaging sensor at a plurality of points during a detected gesture of the user's body part.

13. The apparatus according to claim 12, wherein the determined reliability of the determined position of each user's mouth is based on an amount of data for each target position related to the relative position or the relative orientation of the at least one portion of each user's body part, and the width of a particular sound collection region decreases as the reliability of the estimate of a particular target position of the plurality of target positions increases.

14. The apparatus according to claim 1, wherein the circuitry is further configured to control a display device to display visual information indicating the control of the acquisition condition.

15. The apparatus according to claim 14, wherein the displayed visual information indicating the control of the acquisition condition is based on the determined reliability of the determined position of each user's mouth.

16. The apparatus according to claim 14, wherein a size of the displayed visual information is controlled according to the determined reliability of the determined position of each user's mouth.

17. The apparatus according to claim 1, wherein each imaging sensor of the at least one imaging sensor is located in the predetermined position of a respective sound sensor of the at least one sound sensor.

18. An information processing method, performed via at least one processor, the method comprising: controlling at least one imaging sensor to determine a position of a mouth of each user that is distinguishable among a plurality of people; determining a reliability of the determined position of each user's mouth based on information obtained by the at least one imaging sensor; controlling an acquisition condition for collecting a sound based on the determined position of each user's mouth and the determined reliability of the determined position of each user's mouth; and collecting the sound using at least one sound sensor according to the controlled acquisition condition, wherein each sound sensor of the at least one sound sensor is located in a predetermined position, and wherein the acquisition condition comprises orientation and width of a sound collection region for each sound sensor of the at least one sound sensor.

19. A non-transitory computer-readable medium having embodied thereon a program, which when executed by a computer causes the computer to execute a method, the method comprising: controlling at least one imaging sensor to determine a position of a mouth of each user that is distinguishable among a plurality of people; determining a reliability of the determined position of each user's mouth based on information obtained by the at least one imaging sensor; controlling an acquisition condition for collecting a sound based on the determined position of each user's mouth and the determined reliability of the determined position of each user's mouth; and collecting the sound using at least one sound sensor according to the controlled acquisition condition, wherein each sound sensor of the at least one sound sensor is located in a predetermined position, and wherein the acquisition condition comprises orientation and width of a sound collection region for each sound sensor of the at least one sound sensor.
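As a rough illustration of how claims 1, 6, and 13 fit together, the sketch below computes an acquisition condition (orientation and width of a sound collection region) for one fixed sound sensor from an estimated mouth position and a reliability value. The claims do not give formulas, so everything concrete here is an assumption made for illustration: the 2D coordinates, the linear mapping from amount of data to reliability, the `saturation` threshold, the minimum and maximum widths, and all class and function names.

```python
import math
from dataclasses import dataclass


@dataclass
class MouthEstimate:
    x: float           # estimated mouth position, room coordinates (assumed 2D)
    y: float
    sample_count: int  # amount of gesture data behind the estimate


@dataclass
class AcquisitionCondition:
    orientation_deg: float  # direction of the sound collection region
    width_deg: float        # angular width of the sound collection region


def reliability(sample_count: int, saturation: int = 20) -> float:
    """Map an amount of data to a reliability in [0, 1].

    Hypothetical rule: reliability grows linearly with the number of
    samples and saturates at `saturation` samples.
    """
    return max(0.0, min(1.0, sample_count / saturation))


def acquisition_condition(sensor_xy: tuple[float, float],
                          mouth: MouthEstimate,
                          min_width_deg: float = 10.0,
                          max_width_deg: float = 60.0) -> AcquisitionCondition:
    """Aim the region at the mouth and narrow it as reliability increases."""
    sx, sy = sensor_xy
    # Orientation: angle from the fixed sensor position to the target position.
    orientation = math.degrees(math.atan2(mouth.y - sy, mouth.x - sx))
    # Width: assumed linear interpolation from max (unreliable) to min (reliable).
    r = reliability(mouth.sample_count)
    width = max_width_deg - r * (max_width_deg - min_width_deg)
    return AcquisitionCondition(orientation, width)


if __name__ == "__main__":
    sensor = (0.0, 0.0)
    for n in (0, 10, 20):
        print(n, acquisition_condition(sensor, MouthEstimate(1.0, 1.0, n)))
```

With few observations the region stays wide, so speech is still captured even if the mouth estimate is off; as more gesture data accumulates, the region narrows around the target, which is the behavior claim 13 describes.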

Assignees

Sony Corp

Inventors

Classifications

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

  • with detection of the device orientation or free movement in a three-dimensional [3D] space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors · CPC title

  • Circuits for transducers (arrangements for producing a reverberation or echo sound G10K15/08; amplifiers H03F) · CPC title

  • G06F3/011 (Primary)

    Arrangements for interaction with the human body, e.g. for user immersion in virtual reality (blind teaching G09B21/00) · CPC title

  • Sound input; Sound output (speech processing G10L) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10332519B2 cover?
An apparatus including circuitry configured to determine a position of a mouth of a user that is distinguishable among a plurality of people, and control an acquisition condition for collecting a sound based on the determined position of the user's mouth.
Who is the assignee on this patent?
Sony Corp
What technology area does this patent fall under?
Primary CPC classification G06F3/011. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 25, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).