Sound source separation for robot from target voice direction and noise voice direction

US10665249B2 · US · B2

Patent metadata
Publication number: US-10665249-B2
Application number: US-201815985360-A
Country: US
Kind code: B2
Filing date: May 21, 2018
Priority date: Jun 23, 2017
Publication date: May 26, 2020
Grant date: May 26, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A voice input unit has predetermined directivity for acquiring a voice. A sound source arrival direction estimation unit operating as a first direction detection unit detects a first direction, which is an arrival direction of a signal voice of a predetermined target, from the acquired voice. Moreover, a sound source arrival direction estimation unit operating as a second direction detection unit detects a second direction, which is an arrival direction of a noise voice, from the acquired voice. A sound source separation unit, a sound volume calculation unit, and a detection unit having an S/N ratio calculation unit detect a sound source separation direction or a sound source separation position, based on the first direction and the second direction.
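The abstract mentions a sound volume calculation unit and an S/N ratio calculation unit but gives no formula. As a rough illustration only, the sketch below computes a signal-to-noise ratio in decibels from two blocks of samples, one treated as signal voice and one as noise voice. The function names `mean_power` and `snr_db`, the use of mean squared amplitude as the "volume", and the decibel scale are all assumptions made for this example, not details taken from the patent.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a non-empty block of samples (assumed volume measure)."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal_samples, noise_samples):
    """Signal-to-noise ratio in dB between a signal-voice block and a noise-voice block."""
    signal = mean_power(signal_samples)
    noise = mean_power(noise_samples)
    if noise == 0.0:
        # No measurable noise: treat the ratio as unbounded.
        return math.inf
    if signal == 0.0:
        # No measurable signal: log10(0) is undefined, so report minus infinity.
        return -math.inf
    return 10.0 * math.log10(signal / noise)


if __name__ == "__main__":
    # Hypothetical sample blocks, chosen only to exercise the function.
    print(snr_db([1.0, -1.0, 1.0, -1.0], [0.1, -0.1, 0.1, -0.1]))
```

A detector along these lines could evaluate `snr_db` for several candidate directions or positions and compare the results, which is the kind of comparison the claims describe with a threshold value.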

First claim

Opening claim text (preview).

What is claimed is:

1. A sound source separation information detecting device, comprising: a microphone array having predetermined directivity to acquire a voice; and a processor configured to: detect a first direction, which is an arrival direction of a signal voice of a predetermined target, from the voice acquired by the microphone array; detect a second direction, which is an arrival direction of a noise voice, from the voice acquired by the microphone array; detect a sound source separation direction or a sound source separation position, based on the first direction and the second direction; acquire a lips image of the predetermined target at a timing when the microphone array acquires the voice; determine opening of lips of the predetermined target or closing of the lips thereof, based on the acquired the lips image; consider the voice acquired by the microphone array at the determination of the opening of the lips as the signal voice; consider the voice acquired by the microphone array at the determination of the closing of the lips as the noise voice; acquire a face image; acquire a moving amount of the lips of the predetermined target from the lips image; acquire a rotation amount of a face of the predetermined target from the face image; and determine the opening of the lips of the predetermined target or the closing of the lips thereof, based on the moving amount of the lips and the rotation amount of the face.

2. The sound source separation information detecting device according to claim 1, wherein, in a case where a signal-to-noise ratio calculated from the signal voice and the noise voice is equal to or lower than a threshold value, the processor detects the sound source separation direction or the sound source separation position in which the signal-to-noise ratio exceeds the threshold value, based on the first direction and the second direction.

3. The sound source separation information detecting device according to claim 2, wherein the processor considers a direction in which the signal-to-noise ratio reaches the maximum exceeding the threshold value as the sound source separation direction or considers a position in which the signal-to-noise ratio reaches the maximum exceeding the threshold value as the sound source separation position.

4. The sound source separation information detecting device according to claim 2, wherein the processor considers a current direction as the sound source separation direction or considers the current position as the sound source separation position in the case where the signal-to-noise ratio exceeds the threshold value.

5. The sound source separation information detecting device according to claim 1, wherein the processor determines the opening of the lips or the closing of the lips in the case where the moving amount of the lips in an opening and closing direction out of the moving amount of the lips exceeds a first threshold value, the moving amount of the lips in a stretching direction out of the moving amount of the lips is less than a second threshold value, and the rotation amount of the face is less than a third threshold value.

6. The sound source separation information detecting device according to claim 1, wherein the processor is further configured to: detect the first direction, based on signal voice power of the signal voice, at the determination of the opening of the lips; detect the second direction, based on noise voice power of the noise voice, at the determination of the closing of the lips.

7. The sound source separation information detecting device according to claim 1, wherein the processor is further configured to: notify a message of the predetermined target, the message including a moving direction and a moving distance to the sound source separation position in order to cause the predetermined target to move from the current position to the sound source separation position.

8. The sound source separation information detecting device according to claim 1, wherein the predetermined target is a human or an animal.

9. A robot, comprising: the sound source separation information detecting device according to claim 1; a moving unit configured to move its own device; and an operating unit configured to operate the its own device; wherein the processor is configured to control the sound source separation information detecting device, the moving unit, and the operating unit.

10. The robot according to claim 9, wherein the processor controls the moving unit to cause the its own device to move to the sound source separation position.

11. The robot according to claim 10, wherein the processor controls the operating unit so that the its own device moves to the sound source separation position while making eye contact with the predetermined target or looking toward the predetermined target.

12. The robot according to claim 10, wherein the processor controls the moving unit and the operating unit so that the its own device moves to the sound source separation position by moving slightly or only rotating, instead of moving straightforwardly to the sound source separation position.

13. A sound source separation information detecting method, comprising: detecting a first direction, which is an arrival direction of a signal voice of a predetermined target, from a voice acquired by a microphone array having predetermined directivity to acquire the voice; detecting a second direction, which is an arrival direction of a noise voice, from the voice acquired by the microphone array; detecting a sound source separation direction or a sound source separation position, based on the first direction and the second direction; acquiring a lips image of the predetermined target at a timing when the microphone array acquires the voice; determining opening of lips of the predetermined target or closing of the lips thereof, based on the acquired the lips image; considering the voice acquired by the microphone array at the determination of the opening of the lips as the signal voice; considering the voice acquired by the microphone array at the determination of the closing of the lips as the noise voice; acquiring a face image; acquiring a moving amount of the lips of the predetermined target from the lips image; acquiring a rotation amount of a face of the predetermined target from the face image; and determining the opening of the lips of the predetermined target or the closing of the lips thereof, based on the moving amount of the lips and the rotation amount of the face.

14. A non-transitory computer-readable storage medium having stored thereon a program that is executable by a computer of a sound source separation information detecting device to control the computer to perform functions comprising: detecting a first direction, which is an arrival direction of a signal voice of a predetermined target, from a voice acquired by a microphone array having predetermined directivity to acquire the voice; detecting a second direction, which is an arrival direction of a noise voice, from the voice acquired by the microphone array; detecting a sound source separation direction or a sound source separation position, based on the first direction and the second direction; acquiring a lips image of the predetermined target at a timing when the microphone array acquires the voice; determining opening of lips of the predetermined target or closing of the lips thereof, based on the acquired the lips image; considering the voice acquired by the microphone array at the determination of the opening of the lips as the signal voice; considering the voice
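Claims 2 to 5 describe threshold-based decisions that are easier to follow as code. The sketch below is one possible reading, not the patented implementation. `lips_open` applies the three thresholds of claim 5 and, as an assumption, treats every other case as closed lips. `choose_separation` keeps the current direction or position when its signal-to-noise ratio already exceeds the threshold (claim 4), and otherwise picks the candidate with the highest ratio above the threshold (claims 2 and 3). All names, units, and the idea of passing precomputed candidate ratios in a dictionary are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class LipThresholds:
    open_close_min: float  # "first threshold value" in claim 5
    stretch_max: float     # "second threshold value" in claim 5
    rotation_max: float    # "third threshold value" in claim 5


def lips_open(open_close_move: float, stretch_move: float,
              face_rotation: float, th: LipThresholds) -> bool:
    """True when all three claim-5 conditions hold.

    Assumption: any frame that fails a condition is treated as closed lips,
    so its audio would be labelled noise voice rather than signal voice.
    """
    return (open_close_move > th.open_close_min
            and stretch_move < th.stretch_max
            and face_rotation < th.rotation_max)


def choose_separation(current_snr: float,
                      candidate_snrs: Mapping[str, float],
                      threshold: float,
                      current: str = "current") -> Optional[str]:
    """Pick a separation direction or position label.

    candidate_snrs maps hypothetical candidate labels to estimated S/N ratios.
    Returns None when no candidate exceeds the threshold.
    """
    if current_snr > threshold:
        # Claim 4: the current direction or position is already good enough.
        return current
    # Claims 2 and 3: search for the maximum ratio that exceeds the threshold.
    above = {k: v for k, v in candidate_snrs.items() if v > threshold}
    if not above:
        return None
    return max(above, key=above.get)


if __name__ == "__main__":
    th = LipThresholds(open_close_min=2.0, stretch_max=1.0, rotation_max=5.0)
    print(lips_open(3.0, 0.5, 1.0, th))
    print(choose_separation(8.0, {"left": 9.0, "right": 14.0}, 10.0))
```

The face rotation check reflects the claim's apparent intent: lip movement seen while the face is turning may be an artifact of the changing viewpoint rather than speech.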

Assignees

Casio Computer Co Ltd

Inventors

Classifications

  • G01H17/00 · Primary

    Measuring mechanical vibrations or ultrasonic, sonic or infrasonic waves, not provided for in the other groups of this subclass · CPC title

  • Noise filtering · CPC title

  • Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L21/02 takes precedence) · CPC title

  • Sensing devices · CPC title

  • Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators (safety-devices in general F16P; protection against radiation in general G21F) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10665249B2 cover?
A voice input unit has predetermined directivity for acquiring a voice. A sound source arrival direction estimation unit operating as a first direction detection unit detects a first direction, which is an arrival direction of a signal voice of a predetermined target, from the acquired voice. Moreover, a sound source arrival direction estimation unit operating as a second direction detection un…
Who is the assignee on this patent?
Casio Computer Co Ltd
What technology area does this patent fall under?
Primary CPC classification G01H17/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 26, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).