Who is the assignee on this patent?

Samsung Electronics Co Ltd, Univ Korea Res & Bus Found

What technology area does this patent fall under?

Primary CPC classification G06V40/176. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 30, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and device for detecting voice activity based on image information

US2018247651A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2018247651-A1
Application number	US-201515557378-A
Country	US
Kind code	A1
Filing date	Mar 19, 2015
Priority date	Mar 19, 2015
Publication date	Aug 30, 2018
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value.
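As a rough illustration of what "a feature value of movement of the lips ... based on a difference between pixel values" could look like, the following pure-Python sketch computes a local variance for each pixel of the lip area, counts the pixels whose variance exceeds a threshold, and uses the change in that count between the previous and next frame as the per-frame feature. It loosely follows dependent claims 2 to 4 and is one possible reading, not the applicants' implementation. The function names, the 3x3 neighbourhood, `var_threshold`, and the use of an absolute difference are all assumptions made for this example.

```python
from typing import List, Sequence

# Hypothetical type: rows of grayscale pixel values for the detected lip area.
Gray = Sequence[Sequence[float]]


def local_variance(region: Gray, y: int, x: int) -> float:
    """Variance of a pixel and its in-bounds 3x3 neighbours around their mean."""
    h, w = len(region), len(region[0])
    values = [
        region[j][i]
        for j in range(max(0, y - 1), min(h, y + 2))
        for i in range(max(0, x - 1), min(w, x + 2))
    ]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def count_textured_pixels(region: Gray, var_threshold: float) -> int:
    """Number of pixels whose local variance exceeds var_threshold."""
    return sum(
        1
        for y in range(len(region))
        for x in range(len(region[0]))
        if local_variance(region, y, x) > var_threshold
    )


def movement_features(regions: Sequence[Gray], var_threshold: float) -> List[int]:
    """Per-frame feature: change in the pixel count between previous and next frame.

    The first and last frames lack a two-sided neighbour, so they are given 0.
    Taking the absolute value is an assumption of this sketch.
    """
    counts = [count_textured_pixels(r, var_threshold) for r in regions]
    features = [0] * len(counts)
    for n in range(1, len(counts) - 1):
        features[n] = abs(counts[n + 1] - counts[n - 1])
    return features
```

The intuition is that an opening mouth exposes teeth and a dark cavity, which raises local contrast inside the lip area, so a sudden change in the number of high-variance pixels suggests lip movement.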

First claim

Opening claim text (preview).

1. A method of detecting a voice section, the method comprising: detecting from at least one image an area where lips exist; obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area; and detecting the voice section from the at least one image based on the feature value.

2. The method of claim 1, wherein the obtaining of the feature value comprises: obtaining an average pixel value of a reference pixel and neighboring pixels of the reference pixel; and obtaining the feature value based on a difference between the average value and the reference and neighboring pixels.

3. The method of claim 1, wherein the obtaining of the feature value comprises: obtaining a variance value of each of the pixels included in the detected area based on a difference between a representative value of the each pixel and the neighboring pixels and a pixel value of the neighboring pixels; and obtaining the feature value of the at least one image based on the variance value of each pixel.

4. The method of claim 1, wherein the obtaining of the feature value comprises: obtaining a number of pixels corresponding to the difference being greater than a preset threshold value for the at least one image; and obtaining, as a feature value for the image, a difference in the obtained number of pixels between a previous image preceding the image and a next image following the image.

5. The method of claim 1, wherein the detecting of the voice section comprises: determining a point of the feature value as a start point of the voice section if the feature value becomes greater than a first threshold value; setting a count value to 0 if the feature value becomes less than the first threshold value; increasing the count value over time from a point at which the feature value is less than the first threshold value to a point at which the feature value exists between the first threshold value and a second threshold value; and determining as an end point of the voice section, a point at which the count value is greater than a preset gap.

6. A method of detecting a voice section, the method comprising: detecting from at least one image an area where lips exist; obtaining movement information of each pixel included in the detected area; dividing the detected area in such a way that divided regions are symmetric to each other; obtaining for the at least one image, a feature value of movement of the lips in the detected area based on a difference between movement information of the divided regions; and detecting the voice section based on the feature value for the at least one image.

7. The method of claim 6, wherein the obtaining of the movement information comprises: obtaining the movement information of each pixel by using an optical flow method.

8. A device for detecting a voice section, the device comprising: a receiving unit configured to receive at least one image comprising a user's face; and a control unit configured to detect from at least one image an area where lips exist, to obtain a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, to detect the voice section from the at least one image based on the feature value, and to perform voice recognition based on an audio signal corresponding to the detected voice section.

9. The device of claim 8, wherein the control unit obtains an average pixel value of a reference pixel and neighboring pixels of the reference pixel and obtains the feature value based on a difference between the average value and the reference and neighboring pixels.

10. The device of claim 8, wherein the control unit obtains a variance value of each of the pixels included in the detected area based on a difference between a representative value of the each pixel and the neighboring pixels and a pixel value of the neighboring pixels and obtains the feature value of the at least one image based on the variance value of each pixel.

11. The device of claim 8, wherein the control unit obtains a number of pixels corresponding to the difference being greater than a preset threshold value for the at least one image, and obtains, as a feature value for the image, a difference in the obtained number of pixels between a previous image preceding the image and a next image following the image.

12. The device of claim 8, wherein the control unit determines a point of the feature value as a start point of the voice section if the feature value becomes greater than a first threshold value, sets a count value to 0 if the feature value becomes less than the first threshold value, increases the count value over time from a point at which the feature value is less than the first threshold value to a point at which the feature value exists between the first threshold value and a second threshold value, and determines as an end point of the voice section, a point at which the count value is greater than a preset gap.

13.-14. (canceled)

15. A non-transitory computer-readable recording medium having recorded thereon a program for executing the method according to claim 1.
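The start and end point logic described in claims 5 and 12 can be approximated by a small state machine over the per-frame feature values. The sketch below is a deliberate simplification and an assumption-laden reading of the claim text: it uses only the first threshold and the preset gap, because the exact role of the second threshold is not clear from the claim wording alone. The function name `detect_voice_sections` and its parameters are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def detect_voice_sections(
    features: Sequence[float], first_threshold: float, gap: int
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of detected voice sections.

    start: first frame whose feature exceeds first_threshold.
    end:   frame at which the feature has stayed at or below first_threshold
           for more than `gap` consecutive frames.
    The second threshold from the claims is intentionally not modelled here.
    """
    sections: List[Tuple[int, int]] = []
    start: Optional[int] = None  # start frame of the currently open section
    count = 0                    # consecutive quiet frames inside that section
    for n, value in enumerate(features):
        if value > first_threshold:
            if start is None:
                start = n        # feature rose above the threshold: start point
            count = 0            # lip movement resumed, so restart the counter
        elif start is not None:
            count += 1           # quiet frame while a section is open
            if count > gap:
                sections.append((start, n))  # quiet for longer than the gap
                start = None
                count = 0
    if start is not None:
        # Input ended while a section was still open; close it at the last frame.
        sections.append((start, len(features) - 1))
    return sections
```

Counting quiet frames before closing a section keeps short pauses between syllables from splitting one utterance into several sections; a larger `gap` tolerates longer pauses at the cost of a later end point.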

Assignees

Samsung Electronics Co Ltd, Univ Korea Res & Bus Found

Inventors

Classifications

G06V40/176 · Primary · Dynamic expression
G10L15/25 · Primary · using position of the lips, movement of the lips or face analysis
G06V10/50 · by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
G06V10/467 · Encoded features or binary features, e.g. local binary patterns [LBP]
G06T7/20 · Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51)

Patent family

Related publications grouped by family.

Patent family 56918907
