Method and device for detecting voice activity based on image information

US2018247651A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2018247651-A1
Application number: US-201515557378-A
Country: US
Kind code: A1
Filing date: Mar 19, 2015
Priority date: Mar 19, 2015
Publication date: Aug 30, 2018
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value.
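The abstract leaves open how "a difference between pixel values" becomes a lip-movement feature. The sketch below is one possible reading, borrowing from dependent claims 3 and 4 on this page: compute a local variance around each pixel of the lip region, count the pixels whose variance exceeds a preset threshold, and use the change in that count between the previous and the next image as the feature. The 3x3 neighborhood, the absolute value of the count difference, and all function and parameter names (`local_variance`, `count_textured_pixels`, `movement_features`, `variance_threshold`) are illustrative assumptions, not taken from the patent. Lip-area detection is assumed to have been done already.

```python
from typing import List, Sequence

# A frame is the already-cropped grayscale lip region: rows of pixel values.
Frame = Sequence[Sequence[float]]


def local_variance(frame: Frame, r: int, c: int) -> float:
    """Variance of the 3x3 neighborhood centred on (r, c), clipped at borders.

    The neighborhood mean plays the role of the "representative value"
    (assumption: the patent page does not fix the neighborhood size).
    """
    rows, cols = len(frame), len(frame[0])
    values = [
        frame[i][j]
        for i in range(max(0, r - 1), min(rows, r + 2))
        for j in range(max(0, c - 1), min(cols, c + 2))
    ]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def count_textured_pixels(frame: Frame, variance_threshold: float) -> int:
    """Number of pixels whose local variance exceeds the preset threshold."""
    rows, cols = len(frame), len(frame[0])
    return sum(
        1
        for r in range(rows)
        for c in range(cols)
        if local_variance(frame, r, c) > variance_threshold
    )


def movement_features(
    frames: Sequence[Frame], variance_threshold: float
) -> List[int]:
    """Feature for each interior frame n: |count(n + 1) - count(n - 1)|.

    The first and last frames have no previous/next neighbor, so the
    returned list is two entries shorter than `frames`.
    """
    counts = [count_textured_pixels(f, variance_threshold) for f in frames]
    return [abs(counts[n + 1] - counts[n - 1]) for n in range(1, len(counts) - 1)]
```

A flat lip region (closed mouth, uniform brightness) yields a count of zero, while a region containing a dark band (open mouth) yields a high count, so the feature spikes on frames where the mouth opens or closes.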

First claim

Opening claim text (preview).

1. A method of detecting a voice section, the method comprising: detecting from at least one image an area where lips exist; obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area; and detecting the voice section from the at least one image based on the feature value.
2. The method of claim 1, wherein the obtaining of the feature value comprises: obtaining an average pixel value of a reference pixel and neighboring pixels of the reference pixel; and obtaining the feature value based on a difference between the average value and the reference and neighboring pixels.
3. The method of claim 1, wherein the obtaining of the feature value comprises: obtaining a variance value of each of the pixels included in the detected area based on a difference between a representative value of the each pixel and the neighboring pixels and a pixel value of the neighboring pixels; and obtaining the feature value of the at least one image based on the variance value of each pixel.
4. The method of claim 1, wherein the obtaining of the feature value comprises: obtaining a number of pixels corresponding to the difference being greater than a preset threshold value for the at least one image; and obtaining, as a feature value for the image, a difference in the obtained number of pixels between a previous image preceding the image and a next image following the image.
5. The method of claim 1, wherein the detecting of the voice section comprises: determining a point of the feature value as a start point of the voice section if the feature value becomes greater than a first threshold value; setting a count value to 0 if the feature value becomes less than the first threshold value; increasing the count value over time from a point at which the feature value is less than the first threshold value to a point at which the feature value exists between the first threshold value and a second threshold value; and determining as an end point of the voice section, a point at which the count value is greater than a preset gap.
6. A method of detecting a voice section, the method comprising: detecting from at least one image an area where lips exist; obtaining movement information of each pixel included in the detected area; dividing the detected area in such a way that divided regions are symmetric to each other; obtaining for the at least one image, a feature value of movement of the lips in the detected area based on a difference between movement information of the divided regions; and detecting the voice section based on the feature value for the at least one image.
7. The method of claim 6, wherein the obtaining of the movement information comprises: obtaining the movement information of each pixel by using an optical flow method.
8. A device for detecting a voice section, the device comprising: a receiving unit configured to receive at least one image comprising a user's face; and a control unit configured to detect from at least one image an area where lips exist, to obtain a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, to detect the voice section from the at least one image based on the feature value, and to perform voice recognition based on an audio signal corresponding to the detected voice section.
9. The device of claim 8, wherein the control unit obtains an average pixel value of a reference pixel and neighboring pixels of the reference pixel and obtains the feature value based on a difference between the average value and the reference and neighboring pixels.
10. The device of claim 8, wherein the control unit obtains a variance value of each of the pixels included in the detected area based on a difference between a representative value of the each pixel and the neighboring pixels and a pixel value of the neighboring pixels and obtains the feature value of the at least one image based on the variance value of each pixel.
11. The device of claim 8, wherein the control unit obtains a number of pixels corresponding to the difference being greater than a preset threshold value for the at least one image, and obtains, as a feature value for the image, a difference in the obtained number of pixels between a previous image preceding the image and a next image following the image.
12. The device of claim 8, wherein the control unit determines a point of the feature value as a start point of the voice section if the feature value becomes greater than a first threshold value, sets a count value to 0 if the feature value becomes less than the first threshold value, increases the count value over time from a point at which the feature value is less than the first threshold value to a point at which the feature value exists between the first threshold value and a second threshold value, and determines as an end point of the voice section, a point at which the count value is greater than a preset gap.
13-14. (canceled)
15. A non-transitory computer-readable recording medium having recorded thereon a program for executing the method according to claim 1.
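The start-point and end-point logic in claims 5 and 12 can be pictured as a small state machine over the per-image feature values. The sketch below is a simplified reading, not the patented procedure: a section starts when the feature rises above the first threshold, a counter is reset to 0 and then incremented for each image in which the feature stays at or below that threshold, and the section ends at the image where the counter exceeds the preset gap. The role of the second threshold is not clear from the claim wording alone, so it is deliberately left out. The function name `detect_voice_sections`, its parameters, and the choice to close a still-open section at the last image are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def detect_voice_sections(
    features: Sequence[float], first_threshold: float, gap: int
) -> List[Tuple[int, int]]:
    """Return (start_index, end_index) pairs of detected voice sections.

    Simplified reading of claims 5 and 12; the second threshold is omitted.
    """
    sections: List[Tuple[int, int]] = []
    start: Optional[int] = None  # index where the current section began
    count = 0  # consecutive low-feature images since the last high one

    for i, value in enumerate(features):
        if value > first_threshold:
            if start is None:
                start = i  # feature rose above the threshold: start point
            count = 0  # lip movement resumed, so restart the silence count
        elif start is not None:
            count += 1  # feature is low while inside a section
            if count > gap:
                sections.append((start, i))  # silence outlasted the gap
                start = None
                count = 0

    if start is not None:
        # Assumption: a section still open at the end closes at the last image.
        sections.append((start, len(features) - 1))
    return sections
```

The gap counter keeps short pauses between syllables from splitting one utterance into several sections: a single low-feature image followed by renewed movement resets the counter instead of ending the section.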

Assignees

Samsung Electronics Co Ltd, Univ Korea Res & Bus Found

Inventors

Classifications

  • G06V40/176 · Primary

    Dynamic expression · CPC title

  • G10L15/25 · Primary

    using position of the lips, movement of the lips or face analysis · CPC title

  • by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis · CPC title

  • Encoded features or binary features, e.g. local binary patterns [LBP] · CPC title

  • Analysis of motion (motion estimation for coding, decoding, compressing or decompressing digital video signals H04N19/43, H04N19/51) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018247651A1 cover?
Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value.
Who is the assignee on this patent?
Samsung Electronics Co Ltd, Univ Korea Res & Bus Found
What technology area does this patent fall under?
Primary CPC classification G06V40/176. Mapped technology areas include Physics.
When was this patent published?
Published Aug 30, 2018 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).