Data processing apparatus, imaging apparatus and data processing method
US-2017295355-A1 · Oct 12, 2017 · US
US9965865B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9965865-B1 |
| Application number | US-201715473334-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 29, 2017 |
| Priority date | Mar 29, 2017 |
| Publication date | May 8, 2018 |
| Grant date | May 8, 2018 |
Official abstract text for this publication.
Devices and techniques are generally described for segmentation of image data using depth data. In various examples, color image data may be received from a digital camera. In some examples, depth image data may be received from a depth sensor. In various examples, the depth image data may be separated into a plurality of clusters of depth image data, wherein each cluster is associated with a respective range of depth values. In some further examples, a determination may be made that a first cluster of image data corresponds to an object of interest, such as a human subject, in the image data. In various examples, pixels of the first cluster may be encoded with foreground indicator data. In some further examples, segmented image data may be generated. The segmented image data may comprise pixels encoded with the foreground indicator data.
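The clustering and foreground-encoding steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the fixed-width depth bins, the `seed` pixel (standing in for a pixel already known to lie on the subject, for example from a face detector), and the names `cluster_by_depth` and `segment` are all assumptions made for illustration. Pixels with no depth reading are represented as `None` and left as background.

```python
from collections import defaultdict

FOREGROUND = 1  # illustrative foreground indicator value (assumption)
BACKGROUND = 0  # illustrative background indicator value (assumption)


def cluster_by_depth(depth, bin_width):
    """Group pixel coordinates into clusters, one per depth range.

    Fixed-width bins are an assumption; the abstract only says each
    cluster is associated with a respective range of depth values.
    """
    clusters = defaultdict(list)
    for r, row in enumerate(depth):
        for c, d in enumerate(row):
            if d is None:  # no depth reading for this pixel
                continue
            clusters[int(d // bin_width)].append((r, c))
    return dict(clusters)


def segment(depth, bin_width, seed):
    """Mark every pixel of the cluster containing `seed` as foreground."""
    clusters = cluster_by_depth(depth, bin_width)
    mask = [[BACKGROUND] * len(row) for row in depth]
    for pixels in clusters.values():
        if seed in pixels:
            for r, c in pixels:
                mask[r][c] = FOREGROUND
    return mask


# Toy depth map in metres: a near subject (about 1.1 to 1.4 m)
# in front of a far wall (about 3.7 to 4.0 m).
depth = [
    [3.9, 3.8, 1.2, 1.3],
    [3.9, 1.1, 1.2, None],
    [4.0, 1.2, 1.4, 3.7],
]
mask = segment(depth, bin_width=1.0, seed=(1, 1))
for row in mask:
    print(row)
```

In this toy input the pixel with missing depth stays background; the claims describe separate color-based steps for classifying such pixels, which this sketch does not attempt.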
Opening claim text (preview).
What is claimed is:

1. A method for segmenting human image data from background image data, the method comprising: generating color image data representing a human in an environment using a digital camera; generating depth data representing the human in the environment using a depth sensor; separating the depth data into a plurality of clusters of depth data, wherein each cluster of the plurality of clusters is associated with a respective range of depth values; determining a first cluster from the plurality of clusters, wherein the first cluster comprises depth data at least partially corresponding to the human; encoding pixels in the first cluster with foreground indicator data to classify the pixels in the first cluster as foreground; determining a first average three-dimensional position of the pixels of the first cluster; determining a second average three-dimensional position of pixels of a second cluster from the plurality of clusters; determining that the first average three-dimensional position corresponds more closely to the second average three-dimensional position relative to other clusters of the plurality of clusters; determining that the second cluster at least partially corresponds to the human; encoding the pixels of the second cluster with the foreground indicator data to classify the pixels of the second cluster as foreground; associating a first pixel of the depth data with a second pixel of the color image data; identifying a first portion of the color image data for which no corresponding depth information is available in the depth data; determining a first average color value of the first portion of the color image data; determining a second average color value of a second portion of pixels, wherein the second portion of pixels correspond to pixels of the first cluster in the depth data; determining that the first average color value corresponds more closely to the second average color value relative to other portions of the color image data; determining that the first portion of the color image data at least partially corresponds to the human; encoding pixels of the first portion of the color image data for which no depth information is available with the foreground indicator data to classify the pixels of the first portion as foreground; and generating a segmentation mask, wherein the segmentation mask comprises one or more first pixels classified as foreground and one or more second pixels classified as background.

2. The method of claim 1, further comprising: identifying floor image data in the depth data using a RANSAC algorithm, wherein the floor image data represents a floor of the environment on which the human is standing; and separating the depth data, excluding pixels corresponding to the floor image data, into the plurality of clusters of depth data.

3. The method of claim 1, further comprising: determining a first region of the color image data corresponding to a face of the human; determining a region of interest in the color image data, wherein the region of interest comprises a band of image data around the first region; identifying a third pixel of the color image within the region of interest, wherein no corresponding depth information is available for the third pixel in the depth data; determining a distance, in terms of a number of pixels, between the third pixel and a closest pixel encoded with foreground indicator data; determining a probability that the third pixel corresponds to hair of the human based on the number of pixels between the third pixel and the closest pixel and a weight parameter; and classifying the third pixel as foreground based on the probability being greater than 0.8 by encoding the third pixel with the foreground indicator data.

4. An image segmentation method comprising: receiving color image data; receiving depth image data; separating the depth image data into a plurality of clusters of depth image data, wherein each cluster is associated with a respective range of depth values; determining that a first cluster of depth image data corresponds to an object of interest; encoding pixels of the first cluster with foreground indicator data; associating a first pixel of the depth image data with a corresponding second pixel of the color image data; determining that a third pixel of the depth image data corresponds to the object of interest based at least in part on the color image data; encoding the third pixel of the depth image data with the foreground indicator data; and generating first image data, wherein the first image data comprises a first set of pixels of the color image data encoded with the foreground indicator data and a second set of pixels of the color image data encoded with background indicator data.

5. The method of claim 4, further comprising: identifying a second cluster of the plurality of clusters, wherein the second cluster of depth image data has an average pixel depth value indicating an average distance between a portion of an environment represented by the second cluster and a depth sensor; determining that the average distance exceeds a threshold distance; and encoding pixels of the second cluster with the background indicator data.

6. The method of claim 4, further comprising: detecting second image data in the color image data, wherein the second image data represents a face; and determining that the second image data in the color image data corresponds to a portion of pixels of the first cluster in the depth image data, wherein determining that the first cluster of depth image data corresponds to the object of interest is based at least in part on determining that the second image data in the color image data corresponds to the portion of the pixels of the first cluster.

7. The method of claim 4, further comprising: determining a first average depth value of the pixels of the first cluster; identifying a second cluster of the plurality of clusters; determining a second average depth value of pixels of the second cluster; comparing the first average depth value to the second average depth value; and encoding the pixels of the second cluster with foreground indicator data based at least in part on a level of correspondence between the first average depth value and the second average depth value.

8. The method of claim 4, further comprising: identifying a second cluster of the plurality of clusters, wherein no depth data is associated with pixels of the second cluster; separating the second cluster into one or more blocks of pixels; determining a feature value of a first block of the one or more blocks of pixels; comparing the feature value of the first block of the one or more blocks of pixels to a corresponding feature value of the pixels of the first cluster; and encoding pixels of the first block with the foreground indicator data based at least in part upon a level of correspondence between the feature value of the first block and the corresponding feature value of the pixels of the first cluster.

9. The method of claim 4, further comprising: determining a first region of the color image data, wherein the color image data of the first region represents a face; determining a second region of interest in the color image data, wherein the second region of interest surrounds the first region; identifying a fourth pixel of the color image within the second region of interest, wherein no corresponding depth information is available for the fourth pixel in the depth data; determining a distance, in terms of a number of pixels, between the fourth pixel and a closest fifth pixel encoded with foreground indicator data; determining that the fourth pixel is a foreg
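
Claim 1 extends the foreground from a first cluster to the cluster whose average three-dimensional position corresponds most closely to it. The following is a rough sketch of one way to read that step, not the claimed implementation: it assumes Euclidean distance between cluster centroids as the measure of correspondence, and the cluster labels, coordinates, and function names (`centroid`, `nearest_cluster`) are hypothetical.

```python
import math


def centroid(points):
    """Average three-dimensional position of a cluster's points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def nearest_cluster(clusters, first_id):
    """Return the id of the cluster whose centroid is closest to the
    centroid of `first_id` (Euclidean distance is an assumption)."""
    ref = centroid(clusters[first_id])
    best_id, best_dist = None, math.inf
    for cid, pts in clusters.items():
        if cid == first_id:
            continue
        dist = math.dist(ref, centroid(pts))
        if dist < best_dist:
            best_id, best_dist = cid, dist
    return best_id


# Hypothetical clusters of (x, y, z) points in metres.
clusters = {
    "torso": [(0.0, 1.0, 2.0), (0.2, 1.2, 2.0)],
    "arm": [(0.5, 1.0, 1.8), (0.7, 1.0, 1.8)],
    "wall": [(0.0, 1.0, 5.0), (1.0, 1.0, 5.0)],
}

# Start with the first cluster as foreground, then add its nearest neighbour.
foreground = {"torso"}
foreground.add(nearest_cluster(clusters, "torso"))
print(sorted(foreground))
```

Here the arm cluster is merged into the foreground because its centroid lies far closer to the torso centroid than the wall's does. A practical system would likely also bound the allowed distance, in the spirit of the threshold in claim 5, so that a lone distant cluster is not merged by default.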
Organisation of the process, e.g. bagging or boosting · CPC title
using clustering, e.g. of similar faces in social networks · CPC title
using classification, e.g. of video objects · CPC title
Clustering techniques · CPC title
Classification techniques · CPC title