System for Beauty, Cosmetic, and Fashion Analysis
US-2017076474-A1 · Mar 16, 2017 · US
US9773196B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9773196-B2 |
| Application number | US-201615005855-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 25, 2016 |
| Priority date | Jan 25, 2016 |
| Publication date | Sep 26, 2017 |
| Grant date | Sep 26, 2017 |
Official abstract text for this publication.
Systems and methods are disclosed for segregating target individuals represented in a probe digital image from background pixels in the probe digital image. In particular, in one or more embodiments, the disclosed systems and methods train a neural network based on two or more of training position channels, training shape input channels, training color channels, or training object data. Moreover, in one or more embodiments, the disclosed systems and methods utilize the trained neural network to select a target individual in a probe digital image. Specifically, in one or more embodiments, the disclosed systems and methods generate position channels, training shape input channels, and color channels corresponding to the probe digital image, and utilize the generated channels in conjunction with the trained neural network to select the target individual.
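The abstract describes feeding color, position, and shape channels to the trained network together. A minimal NumPy sketch of how such channels might be combined into a single per-pixel input array follows. The function name `stack_input_channels`, the channel order (color, x-position, y-position, shape), and the `float32` dtype are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def stack_input_channels(rgb, x_pos, y_pos, shape_mask):
    """Concatenate color, position, and shape channels into one H x W x 6 array.

    Hypothetical helper: the channel order and dtype are illustrative choices,
    not taken from the patent text.
    """
    rgb = np.asarray(rgb, dtype=np.float32)
    height, width = rgb.shape[:2]
    extras = [np.asarray(c, dtype=np.float32) for c in (x_pos, y_pos, shape_mask)]
    for channel in extras:
        if channel.shape != (height, width):
            raise ValueError("every channel must match the image height and width")
    # Give each single-channel map a trailing axis, then join along the channel axis.
    return np.concatenate([rgb] + [c[..., None] for c in extras], axis=-1)
```

Given an `H x W x 3` color image and three `H x W` maps, the result has shape `(H, W, 6)`, which is the kind of multi-channel input a convolutional network could consume.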
Opening claim text (preview).
We claim:

1. In a digital medium environment for editing digital visual media, a method of using deep learning to automatically select individuals portrayed in the digital visual media, the method comprising: training, by at least one processor, a neural network utilizing training input generated from a repository of digital training images; generating, by the at least one processor, with regard to a probe digital image portraying a target individual, a position channel that indicates positions of pixels in the probe digital image relative to the target individual portrayed in the probe digital image by determining a transform between one or more feature points of the target individual and a canonical pose; and identifying, by the at least one processor, a set of pixels representing the target individual in the probe digital image utilizing the trained neural network and the position channel.

2. The method of claim 1, wherein: the training input comprises a plurality of training position channels, a plurality of training shape input channels, and a plurality of training color channels corresponding to the digital training images, wherein each training color channel in the plurality of training color channels reflect colors of pixels in a corresponding digital training image; the method further comprises generating a color channel that reflects colors of pixels in the digital training image; and the method identifies the set of pixels representing the target individual by utilizing the color channel.

3. The method of claim 2, further comprising generating the training input by: identifying a training target individual portrayed in each digital training image; generating a training position channel for each digital training image that indicates positions of pixels in the digital training image relative to the identified training target individual portrayed in the training digital image; and generating a training shape input channel for each digital training image that comprises an estimated shape of the identified target individual portrayed in the digital training image.

4. The method of claim 1, wherein generating the position channel further comprises: generating an x-position channel that indicates horizontal positions of pixels in the probe digital image relative to a face of the target individual portrayed in the probe digital image; and generating a y-position channel that indicates vertical positions of pixels in the probe digital image relative to the face of the target individual portrayed in the probe digital image.

5. The method of claim 1, wherein generating the position channel further comprises: detecting one or more facial feature points corresponding to a face of the target individual portrayed in the probe digital image; estimating the transform between the detected one or more facial feature points and the canonical pose, wherein the canonical pose comprises template facial features; and applying the transform to the canonical pose to generate the position channel.

6. The method of claim 5, wherein the position channel expresses the position of pixels in the canonical pose in a coordinate system that is centered on the face and scaled according to a size of the face.

7. The method of claim 1, further comprising: generating a shape input channel that comprises an estimated shape of the target individual by: generating a mean digital object mask based on target individuals in a plurality of digital images, the mean digital object mask comprising a shape corresponding to the target individuals in the plurality of digital images; and utilizing the mean digital object mask to generate the shape input channel; and identifying the set of pixels representing the target individual in the probe digital image utilizing the trained neural network, the position channel, and the shape input channel.

8. The method of claim 7, wherein generating the mean digital object mask comprises: identifying a set of pixels representing a target individual in a digital image from the plurality of digital images; identifying one or more facial feature points corresponding to the target individual in the digital image; estimating a first transform between the facial feature points and a canonical pose; and applying the first transform to the set of pixels representing the target individual in the digital image.

9. The method of claim 8, wherein utilizing the mean digital object mask to generate the shape input channel comprises: detecting one or more facial feature points corresponding to the target individual portrayed in the probe digital image; estimating a second transform based on the detected one or more facial feature points corresponding to the target individual portrayed in the probe digital image and the canonical pose; and applying the second transform to the mean digital object mask to generate the shape input channel.

10. The method of claim 1, further comprising modifying the probe digital image based on the set of pixels representing the target individual in the probe digital image by applying one or more of: a first image filter to the set of pixels representing the target individual in the probe digital image; or a second image filter to other pixels in the probe digital image.

11. The method of claim 1, wherein identifying, by the at least one processor, the set of pixels representing the target individual in the probe digital image further comprises: upon receiving a request to select the target individual in the probe digital image, automatically identifying the set of pixels representing the target individual in the probe digital image without additional user input.

12. In a digital medium environment for editing digital visual media, a method of using deep learning to automatically select individuals portrayed in the digital visual media, the method comprising: accessing: a trained neural network generated from a repository of digital training images, wherein each of the digital training images portrays a training target individual, and a mean digital object mask reflecting a shape based on each of the training target individuals portrayed in the digital training images; generating, with regard to a probe digital image by at least one processor and utilizing the mean digital object mask, a shape input channel comprising an estimated shape of a target individual based on the mean digital object mask by estimating a transform between one or more facial feature points corresponding to the target individual portrayed in the probe digital image and a canonical pose; and identifying, by the at least one processor, a set of pixels representing the target individual in the probe digital image utilizing the trained neural network and the generated shape input channel.

13. The method of claim 12, wherein utilizing the mean digital object mask to generate the shape input channel comprises: detecting the one or more facial feature points corresponding to the target individual portrayed in the probe digital image; estimating the transform based on the detected one or more facial feature points corresponding to the target individual portrayed in the probe digital image and the canonical pose, wherein the canonical pose comprises template facial features; and applying the transform to the mean digital object mask to generate the shape input channel.

14. The method of claim 12, further comprising generating a position channel, wherein the position channel indicates a position of pixels in the probe digital image relative to a face of the target individual portrayed in the probe dig
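
Claims 4 to 6 describe x- and y-position channels obtained by estimating a transform between detected facial feature points and a canonical pose. The sketch below illustrates one way this could work, assuming a 2D similarity transform (scale, rotation, translation) fitted by least squares. The claims preview does not name the transform family, so that choice is an assumption, as are the helper names `fit_similarity` and `position_channels` and the idea of inverting the fitted transform to express each probe pixel in canonical coordinates. Facial feature detection is assumed to happen elsewhere.

```python
import numpy as np

def fit_similarity(canonical_pts, detected_pts):
    """Fit complex a, b so that detected ~= a * canonical + b (least squares).

    Points are (x, y) pairs treated as complex numbers x + 1j*y, so a single
    complex multiplier `a` encodes both scale and rotation. Needs >= 2 points.
    """
    p = np.asarray(canonical_pts, dtype=float)
    q = np.asarray(detected_pts, dtype=float)
    pc = p[:, 0] + 1j * p[:, 1]
    qc = q[:, 0] + 1j * q[:, 1]
    design = np.stack([pc, np.ones_like(pc)], axis=1)
    (a, b), *_ = np.linalg.lstsq(design, qc, rcond=None)
    return a, b

def position_channels(height, width, a, b):
    """Return (x_channel, y_channel): each pixel's canonical-frame coordinates.

    Assumption: inverting the fitted transform maps image pixels into the
    canonical pose's coordinate system.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    z = (xs + 1j * ys - b) / a
    return z.real, z.imag

# Hypothetical template: two eyes and a mouth, centered at the origin.
canonical = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# Hypothetical detections: the same layout, 10x larger, shifted to (50, 40).
detected = [(40.0, 40.0), (60.0, 40.0), (50.0, 50.0)]
a, b = fit_similarity(canonical, detected)
x_chan, y_chan = position_channels(100, 100, a, b)
```

If the canonical template is centered on the face and has unit face size, as in this hypothetical example, the resulting channels are face-centered and face-scaled in the sense of claim 6: the pixel at the face center maps to (0, 0), and a pixel one face-unit to the right maps to x = 1.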
Combinations of networks · CPC title
Learning methods · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Physics · mapped topic