Recognizing combinations of body shape, pose, and clothing in three-dimensional input images

US2018181802A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2018181802-A1
Application number: US-201615392597-A
Country: US
Kind code: A1
Filing date: Dec 28, 2016
Priority date: Dec 28, 2016
Publication date: Jun 28, 2018
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Certain embodiments involve recognizing combinations of body shape, pose, and clothing in three-dimensional input images. For example, synthetic training images are generated based on user inputs. These synthetic training images depict different training figures with respective combinations of a body pose, a body shape, and a clothing item. A machine learning algorithm is trained to recognize the pose-shape-clothing combinations in the synthetic training images and to generate feature descriptors describing the pose-shape-clothing combinations. The trained machine learning algorithm is outputted for use by an image manipulation application. In one example, an image manipulation application uses a feature descriptor, which is generated by the machine learning algorithm, to match an input figure in an input image to an example image based on a correspondence between a pose-shape-clothing combination of the input figure and a pose-shape-clothing combination of an example figure in the example image.
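
Illustrative sketch (not from the patent text): the matching step in the abstract can be pictured as a nearest-neighbor lookup over feature descriptors. The abstract does not say how correspondence is measured, so the Euclidean distance, the `match_example` helper, the file names, and the four-number descriptors below are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Tuple


def euclidean(a: List[float], b: List[float]) -> float:
    """Straight-line distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_example(query: List[float],
                  examples: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the example whose descriptor is closest to the query.

    Assumption: a smaller distance means a closer pose-shape-clothing
    correspondence. The patent abstract does not fix a metric.
    """
    best_name, best_dist = "", math.inf
    for name, descriptor in examples.items():
        dist = euclidean(query, descriptor)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Hypothetical descriptors; a trained model would produce these.
examples = {
    "example_standing_coat.png": [0.9, 0.1, 0.2, 0.8],
    "example_sitting_tshirt.png": [0.1, 0.9, 0.7, 0.1],
}
query = [0.8, 0.2, 0.3, 0.7]  # descriptor of the input figure

best_name, best_dist = match_example(query, examples)
print(best_name)
```

In a real system the example descriptors would likely be stored in a database and searched with an index rather than a linear scan; the linear scan here only keeps the idea visible.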

First claim

Opening claim text (preview).

1. A method for training machine learning algorithms to match input images to example images based on combinations of body shape, body pose, and clothing item, the method comprising: generating, by a processing device and based on user inputs, synthetic training images having known combinations of a training body pose, a training clothing item, and a training body shape; training, by the processing device, a machine learning algorithm to generate training feature vectors describing the known combinations of the training body pose, the training clothing item, and the training body shape, wherein training the machine learning algorithm comprises: selecting a set of the synthetic training images with known body pose variations and known body shape variations, generating input depth maps based on clothing depicted in the selected set of synthetic training images, accessing a neural network for encoding the input depth maps into the training feature vectors and decoding output depth maps from the training feature vectors, and iteratively adjusting the neural network such that differences between the input depth maps and the output depth maps are minimized, wherein adjusting the neural network modifies (i) a first set of dimensions representing body pose in the training feature vectors encoded by the adjusted neural network and (ii) a second set of dimensions representing body shape in the training feature vectors encoded by the adjusted neural network; and outputting, by the processing device, the trained machine learning algorithm for matching (i) an input image having an input body shape and input body pose to (ii) an example image having a known combination of example body shape and example body pose.

2. The method of claim 1, further comprising: accessing the input image from a computer-readable medium; computing a feature vector from the accessed input image, the feature vector having a first portion that describes the input body pose, a second portion that describes the input body shape, and a third portion that describes an input clothing item in the input image; generating a query from the feature vector; and selecting the example image from a database using the query.

3. The method of claim 2, wherein the input image comprises a three-dimensional scan of a person, wherein a figure in the input image having the input body pose, the input body shape, and the input clothing item corresponds to the person.

4. The method of claim 2, further comprising: accessing example data describing the example body shape and the example body pose of the example figure; identifying, based on the example data, a difference between at least one of: (i) the input body shape and the example body shape; or (ii) the input body pose and the example body pose; modifying, based on the identified difference, data describing at least one of the input body shape or the input body pose, wherein the modified data is usable for at least one of animating the input figure or modifying clothing of the input figure.

5. The method of claim 1, wherein generating the synthetic training images comprises, for each synthetic training image: accessing, with a graphical application, a deformable body model from a computer-readable medium; modifying, with the graphical application, a body pose of the deformable body model and a body shape of the deformable body model; accessing, with the graphical application, graphics data depicting clothing items; applying, with the graphical application, at least some of the graphics data depicting the clothing items to the deformable body model with the modified body pose and the modified body shape; and storing respective training data for the synthetic training image describing the deformable body model with the modified body pose, the modified body shape, and the applied graphics data, wherein the respective training data indicates the respective training combination of the training body pose, the training clothing item, and the training body shape for the synthetic training image.

6. The method of claim 1, wherein training the machine learning algorithm further comprises: providing the input depth maps to the machine learning algorithm; encoding, based on the neural network, the input depth maps as feature vectors; decoding the feature vectors into synthesized depth maps; modifying a structure of the neural network based on differences between the synthesized depth maps and the input depth maps.

7. The method of claim 1, wherein training the machine learning algorithm comprises: selecting viewpoint training images, shape training images, pose training images, and clothing training images, wherein: the viewpoint training images are a first subset of the synthetic training images and include (i) a fixed combination of pose, shape, and clothing and (ii) respective viewpoint variations, the shape training images are a second subset of the synthetic training images and include (i) a fixed combination of viewpoint, shape, and clothing and (ii) respective body shape variations, the pose training images are a third subset of the synthetic training images and include (i) a fixed combination of viewpoint, shape, and clothing and (ii) respective body pose variations, and the clothing training images are a fourth subset of the synthetic training images and include (i) a fixed combination of viewpoint, pose, and shape and (ii) respective clothing variations; modifying, based on depth maps of the viewpoint training images, a structure of the neural network such that the modified structure of the neural network encodes a viewpoint in a first feature vector portion; modifying, based on depth maps of the shape training images, the structure of the neural network such that the modified structure of the neural network encodes a body shape in a second feature vector portion; modifying, based on depth maps of the pose training images, the structure of the neural network such that the modified structure of the neural network encodes a body pose in a third feature vector portion; and modifying, based on surface normal maps of the clothing training images, the structure of the neural network such that the modified structure of the neural network encodes clothing in a fourth feature vector portion.

8. A system comprising: a non-transitory computer-readable medium storing synthetic training images having known combinations of a training body pose, a training clothing item, and a training body shape; means for selecting a set of the synthetic training images with known body pose variations and known body shape variations; means for encoding, via a neural network, input depth maps generated from the selected set of synthetic training images into training feature vectors and decoding output depth maps from the training feature vectors; and means for iteratively adjusting the neural network such that differences between the input depth maps and the output depth maps are minimized, wherein adjusting the neural network modifies (i) a first set of dimensions representing body pose in the training feature vectors encoded by the adjusted neural network and (ii) a second set of dimensions representing body shape in the training feature vectors encoded by the adjusted neural network; and means for matching, with the iteratively adjusted neural network, an input image having an input body shape and input body pose to an example image having a known combination of example body shape and example body pose.

9. The system of claim 8, wherein the means for matching comprises means for: accessing the input image from the non-transitory computer-readable medium or another non-transitory computer-readable mediu
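
Illustrative sketch (not from the patent text): the encode, decode, and iteratively adjust loop described in the claims resembles an autoencoder trained on a reconstruction loss. The toy below assumes a purely linear encoder and decoder, plain gradient descent on weights, random numbers standing in for flattened depth maps, and a feature vector whose first two dimensions are read as pose and last two as shape. None of these choices (including `POSE_DIMS`, `SHAPE_DIMS`, the learning rate, and the step count) come from the patent; they only make the loop concrete.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]

POSE_DIMS = 2    # assumed size of the pose portion of the feature vector
SHAPE_DIMS = 2   # assumed size of the shape portion
FEATURE_DIMS = POSE_DIMS + SHAPE_DIMS
MAP_SIZE = 6     # a "depth map" flattened to six values (toy size)


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def reconstruction_loss(enc: Matrix, dec: Matrix, maps: List[Vector]) -> float:
    """Mean squared difference between input and decoded depth maps."""
    total = 0.0
    for x in maps:
        y = matvec(dec, matvec(enc, x))
        total += sum((a - b) ** 2 for a, b in zip(y, x))
    return total / len(maps)


def train_step(enc: Matrix, dec: Matrix, maps: List[Vector], lr: float) -> None:
    """One gradient-descent update of encoder and decoder weights, in place."""
    g_enc = [[0.0] * MAP_SIZE for _ in range(FEATURE_DIMS)]
    g_dec = [[0.0] * FEATURE_DIMS for _ in range(MAP_SIZE)]
    for x in maps:
        z = matvec(enc, x)   # feature vector
        y = matvec(dec, z)   # output depth map
        err = [2.0 * (a - b) / len(maps) for a, b in zip(y, x)]  # dLoss/dy
        dz = [sum(dec[i][j] * err[i] for i in range(MAP_SIZE))
              for j in range(FEATURE_DIMS)]                      # dLoss/dz
        for i in range(MAP_SIZE):
            for j in range(FEATURE_DIMS):
                g_dec[i][j] += err[i] * z[j]
        for j in range(FEATURE_DIMS):
            for c in range(MAP_SIZE):
                g_enc[j][c] += dz[j] * x[c]
    for j in range(FEATURE_DIMS):
        for c in range(MAP_SIZE):
            enc[j][c] -= lr * g_enc[j][c]
    for i in range(MAP_SIZE):
        for j in range(FEATURE_DIMS):
            dec[i][j] -= lr * g_dec[i][j]


rng = random.Random(0)
# Placeholder data: random values, not real depth maps.
depth_maps = [[rng.random() for _ in range(MAP_SIZE)] for _ in range(8)]
encoder = [[rng.uniform(-0.1, 0.1) for _ in range(MAP_SIZE)]
           for _ in range(FEATURE_DIMS)]
decoder = [[rng.uniform(-0.1, 0.1) for _ in range(FEATURE_DIMS)]
           for _ in range(MAP_SIZE)]

initial_loss = reconstruction_loss(encoder, decoder, depth_maps)
for _ in range(300):
    train_step(encoder, decoder, depth_maps, lr=0.02)
final_loss = reconstruction_loss(encoder, decoder, depth_maps)

# Split one encoded feature vector into its assumed pose and shape portions.
feature = matvec(encoder, depth_maps[0])
pose_part, shape_part = feature[:POSE_DIMS], feature[POSE_DIMS:]
print(initial_loss, final_loss)
```

Nothing in this toy forces the first two dimensions to carry pose rather than shape. Claim 7 describes training on subsets in which only one factor varies, which is one way such a separation could be encouraged; that step is not shown here.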

Assignees

Adobe Systems Inc

Inventors

Classifications

  • G06T7/50 · Primary

    Depth or shape recovery · CPC title

  • using neural networks · CPC title

  • using classification, e.g. of video objects · CPC title

  • Static body considered as a whole, e.g. static pedestrian or occupant recognition · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2018181802A1 cover?
Certain embodiments involve recognizing combinations of body shape, pose, and clothing in three-dimensional input images. For example, synthetic training images are generated based on user inputs. These synthetic training images depict different training figures with respective combinations of a body pose, a body shape, and a clothing item. A machine learning algorithm is trained to recognize t…
Who is the assignee on this patent?
Adobe Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06T7/50. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 28, 2018 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).