Catalog normalization and segmentation for fashion images

US11587271B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11587271-B2
Application number  US-202017008964-A
Country             US
Kind code           B2
Filing date         Sep 1, 2020
Priority date       Sep 1, 2020
Publication date    Feb 21, 2023
Grant date          Feb 21, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

First image data representing a first human wearing a first article of clothing may be received. The first image data, when rendered on a display, may include a first photometric artifact. A first generator network may be used to generate second image data from the first image data. The first photometric artifact may be removed from the second image data. A second generator network may be used to generate third image data from the second image data, the third image data representing the first human in a different pose relative to the first image data. Fourth image data representing the first article of clothing segmented from the first human may be generated and displayed on a display.
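
The data flow in the abstract can be sketched as a chain of four stages. This is a minimal illustration under stated assumptions, not the patented implementation: every function name below (`remove_artifact`, `normalize_pose`, `segment_clothing`, `run_pipeline`) is hypothetical, and each stage is a trivial stand-in for the trained network the abstract describes. Images are modeled as small grids of labeled pixels so the example runs without any libraries.

```python
from typing import List, Optional, Tuple

# Assumption: a pixel is (label, brightness); labels are "bg", "person", "garment".
Pixel = Tuple[str, float]
Image = List[List[Pixel]]


def remove_artifact(first: Image, cast: float) -> Image:
    """Stand-in for the first generator network: undo a uniform brightness cast."""
    return [[(label, round(b - cast, 6)) for label, b in row] for row in first]


def normalize_pose(second: Image) -> Image:
    """Stand-in for the second generator network: a horizontal mirror
    plays the role of the 'different pose'."""
    return [list(reversed(row)) for row in second]


def segment_clothing(third: Image) -> List[List[Optional[float]]]:
    """Keep only garment pixels; everything else becomes None."""
    return [[b if label == "garment" else None for label, b in row] for row in third]


def run_pipeline(first: Image, cast: float) -> List[List[Optional[float]]]:
    second = remove_artifact(first, cast)   # second image data
    third = normalize_pose(second)          # third image data
    return segment_clothing(third)          # fourth image data


first_image = [
    [("bg", 0.9), ("person", 0.7), ("bg", 0.9)],
    [("garment", 0.6), ("garment", 0.6), ("bg", 0.9)],
]
fourth_image = run_pipeline(first_image, cast=0.1)
print(fourth_image)  # [[None, None, None], [None, 0.5, 0.5]]
```

The point of the sketch is only the ordering: artifact removal feeds pose generation, and segmentation operates on the pose-adjusted image rather than on the original input.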

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving first image data representing a first human wearing a first article of clothing, wherein the first image data, when rendered on a display, comprises a first photometric artifact; generating, using a first generator network, second image data from the first image data, wherein the first photometric artifact is removed from the second image data, when rendered on the display; generating, by a second generator network using the second image data, third image data representing the first human in a different pose relative to the first image data; generating, using the third image data, fourth image data representing the first article of clothing segmented from the first human; and generating code that, when executed by at least one processor, is effective to cause the fourth image data to be displayed on the display.

2. The method of claim 1, further comprising training the first generator network using adversarial loss and perceptual loss to generate image data constrained by input images.

3. The method of claim 1, further comprising: training the first generator network together with a discriminator network, wherein the discriminator network is trained using: a first pair of image data comprising a first natural image with a second photometric artifact and an edited version of the first natural image, edited to remove the second photometric artifact; and a second pair of image data comprising a second natural image with a third photometric artifact and a generated image generated by the first generator network to remove the third photometric artifact.

4. The method of claim 3, wherein the first pair of image data further comprises first ground truth data indicating that the edited version of the first natural image is classified as “real,” and the second pair of image data further comprises second ground truth data indicating that the generated image is classified as “fake.”

5. The method of claim 1, further comprising: generating, using a pre-trained network, a first activation map at a first layer of the pre-trained network, the first activation map being generated in response to inputting the first image data into the pre-trained network; generating, using the pre-trained network, a second activation map at the first layer of the pre-trained network, the second activation map generated in response to inputting the second image data into the pre-trained network; and comparing the first activation map and the second activation map to determine a perceptual loss associated with the second image data.

6. The method of claim 5, further comprising: back-propagating the perceptual loss; and updating at least one parameter of the first generator network to minimize the perceptual loss.

7. The method of claim 1, further comprising: determining, by a geometric transformation model, first data representing a first pose of the first human; inputting the first data into the second generator network; generating second pose data representing a symmetrically aligned pose of the first human; and generating the second image data with the first human in the different pose using the second pose data.

8. The method of claim 1, wherein the first article of clothing, as represented in the fourth image data, is at least one of a different color, a different texture, or a different style relative to the first article of clothing as represented in the first image data.

9. The method of claim 1, further comprising: determining a portion of the fourth image data corresponding to a portion of the first image data, wherein the portion of the first image data represents a portion of the first article of clothing that is occluded from view; and generating modified fourth image data by applying an image in-painting technique to the portion of the fourth image data.

10. A system comprising: at least one processor; and at least one non-transitory, computer-readable memory storing instructions that, when executed by the at least one processor, are effective to: receive first image data representing a first human wearing a first article of clothing, wherein the first image data, when rendered on a display, comprises a first photometric artifact; generate, using a first generator network, second image data from the first image data, wherein the first photometric artifact is removed from the second image data, when rendered on the display; generate, by a second generator network using the second image data, third image data representing the first human in a different pose relative to the first image data; generate, using the third image data, fourth image data representing the first article of clothing segmented from the first human; and generate code that, when executed by the at least one processor, is effective to cause the fourth image data to be displayed on the display.

11. The system of claim 10, wherein the at least one non-transitory computer-readable memory stores further instructions that, when executed by the at least one processor, are further effective to train the first generator network using adversarial loss and perceptual loss to generate image data constrained by input images.

12. The system of claim 10, wherein the at least one non-transitory computer-readable memory stores further instructions that, when executed by the at least one processor, are further effective to: train the first generator network together with a discriminator network, wherein the discriminator network is trained using: a first pair of image data comprising a first natural image with a second photometric artifact and an edited version of the first natural image, edited to remove the second photometric artifact; and a second pair of image data comprising a second natural image with a third photometric artifact and a generated image generated by the first generator network to remove the third photometric artifact.

13. The system of claim 12, wherein the first pair of image data further comprises first ground truth data indicating that the edited version of the first natural image is classified as “real,” and the second pair of image data further comprises second ground truth data indicating that the generated image is classified as “fake.”

14. The system of claim 10, wherein the at least one non-transitory computer-readable memory stores further instructions that, when executed by the at least one processor, are further effective to: generate, using a pre-trained network, a first activation map at a first layer of the pre-trained network, the first activation map being generated in response to inputting the first image data into the pre-trained network; generate, using the pre-trained network, a second activation map at the first layer of the pre-trained network, the second activation map generated in response to inputting the second image data into the pre-trained network; and compare the first activation map and the second activation map to determine a perceptual loss associated with the second image data.

15. The system of claim 14, wherein the at least one non-transitory computer-readable memory stores further instructions that, when executed by the at least one processor, are further effective to: back propagate the perceptual loss; and update at least one parameter of the first generator network to minimize the perceptual loss.

16. The system of claim 10, wherein the at least one non-transitory computer-readable memory stores further instructions that, when executed by the at least one processor, are further effective to:
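
The perceptual-loss steps in claims 5 and 6 (mirrored by claims 14 and 15) can be illustrated with a toy example. This is a sketch under loud assumptions, not the claimed implementation: the "pre-trained network" is a fixed one-dimensional convolution with ReLU (`KERNEL` and `activation_map` are hypothetical), the "generator" has a single trainable `gain` parameter, images are short lists of floats, and a central finite difference stands in for back-propagation so the code needs no libraries.

```python
from typing import List

# Assumption: a frozen "pre-trained" first layer, modeled as 1-D convolution + ReLU.
KERNEL = [0.25, 0.5, 0.25]


def activation_map(signal: List[float]) -> List[float]:
    """Valid 1-D convolution with KERNEL, followed by ReLU."""
    k = len(KERNEL)
    out = []
    for i in range(len(signal) - k + 1):
        s = sum(KERNEL[j] * signal[i + j] for j in range(k))
        out.append(max(0.0, s))
    return out


def generator(first: List[float], gain: float) -> List[float]:
    """Toy generator with one trainable parameter (hypothetical)."""
    return [gain * x for x in first]


def perceptual_loss(first: List[float], second: List[float]) -> float:
    """Mean squared difference between the two activation maps."""
    a = activation_map(first)
    b = activation_map(second)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def train(first: List[float], gain: float = 0.5, lr: float = 0.5,
          steps: int = 50, eps: float = 1e-4) -> float:
    """Gradient descent on the generator parameter.
    The gradient is estimated by finite differences, a stand-in for back-propagation."""
    for _ in range(steps):
        up = perceptual_loss(first, generator(first, gain + eps))
        down = perceptual_loss(first, generator(first, gain - eps))
        grad = (up - down) / (2 * eps)
        gain -= lr * grad
    return gain


first_image = [0.2, 0.4, 0.6, 0.8, 1.0]
initial_loss = perceptual_loss(first_image, generator(first_image, 0.5))
trained_gain = train(first_image)
final_loss = perceptual_loss(first_image, generator(first_image, trained_gain))
print(initial_loss, final_loss, trained_gain)
```

In this toy setup, minimizing the perceptual loss alone drives the generator toward reproducing its input, which matches the role claim 2 gives it: keeping generated image data "constrained by input images." Removing the artifact itself would rely on the adversarial loss, which this sketch does not model.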

Assignees

Amazon Tech Inc

Inventors

Classifications

  • Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Adversarial learning · CPC title

  • Supervised learning · CPC title

  • Generative networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11587271B2 cover?
First image data representing a first human wearing a first article of clothing may be received. The first image data, when rendered on a display, may include a first photometric artifact. A first generator network may be used to generate second image data from the first image data. The first photometric artifact may be removed from the second image data. A second generator network may be used …
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06T11/60. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 21, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).