Context-aware human generation in an image
US-11854203-B1 · Dec 26, 2023 · US
US2022301118A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022301118-A1 |
| Application number | US-202017641700-A |
| Country | US |
| Kind code | A1 |
| Filing date | May 13, 2020 |
| Priority date | May 13, 2020 |
| Publication date | Sep 22, 2022 |
| Grant date | — |
Official abstract text for this publication.
A method for replacing an object in an image. The method may include identifying a first object at a position within a first image, masking, based on the first image and the position of the first object, a target area to produce a masked image, generating, based on the masked image and an inpainting machine learning model, a second image different from the first image, the inpainting machine learning model being trained using a difference between the target area of training images and content of generated images at a location corresponding to the target area of the training images, generating, based on the masked image and the second image, a third image, and adding, to the third image, a new object different from the first object.
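The training signal described in the abstract, a difference measured only over the target area, can be illustrated with a small sketch. The publication does not say which distance is used, so the mean absolute (L1) difference below is an assumption, as are the helper names `box_mask` and `target_area_l1` and the rectangular shape of the target area. Images are plain nested lists so the example has no dependencies.

```python
from typing import List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel intensities
Mask = List[List[int]]           # 1 inside the target area, 0 elsewhere
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom/right exclusive


def box_mask(height: int, width: int, box: Box) -> Mask:
    """Build a binary mask that is 1 inside the (hypothetical) rectangular target area."""
    top, left, bottom, right = box
    return [
        [1 if top <= r < bottom and left <= c < right else 0 for c in range(width)]
        for r in range(height)
    ]


def target_area_l1(training: Image, generated: Image, mask: Mask) -> float:
    """Mean absolute difference, counted only where mask == 1.

    L1 is an assumed choice; the source only says "a difference".
    """
    total = 0.0
    count = 0
    for row_t, row_g, row_m in zip(training, generated, mask):
        for t, g, m in zip(row_t, row_g, row_m):
            if m:
                total += abs(t - g)
                count += 1
    return total / count if count else 0.0


# Toy 3x3 example: only the centre pixel belongs to the target area.
training = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
generated = [[0.9, 0.9, 0.9], [0.9, 0.5, 0.9], [0.9, 0.9, 0.9]]
mask = box_mask(3, 3, (1, 1, 2, 2))
loss = target_area_l1(training, generated, mask)
```

Pixels outside the target area do not contribute, so the large mismatch in the border of the toy `generated` image leaves the loss unchanged.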
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for replacing an object in an image comprising: identifying a first object at a position within a first image; masking, based on the first image and the position of the first object, a target area to produce a masked image; generating, based on the masked image and an inpainting machine learning model, a second image different from the first image, the inpainting machine learning model being trained using a difference between the target area of training images and content of generated images at a location corresponding to the target area of the training images; generating, based on the masked image and the second image, a third image; and adding, to the third image, a new object different from the first object.
2. The computer-implemented method of claim 1, wherein the first image is a frame of a video.
3. The computer-implemented method of claim 1, wherein the inpainting machine learning model is trained using a loss function that represents a difference between the target area of training images and content of generated images at a location corresponding to the target area of the training images.
4. The computer-implemented method of claim 1, wherein generating, based on the masked image and the second image, a third image comprises: masking, based on the second image and the location corresponding to the target area of the first image, an inverse target area to produce an inverse masked image; and generating, based on the masked image and the inverse masked image, the third image.
5. The computer-implemented method of claim 4, wherein the inverse target area comprises an area of the second image that is outside of the location corresponding to the target area of the first image.
6. The computer-implemented method of claim 4, wherein masking, based on the second image and the location corresponding to the target area of the first image, an inverse target area to produce an inverse masked image comprises generating, from the second image, an inverse masked image that includes at least some content of the second image that is inside the target area and that does not include at least some content of the second image that is outside the target area.
7. The computer-implemented method of claim 4, wherein generating the third image comprises compositing the inverse masked image with the masked image.
8. The computer-implemented method of claim 1, further comprising extrapolating, based on the third image, a fourth image, wherein each of the first image, second image, third image, and fourth image is a frame of a video.
9. A system comprising: one or more processors; and one or more memory elements including instructions that, when executed, cause the one or more processors to perform operations including: identifying a first object at a position within a first image; masking, based on the first image and the position of the first object, a target area to produce a masked image; generating, based on the masked image and an inpainting machine learning model, a second image different from the first image, the inpainting machine learning model being trained using a difference between the target area of training images and content of generated images at a location corresponding to the target area of the training images; generating, based on the masked image and the second image, a third image; and adding, to the third image, a new object different from the first object.
10. The system of claim 9, wherein the inpainting machine learning model is trained using a loss function that represents a difference between the target area of training images and content of generated images at a location corresponding to the target area of the training images.
11. The system of claim 9, wherein the first image is a frame of a video.
12. The system of claim 9, wherein generating, based on the masked image and the second image, a third image comprises: masking, based on the second image and the location corresponding to the target area of the first image, an inverse target area to produce an inverse masked image; and generating, based on the masked image and the inverse masked image, the third image.
13. The system of claim 12, wherein the inverse target area comprises an area of the second image that is outside of the location corresponding to the target area of the first image.
14. The system of claim 12, wherein masking, based on the second image and the location corresponding to the target area of the first image, an inverse target area to produce an inverse masked image comprises generating, from the second image, an inverse masked image that includes at least some content of the second image that is inside the target area and that does not include at least some content of the second image that is outside the target area.
15. The system of claim 12, wherein generating the third image comprises compositing the inverse masked image with the masked image.
16. The system of claim 9, the operations further comprising extrapolating, based on the third image, a fourth image, wherein each of the first image, second image, third image, and fourth image is a frame of a video.
17. A non-transitory computer storage medium encoded with instructions that when executed by a distributed computing system cause the distributed computing system to perform operations comprising: identifying a first object at a position within a first image; masking, based on the first image and the position of the first object, a target area to produce a masked image; generating, based on the masked image and an inpainting machine learning model, a second image different from the first image, the inpainting machine learning model being trained using a difference between the target area of training images and content of generated images at a location corresponding to the target area of the training images; generating, based on the masked image and the second image, a third image; and adding, to the third image, a new object different from the first object.
18. The non-transitory computer storage medium of claim 17, wherein the inpainting machine learning model is trained using a loss function that represents a difference between the target area of training images and content of generated images at a location corresponding to the target area of the training images.
19. The non-transitory computer storage medium of claim 17, wherein the first image is a frame of a video.
20. The non-transitory computer storage medium of claim 15, wherein generating, based on the masked image and the second image, a third image comprises: masking, based on the second image and the location corresponding to the target area of the first image, an inverse target area to produce an inverse masked image; and generating, based on the masked image and the inverse masked image, the third image.
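The compositing step recited in claims 4 to 7 can be sketched on toy data. This is an illustration under assumptions, not the patented implementation: the helper names `keep_region` and `composite` are hypothetical, masked-out pixels are represented as 0.0, compositing is done by per-pixel addition, and the "second image" is a hand-written stand-in for the output of an inpainting model.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities
Mask = List[List[int]]     # 1 inside the target area, 0 elsewhere


def keep_region(image: Image, mask: Mask, inside: bool) -> Image:
    """Keep pixels inside (inside=True) or outside (inside=False) the target area.

    Every other pixel is set to 0.0, an assumed encoding for "masked out".
    """
    return [
        [p if bool(m) == inside else 0.0 for p, m in zip(row_p, row_m)]
        for row_p, row_m in zip(image, mask)
    ]


def composite(masked: Image, inverse_masked: Image) -> Image:
    """Combine two images whose retained regions do not overlap (per-pixel sum)."""
    return [
        [a + b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(masked, inverse_masked)
    ]


# First image: background of 1.0 with the object to remove (9.0) at the centre.
first = [[1.0, 1.0, 1.0], [1.0, 9.0, 1.0], [1.0, 1.0, 1.0]]
# Second image: stand-in for the inpainting model's output over the whole frame.
second = [[2.0, 2.0, 2.0], [2.0, 5.0, 2.0], [2.0, 2.0, 2.0]]
# Target area: the centre pixel only.
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

masked_image = keep_region(first, mask, inside=False)          # first image, target area removed
inverse_masked_image = keep_region(second, mask, inside=True)  # second image, target area only
third = composite(masked_image, inverse_masked_image)
```

In this toy case `third` keeps the original background from the first image and takes only the centre pixel from the generated second image, which matches the reading of claim 6 in which the inverse masked image retains content inside the target area and drops content outside it.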
Creating or editing images; Combining images with text · CPC title
using local operators · CPC title
Training; Learning · CPC title
Video; Image sequence · CPC title
Artificial neural networks [ANN] · CPC title