US12056743B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12056743-B2 |
| Application number | US-202217897098-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 26, 2022 |
| Priority date | Feb 28, 2020 |
| Publication date | Aug 6, 2024 |
| Grant date | Aug 6, 2024 |
Official abstract text for this publication.
The specification discloses image and data processing methods and apparatuses. The method includes: obtaining a source pose and texture information according to a source image; obtaining a first synthetic image according to the source image, a target pose, and the source pose; obtaining a residual map according to the texture information and the first synthetic image; and obtaining a second synthetic image according to the first synthetic image and the residual map. The specification resolves the technical problem of lacking a sense of reality in a synthetic image due to loss of texture details in feature extraction during character action transfer in the existing technologies.
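The last step of the abstract, combining the first synthetic image with the residual map, can be illustrated with a minimal sketch. The function name `superpose`, the grayscale list-of-lists image representation, the [0, 1] value range, and the clamping are all assumptions made for illustration; the abstract does not state how the combination is computed.

```python
from typing import List

# Hypothetical representation: a grayscale image as rows of floats in [0, 1].
Image = List[List[float]]


def superpose(first_synthetic: Image, residual_map: Image) -> Image:
    """Add a residual map to a coarse image pixel by pixel, clamping to [0, 1].

    This is an assumed, simplified reading of "obtaining a second synthetic
    image according to the first synthetic image and the residual map".
    """
    if len(first_synthetic) != len(residual_map):
        raise ValueError("images must have the same height")
    second: Image = []
    for row_img, row_res in zip(first_synthetic, residual_map):
        if len(row_img) != len(row_res):
            raise ValueError("images must have the same width")
        # The residual restores detail lost in the coarse image; clamp the sum
        # so the result stays a valid pixel value.
        second.append([min(1.0, max(0.0, p + r)) for p, r in zip(row_img, row_res)])
    return second


first = [[0.5, 0.9], [0.1, 0.25]]
residual = [[0.25, 0.5], [-0.5, 0.0]]
second = superpose(first, residual)
```

Here positive residual values brighten a pixel and negative values darken it, so fine texture can be layered onto a smooth first-pass image without regenerating it.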
Opening claim text (preview).
What is claimed is:

1. An image processing method, comprising: obtaining a source pose according to a source image; performing pose transfer according to the source pose and a target pose to obtain a content feature map; performing at least one round of convolution and upsampling based on the content feature map to obtain intermediate features; fusing the intermediate features to generate a feature map with a spatial size consistent with the source image; transforming the feature map into a first synthetic image; extracting texture information based on the source image and spatial information based on the content feature map; obtaining a residual map by performing texture enhancing on the texture information guided by the spatial information; and obtaining, according to the first synthetic image and the residual map, a second synthetic image for displaying.

2. The method of claim 1, wherein the obtaining the source pose comprises: obtaining the source pose of the source image through pose estimation according to the source image.

3. The method of claim 1, wherein the performing pose transfer according to the source pose and the target pose to obtain the content feature map comprises: encoding the source image to obtain a first branch of input; encoding the source pose and the target pose to obtain a second branch of input; and performing information fusion on the first branch of input and the second branch of input to obtain the content feature map.

4. The method of claim 1, wherein the obtaining the residual map comprises: performing deep learning on the texture information; normalizing the texture information on which deep learning has been performed and the content feature map; and performing reconstruction, to obtain the residual map, wherein the residual map comprises contour features and surface texture details in the source image.

5. The method of claim 4, wherein: the contour features comprise at least one of: a human face, an animal head, a body feature, or an appearance feature of an article; and the surface texture details comprise product surface texture details, wherein the product surface texture details comprise at least one of: clothing texture details, accessory texture details, or tool texture details.

6. The method of claim 1, wherein the obtaining the second synthetic image according to the first synthetic image and the residual map comprises: performing superposition according to the first synthetic image and the residual map, to obtain the second synthetic image.

7. The method of claim 6, wherein the performing superposition according to the first synthetic image and the residual map, to obtain the second synthetic image comprises: according to contour features and surface texture details of the source image in the residual map, filling the contour features and the surface texture details at corresponding positions in the first synthetic image to obtain the second synthetic image, wherein the second synthetic image has the contour features and the surface texture details in the source image.

8. The method of claim 1, further comprising: prior to obtaining the source pose and texture information according to the source image, receiving the source image uploaded by a user; and subsequent to the obtaining the second synthetic image, generating an image set or video data according to the second synthetic image.

9. The method of claim 8, wherein the image set or the video data is applicable to online fitting effect display or advertisement page display.

10. A system for image processing, comprising a processor and a non-transitory computer-readable storage medium storing instructions executable by the processor to cause the system to perform operations comprising: obtaining a source pose according to a source image; performing pose transfer according to the source pose and a target pose to obtain a content feature map; performing at least one round of convolution and upsampling based on the content feature map to obtain intermediate features; fusing the intermediate features to generate a feature map with a spatial size consistent with the source image; transforming the feature map into a first synthetic image; extracting texture information based on the source image and spatial information based on the content feature map; obtaining a residual map by performing texture enhancing on the texture information guided by the spatial information; and obtaining, according to the first synthetic image and the residual map, a second synthetic image for displaying.

11. The system of claim 10, wherein the performing pose transfer according to the source pose and the target pose to obtain the content feature map comprises: encoding the source image to obtain a first branch of input; encoding the source pose and the target pose to obtain a second branch of input; and performing information fusion on the first branch of input and the second branch of input to obtain the content feature map.

12. The system of claim 10, wherein the obtaining the residual map comprises: performing deep learning on the texture information; normalizing the texture information on which deep learning has been performed and the content feature map; and performing reconstruction, to obtain the residual map, wherein the residual map comprises contour features and surface texture details in the source image.

13. The system of claim 10, wherein the obtaining the second synthetic image according to the first synthetic image and the residual map comprises: performing superposition according to the first synthetic image and the residual map, to obtain the second synthetic image.

14. A non-transitory computer-readable storage medium for image processing, configured with instructions executable by one or more processors to cause the one or more processors to perform operations comprising: obtaining a source pose according to a source image; performing pose transfer according to the source pose and a target pose to obtain a content feature map; performing at least one round of convolution and upsampling based on the content feature map to obtain intermediate features; fusing the intermediate features to generate a feature map with a spatial size consistent with the source image; transforming the feature map into a first synthetic image; extracting texture information based on the source image and spatial information based on the content feature map; obtaining a residual map by performing texture enhancing on the texture information guided by the spatial information; and obtaining, according to the first synthetic image and the residual map, a second synthetic image for displaying.

15. The non-transitory computer-readable storage medium of claim 14, wherein the performing pose transfer according to the source pose and the target pose to obtain the content feature map comprises: encoding the source image to obtain a first branch of input; encoding the source pose and the target pose to obtain a second branch of input; and performing information fusion on the first branch of input and the second branch of input to obtain the content feature map.

16. The non-transitory computer-readable storage medium of claim 14, wherein the obtaining the residual map comprises: performing deep learning on the texture information; normalizing the texture information on which deep learning has been performed and the content feature map; and performing reconstruction, to obtain the residual map, wherein the residual map comprises contour features and sur
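The "convolution and upsampling" and "fusing" steps recited in the independent claims can be pictured with a small, purely illustrative sketch. Everything here is an assumption: the helper names (`conv_and_upsample`, `resize_nearest`, `fuse`), the use of a single scalar weight as a stand-in for a learned convolution, nearest-neighbor 2x upsampling, and fusion by averaging. The claims do not specify any of these choices.

```python
from typing import List

# Hypothetical representation: a single-channel feature map as rows of floats.
FeatureMap = List[List[float]]


def conv_and_upsample(fmap: FeatureMap, weight: float) -> FeatureMap:
    """One round: scale by `weight` (toy stand-in for a learned convolution),
    then double height and width with nearest-neighbor upsampling."""
    out: FeatureMap = []
    for row in fmap:
        wide = [weight * v for v in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(list(wide))  # repeat each row
    return out


def resize_nearest(fmap: FeatureMap, height: int, width: int) -> FeatureMap:
    """Resize a feature map to (height, width) by nearest-neighbor sampling."""
    src_h, src_w = len(fmap), len(fmap[0])
    return [
        [fmap[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def fuse(features: List[FeatureMap], height: int, width: int) -> FeatureMap:
    """Bring all intermediate features to one spatial size and average them."""
    resized = [resize_nearest(f, height, width) for f in features]
    count = len(resized)
    return [
        [sum(f[y][x] for f in resized) / count for x in range(width)]
        for y in range(height)
    ]


# A 2x2 content feature map and an assumed 8x8 source image size.
content = [[1.0, 2.0], [3.0, 4.0]]
f1 = conv_and_upsample(content, 1.0)  # 4x4 intermediate feature
f2 = conv_and_upsample(f1, 0.5)       # 8x8 intermediate feature
fused = fuse([f1, f2], 8, 8)          # spatial size matches the source image
```

The point of the sketch is the data flow: each round produces an intermediate feature at a larger resolution, and fusion aligns those features to the source image size before they are turned into the first synthetic image.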
Texturing; Colouring; Generation of textures or colours (retouching, inpainting or scratch removal G06T5/77) · CPC title
Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands · CPC title
using neural networks · CPC title
using classification, e.g. of video objects · CPC title
Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components · CPC title