US12008797B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12008797-B2 |
| Application number | US-202117383181-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 22, 2021 |
| Priority date | Mar 1, 2019 |
| Publication date | Jun 11, 2024 |
| Grant date | Jun 11, 2024 |
Official abstract text for this publication.
This application discloses an image segmentation method in the field of artificial intelligence. The method includes: obtaining an input image and a processing requirement; performing multi-layer feature extraction on the input image to obtain a plurality of feature maps; downsampling the plurality of feature maps to obtain a plurality of feature maps with a reference resolution, where the reference resolution is less than a resolution of the input image; fusing the plurality of feature maps with the reference resolution to obtain at least one feature map group; upsampling the feature map group by using a transformation matrix W, to obtain a target feature map group; and performing target processing on the target feature map group based on the processing requirement to obtain a target image.
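The downsampling and fusing steps in the abstract can be illustrated with a minimal NumPy sketch. It assumes average pooling as the downsampling operator and channel-wise concatenation as the fusion (the claims describe the fusion as stitching in a channel dimension, but the abstract does not name a pooling method). The function names `downsample_to` and `fuse_feature_maps` are hypothetical and do not come from the patent.

```python
import numpy as np


def downsample_to(feature_map: np.ndarray, ref_h: int, ref_w: int) -> np.ndarray:
    """Average-pool an (h, w, c) feature map down to (ref_h, ref_w, c).

    Assumption: h and w are integer multiples of ref_h and ref_w.
    """
    h, w, c = feature_map.shape
    fh, fw = h // ref_h, w // ref_w
    # Split each spatial axis into (output index, position inside the pooling window).
    blocks = feature_map.reshape(ref_h, fh, ref_w, fw, c)
    # Average over the two window axes.
    return blocks.mean(axis=(1, 3))


def fuse_feature_maps(feature_maps, ref_h: int, ref_w: int) -> np.ndarray:
    """Bring every feature map to the reference resolution, then stack channels."""
    pooled = [downsample_to(f, ref_h, ref_w) for f in feature_maps]
    # Result is one (ref_h, ref_w, C) feature map group, C = total channel count.
    return np.concatenate(pooled, axis=-1)
```

For example, feature maps of shapes (8, 8, 2), (4, 4, 3), and (2, 2, 4) fused at a 2×2 reference resolution would give a single group with 9 channels.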
Opening claim text (preview).
What is claimed is:

1. An image segmentation method, comprising: obtaining an input image and a processing requirement, wherein the processing requirement is used to indicate to perform target processing on a target feature map group obtained by performing image segmentation on the input image; performing multi-layer feature extraction on the input image to obtain a plurality of feature maps; downsampling the plurality of feature maps to obtain a plurality of feature maps with a reference resolution, wherein the reference resolution is less than a resolution of the input image; fusing the plurality of feature maps with the reference resolution to obtain at least one feature map group; upsampling the at least one feature map group by using a transformation matrix W, to obtain the target feature map group, wherein the target feature map group has a same resolution as that of the input image, the transformation matrix W is obtained by modeling training data of an image segmentation task, and one dimension of the transformation matrix W is the same as a quantity of channels of the feature group; and performing the target processing on the target feature map group based on the processing requirement to obtain a target image.

2. The method according to claim 1, wherein the upsampling the at least one feature map group by using the transformation matrix W, to obtain the target feature map group comprises: calculating a product of the transformation matrix W and each of (H×W) one-dimensional matrices that each comprise C elements, to obtain (H×W) one-dimensional matrices that each comprise P elements, wherein an element comprised in any one of the (H×W) one-dimensional matrices that each comprise C elements is an element at a same location in each of C two-dimensional (H×W) matrices comprised in the feature map group, H and W are two dimensions of the feature map group, C is the quantity of channels of the feature map group, the transformation matrix is a two-dimensional (C×P) matrix obtained based on M annotated images comprised in the training data, P=A×B×N, and N is a quantity of categories into which image semantics in the M annotated images are segmented; and separately performing feature permutation on the (H×W) one-dimensional matrices that each comprise P elements, to obtain the target feature map group, wherein at least one (A×B×N) submatrix comprised in the target feature map group is obtained based on one of the (H×W) one-dimensional matrices that each comprise P elements, and H, W, C, N, P, M, A, and B are all integers greater than 0.

3. The method according to claim 2, wherein the separately performing feature permutation on the (H×W) one-dimensional matrices that each comprise P elements, to obtain the target feature map group comprises: determining, based on any one of the (H×W) one-dimensional matrices that each comprise P elements, (A×B) one-dimensional matrices that each comprise N elements; and using, as a submatrix comprised in the target feature map group, a three-dimensional (A×B×N) matrix obtained based on the (A×B) one-dimensional matrices that each comprise N elements.

4. The method according to claim 2, wherein any one of the M annotated images is a three-dimensional (H×W×N) matrix, and the transformation matrix W is obtained by performing the following operations: obtaining at least one (A×B×N) submatrix corresponding to each of the M annotated images to obtain a plurality of (A×B×N) submatrices; obtaining, based on the plurality of (A×B×N) submatrices, a plurality of vectors comprising P elements, wherein a vector comprising P elements is obtained based on each of the plurality of (A×B×N) submatrices; performing principal component analysis on the plurality of vectors comprising P elements to obtain a two-dimensional (P×P) matrix; and using one (C×P) submatrix comprised in the two-dimensional (P×P) matrix as the transformation matrix W.

5. The method according to claim 1, wherein the performing multi-layer feature extraction on the input image to obtain the plurality of feature maps comprises: performing a convolution operation on the input image to obtain a first feature map, and performing a convolution operation on a (K−1)th feature map to obtain a Kth feature map, wherein the Kth feature map is a feature map with the reference resolution, a resolution of the (K−1)th feature map is not greater than that of the Kth feature map, K is an integer greater than 1, and the plurality of feature maps comprise K feature maps; and wherein the downsampling the plurality of feature maps to obtain the plurality of feature maps with the reference resolution comprises: downsampling the first feature map to obtain a feature map with the reference resolution, and downsampling the (K−1)th feature map to obtain a feature map with the reference resolution.

6. The method according to claim 2, wherein the fusing the plurality of feature maps with the reference resolution to obtain the at least one feature map group comprises: stitching the plurality of feature maps with the reference resolution in a channel dimension to obtain the at least one feature map group, wherein the at least one feature map group is a three-dimensional (H×W×C) matrix and corresponds to the C two-dimensional (H×W) matrices; and wherein the calculating the product of the transformation matrix W and each of the (H×W) one-dimensional matrices that each comprise the C elements to obtain the (H×W) one-dimensional matrices that each comprise the P elements comprises: calculating a product of the transformation matrix and a one-dimensional matrix corresponding to each element location in the feature map group, to obtain the (H×W) one-dimensional matrices that each comprise P elements, wherein an element comprised in a one-dimensional matrix corresponding to one element location in the feature map group is an element at a same element location in each of the C two-dimensional (H×W) matrices.

7. The method according to claim 1, further comprising: obtaining the transformation matrix W; processing a training sample by using a convolutional neural network, to obtain an image segmentation result of the training sample, wherein the training sample is comprised in the training data; determining, based on the image segmentation result of the training sample and a standard result corresponding to the training sample, a loss corresponding to the training sample, wherein the standard result is a result expected to be obtained by processing the training sample by using the convolutional neural network; and updating a parameter of the convolutional neural network by using an optimization algorithm and the loss corresponding to the training sample; wherein the performing multi-layer feature extraction on the input image to obtain the plurality of feature maps comprises: inputting the input image into the convolutional neural network and performing the multi-layer feature extraction, to obtain the plurality of feature maps.

8. An image processing apparatus, comprising: a processor; and a memory storing instructions that when executed by the processor configure the image processing apparatus to: obtain an input image and a processing requirement, wherein the processing requirement is used to indicate to perform target processing on a target feature map group obtained by performing image segmentation on the input image; and perform multi-layer feature extraction on the input image to obtain a plurality of feature maps; downsample the plurality of feature maps to obtain a plurality of feature maps with a reference resolution, wherein the reference resolution is less than a resolution of the input image; fuse the plurality of feature maps with the reference
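The matrix operations recited in claims 2 through 4 can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the patentee's implementation: annotated images are taken to be one-hot label arrays, the (A×B×N) submatrices are cut as non-overlapping blocks, the PCA centers the vectors before the eigendecomposition, and the (C×P) submatrix is chosen as the C components with the largest eigenvalues. None of those choices is fixed by the claim text. The names `fit_transformation_matrix`, `upsample_with_matrix`, and `w_mat` are hypothetical; `w_mat` stands for the transformation matrix W, renamed to avoid a clash with the width W.

```python
import numpy as np


def fit_transformation_matrix(labels: np.ndarray, a: int, b: int, c: int) -> np.ndarray:
    """Derive a (c, P) matrix from annotations, with P = a * b * n.

    labels: (M, height, width, n) one-hot annotated images (assumed format).
    Assumption: height and width are integer multiples of a and b.
    """
    m, height, width, n = labels.shape
    p = a * b * n
    # Cut each annotation into non-overlapping (a, b, n) blocks.
    blocks = labels.reshape(m, height // a, a, width // b, b, n)
    # Flatten every block into one vector of P elements.
    vectors = blocks.transpose(0, 1, 3, 2, 4, 5).reshape(-1, p).astype(float)
    # PCA via eigendecomposition of the covariance of the centered vectors.
    centered = vectors - vectors.mean(axis=0)
    cov = centered.T @ centered / max(len(vectors) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order].T  # (P, P), one component per row
    # Assumed choice of submatrix: the top-c components.
    return components[:c]


def upsample_with_matrix(feature_group: np.ndarray, w_mat: np.ndarray,
                         a: int, b: int) -> np.ndarray:
    """Map an (H, W, C) feature map group to an (H*a, W*b, N) target group."""
    h, w, c = feature_group.shape
    n = w_mat.shape[1] // (a * b)
    # One C-element vector per location, multiplied by the (C, P) matrix.
    projected = feature_group.reshape(h * w, c) @ w_mat  # (H*W, P)
    # Feature permutation: each P-vector becomes an (a, b, n) block
    # placed at its source location in the enlarged grid.
    blocks = projected.reshape(h, w, a, b, n)
    return blocks.transpose(0, 2, 1, 3, 4).reshape(h * a, w * b, n)
```

With A = B = 2, a 4×4×C feature map group is expanded to an 8×8×N target group, so each low-resolution location contributes one 2×2 block of per-category scores.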
Supervised learning · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title
using neural networks · CPC title