Image segmentation method and image processing apparatus

US12008797B2 · US · B2

Patent metadata
Publication number: US-12008797-B2
Application number: US-202117383181-A
Country: US
Kind code: B2
Filing date: Jul 22, 2021
Priority date: Mar 1, 2019
Publication date: Jun 11, 2024
Grant date: Jun 11, 2024

Abstract

Official abstract text for this publication.

This application discloses an image segmentation method in the field of artificial intelligence. The method includes: obtaining an input image and a processing requirement; performing multi-layer feature extraction on the input image to obtain a plurality of feature maps; downsampling the plurality of feature maps to obtain a plurality of feature maps with a reference resolution, where the reference resolution is less than a resolution of the input image; fusing the plurality of feature maps with the reference resolution to obtain at least one feature map group; upsampling the feature map group by using a transformation matrix W, to obtain a target feature map group; and performing target processing on the target feature map group based on the processing requirement to obtain a target image.
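
The steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the function names, average pooling for the downsampling step, channel concatenation for the fusion step, an arg-max as the "target processing", and a random matrix standing in for the trained transformation matrix W are all assumptions made for illustration. The upsampling step follows the shapes given in the claims, where each C-element vector is multiplied by a (C×P) matrix with P=A×B×N and the result is rearranged into an (A×B×N) block.

```python
import numpy as np

def downsample(fmap, ref_h, ref_w):
    """Average-pool an (h, w, c) feature map to (ref_h, ref_w, c).

    Assumption: h and w are integer multiples of ref_h and ref_w, and
    average pooling is used (the abstract does not fix the method).
    """
    h, w, c = fmap.shape
    fh, fw = h // ref_h, w // ref_w
    return fmap.reshape(ref_h, fh, ref_w, fw, c).mean(axis=(1, 3))

def fuse(fmaps):
    """Stitch same-resolution feature maps along the channel dimension."""
    return np.concatenate(fmaps, axis=-1)

def upsample(group, W, A, B, N):
    """Upsample an (H, Wd, C) feature map group to (H*A, Wd*B, N).

    Each C-element vector is multiplied by the (C, P) matrix W, P = A*B*N,
    and the resulting P-element vector is permuted into an (A, B, N) block.
    """
    H, Wd, C = group.shape
    assert W.shape == (C, A * B * N)
    vecs = group.reshape(H * Wd, C) @ W          # (H*Wd, P)
    blocks = vecs.reshape(H, Wd, A, B, N)        # one (A, B, N) block per location
    return blocks.transpose(0, 2, 1, 3, 4).reshape(H * A, Wd * B, N)

rng = np.random.default_rng(0)
# Hypothetical feature maps from three extraction layers of an 8x8 input.
feature_maps = [
    rng.normal(size=(8, 8, 2)),
    rng.normal(size=(4, 4, 3)),
    rng.normal(size=(2, 2, 4)),
]
ref_h, ref_w = 2, 2          # reference resolution, lower than the input
A, B, N = 4, 4, 3            # block size and number of categories

group = fuse([downsample(f, ref_h, ref_w) for f in feature_maps])
W = rng.normal(size=(group.shape[-1], A * B * N))   # stand-in for the trained W
target = upsample(group, W, A, B, N)
label_map = target.argmax(axis=-1)                  # assumed target processing

print(group.shape, target.shape, label_map.shape)
```

In this sketch the fused group has C = 2 + 3 + 4 = 9 channels at 2×2 resolution, and a single matrix product followed by a reshape restores the 8×8 input resolution with one score per category at each pixel.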

First claim

Opening claim text (preview).

What is claimed is:

1. An image segmentation method, comprising: obtaining an input image and a processing requirement, wherein the processing requirement is used to indicate to perform target processing on a target feature map group obtained by performing image segmentation on the input image; performing multi-layer feature extraction on the input image to obtain a plurality of feature maps; downsampling the plurality of feature maps to obtain a plurality of feature maps with a reference resolution, wherein the reference resolution is less than a resolution of the input image; fusing the plurality of feature maps with the reference resolution to obtain at least one feature map group; upsampling the at least one feature map group by using a transformation matrix W, to obtain the target feature map group, wherein the target feature map group has a same resolution as that of the input image, the transformation matrix W is obtained by modeling training data of an image segmentation task, and one dimension of the transformation matrix W is the same as a quantity of channels of the feature group; and performing the target processing on the target feature map group based on the processing requirement to obtain a target image.

2. The method according to claim 1, wherein the upsampling the at least one feature map group by using the transformation matrix W, to obtain the target feature map group comprises: calculating a product of the transformation matrix W and each of (H×W) one-dimensional matrices that each comprise C elements, to obtain (H×W) one-dimensional matrices that each comprise P elements, wherein an element comprised in any one of the (H×W) one-dimensional matrices that each comprise C elements is an element at a same location in each of C two-dimensional (H×W) matrices comprised in the feature map group, H and W are two dimensions of the feature map group, C is the quantity of channels of the feature map group, the transformation matrix is a two-dimensional (C×P) matrix obtained based on M annotated images comprised in the training data, P=A×B×N, and N is a quantity of categories into which image semantics in the M annotated images are segmented; and separately performing feature permutation on the (H×W) one-dimensional matrices that each comprise P elements, to obtain the target feature map group, wherein at least one (A×B×N) submatrix comprised in the target feature map group is obtained based on one of the (H×W) one-dimensional matrices that each comprise P elements, and H, W, C, N, P, M, A, and B are all integers greater than 0.

3. The method according to claim 2, wherein the separately performing feature permutation on the (H×W) one-dimensional matrices that each comprise P elements, to obtain the target feature map group comprises: determining, based on any one of the (H×W) one-dimensional matrices that each comprise P elements, (A×B) one-dimensional matrices that each comprise N elements; and using, as a submatrix comprised in the target feature map group, a three-dimensional (A×B×N) matrix obtained based on the (A×B) one-dimensional matrices that each comprise N elements.

4. The method according to claim 2, wherein any one of the M annotated images is a three-dimensional (H×W×N) matrix, and the transformation matrix W is obtained by performing the following operations: obtaining at least one (A×B×N) submatrix corresponding to each of the M annotated images to obtain a plurality of (A×B×N) submatrices; obtaining, based on the plurality of (A×B×N) submatrices, a plurality of vectors comprising P elements, wherein a vector comprising P elements is obtained based on each of the plurality of (A×B×N) submatrices; performing principal component analysis on the plurality of vectors comprising P elements to obtain a two-dimensional (P×P) matrix; and using one (C×P) submatrix comprised in the two-dimensional (P×P) matrix as the transformation matrix W.

5. The method according to claim 1, wherein the performing multi-layer feature extraction on the input image to obtain the plurality of feature maps comprises: performing a convolution operation on the input image to obtain a first feature map, and performing a convolution operation on a (K−1)th feature map to obtain a Kth feature map, wherein the Kth feature map is a feature map with the reference resolution, a resolution of the (K−1)th feature map is not greater than that of the Kth feature map, K is an integer greater than 1, and the plurality of feature maps comprise K feature maps; and wherein the downsampling the plurality of feature maps to obtain the plurality of feature maps with the reference resolution comprises: downsampling the first feature map to obtain a feature map with the reference resolution, and downsampling the (K−1)th feature map to obtain a feature map with the reference resolution.

6. The method according to claim 2, wherein the fusing the plurality of feature maps with the reference resolution to obtain the at least one feature map group comprises: stitching the plurality of feature maps with the reference resolution in a channel dimension to obtain the at least one feature map group, wherein the at least one feature map group is a three-dimensional (H×W×C) matrix and corresponds to the C two-dimensional (H×W) matrices; and wherein the calculating the product of the transformation matrix W and each of the (H×W) one-dimensional matrices that each comprise the C elements to obtain the (H×W) one-dimensional matrices that each comprise the P elements comprises: calculating a product of the transformation matrix and a one-dimensional matrix corresponding to each element location in the feature map group, to obtain the (H×W) one-dimensional matrices that each comprise P elements, wherein an element comprised in a one-dimensional matrix corresponding to one element location in the feature map group is an element at a same element location in each of the C two-dimensional (H×W) matrices.

7. The method according to claim 1, further comprising: obtaining the transformation matrix W; processing a training sample by using a convolutional neural network, to obtain an image segmentation result of the training sample, wherein the training sample is comprised in the training data; determining, based on the image segmentation result of the training sample and a standard result corresponding to the training sample, a loss corresponding to the training sample, wherein the standard result is a result expected to be obtained by processing the training sample by using the convolutional neural network; and updating a parameter of the convolutional neural network by using an optimization algorithm and the loss corresponding to the training sample; wherein the performing multi-layer feature extraction on the input image to obtain the plurality of feature maps comprises: inputting the input image into the convolutional neural network and performing the multi-layer feature extraction, to obtain the plurality of feature maps.

8. An image processing apparatus, comprising: a processor; and a memory storing instructions that when executed by the processor configure the image processing apparatus to: obtain an input image and a processing requirement, wherein the processing requirement is used to indicate to perform target processing on a target feature map group obtained by performing image segmentation on the input image; and perform multi-layer feature extraction on the input image to obtain a plurality of feature maps; downsample the plurality of feature maps to obtain a plurality of feature maps with a reference resolution, wherein the reference resolution is less than a resolution of the input image; fuse the plurality of feature maps with the reference
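
Claim 4 describes deriving the transformation matrix W from annotated images by principal component analysis. The following is a rough NumPy sketch of that idea, not the patented procedure: the function names are hypothetical, the blocks are taken without overlap, the vectors are mean-centered before the eigendecomposition, and the C rows with the largest variance are chosen as the (C×P) submatrix. The claim itself only says that one (C×P) submatrix of the (P×P) matrix is used.

```python
import numpy as np

def blocks_to_vectors(labels, A, B):
    """Split an (H, Wd, N) one-hot annotation into non-overlapping
    (A, B, N) blocks and flatten each block to a P = A*B*N vector."""
    H, Wd, N = labels.shape
    hb, wb = H // A, Wd // B
    x = labels[:hb * A, :wb * B].reshape(hb, A, wb, B, N)
    return x.transpose(0, 2, 1, 3, 4).reshape(hb * wb, A * B * N)

def fit_transformation_matrix(annotations, A, B, C):
    """Return a (C, P) matrix built from the principal directions of the
    block vectors. Centering and top-C selection are assumptions."""
    vecs = np.concatenate([blocks_to_vectors(y, A, B) for y in annotations])
    centered = vecs - vecs.mean(axis=0)
    cov = centered.T @ centered / max(len(vecs) - 1, 1)   # (P, P)
    eigvals, eigvecs = np.linalg.eigh(cov)                # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                     # largest variance first
    basis = eigvecs[:, order].T                           # (P, P), rows = directions
    return basis[:C]                                      # (C, P) submatrix

rng = np.random.default_rng(0)
M, H, Wd, N = 6, 8, 8, 3       # M synthetic annotated images, N categories
A, B, C = 2, 2, 5              # block size and channel count (illustrative)

# Synthetic one-hot annotations; real training data would replace these.
annotations = [np.eye(N)[rng.integers(0, N, size=(H, Wd))] for _ in range(M)]
W = fit_transformation_matrix(annotations, A, B, C)

print(W.shape)                 # (C, P) with P = A*B*N
```

The flattening order used here, (A, B, N), has to match the order used when a P-element vector is later permuted back into an (A×B×N) block during upsampling.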

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • Supervised learning

  • Auto-encoder networks; Encoder-decoder networks

  • Convolutional networks [CNN, ConvNet]

  • Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]

  • using neural networks

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F18/253. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 11, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.