Training method for image semantic segmentation model and server
US-2021035304-A1 · Feb 4, 2021 · US
US12536767B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12536767-B2 |
| Application number | US-202318288552-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 8, 2023 |
| Priority date | Apr 13, 2023 |
| Publication date | Jan 27, 2026 |
| Grant date | Jan 27, 2026 |
Official abstract text for this publication.
The present invention discloses a weakly supervised semantic segmentation method and device based on a commonality-specificity supervision mechanism. A contrastive convolution module is established to identify ambiguous boundary regions within the image based on the convolutional cognitive differences of different receptive fields within the image, overcoming the problem of blurred segmentation boundaries in weakly supervised semantic segmentation tasks. A commonality-specificity supervision module is established: the commonality supervision mechanism discovers similar structural background distributions between different classes of images, and the specificity supervision mechanism identifies prominent regions in the image distribution and achieves semantic segmentation of the target object, which not only improves the sparsity of the localization region but also optimizes the segmentation boundary. A knowledge gap module constructs contrastive generated images with enhanced structural distribution; the knowledge gap between the contrastive generated images and the class images effectively overcomes the incomplete activation correspondence in mainstream methods and improves weakly supervised semantic segmentation performance at the image level.
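The abstract's idea that filters with different receptive fields disagree mainly near object boundaries can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions, not the patented computation: `mean_filter`, the 3x3 averaging kernel, the edge padding, and the 8x8 step-edge image are all hypothetical choices made for this example.

```python
import numpy as np


def mean_filter(image, dilation=1):
    """3x3 mean filter whose taps are `dilation` pixels apart (edge-padded).

    Hypothetical helper for illustration; dilation=1 is an ordinary 3x3
    filter, dilation=2 samples a 5x5 neighbourhood with the same 9 taps.
    """
    h, w = image.shape
    padded = np.pad(image, dilation, mode="edge")
    out = np.zeros((h, w))
    for di in range(3):
        for dj in range(3):
            out += padded[di * dilation:di * dilation + h,
                          dj * dilation:dj * dilation + w]
    return out / 9.0


# Toy image: left half is background (0), right half is object (1).
image = np.zeros((8, 8))
image[:, 4:] = 1.0

# Where the small and the large receptive field disagree.
difference = np.abs(mean_filter(image, dilation=2) - mean_filter(image, dilation=1))
```

In this toy case `difference` is zero in the flat regions and nonzero only in columns 2 and 5, the two columns where the wider filter already reaches across the edge while the narrow one does not. That is the sense in which a disagreement between receptive fields can flag an ambiguous boundary band.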
Opening claim text (preview).
The invention claimed is:
1. A weakly supervised semantic segmentation method based on the commonality-specificity supervision mechanism, comprising the following steps: establishing a class 1 dataset and a class 2 dataset, wherein the class 1 dataset contains class 1 images and their image level labels, and the class 2 dataset contains class 2 images and their image level labels; establishing a weakly supervised semantic segmentation model, which includes an embedding layer, a contrastive convolution module, a commonality-specificity supervision module, a generator, a discriminator, and a knowledge gap module, wherein the embedding layer is used to spatially map the class 1 images and the class 2 images to obtain embedded representations, the contrastive convolution module is used to spatially enhance the embedded representations to obtain enhanced distribution representations, the commonality-specificity supervision module is used to construct specificity supervision maps based on the enhanced distribution representations by using a commonality supervision mechanism, specificity supervision maps based on the commonality supervision maps are constructed by using a specificity supervision, a specific class target object region is constructed based on the commonality supervision maps and the specificity supervision map, the generator is used to generate contrast generated images based on the target object region, the discriminator is used to distinguish the authenticity of the contrast generated images, and the knowledge gap module is used to generate semantic segmentation results based on the class 1 images and their corresponding contrast generated images; establishing an objective function for the weakly supervised semantic segmentation model, wherein the objective function comprises an adversarial loss for training the generator and the discriminator and a consistency loss for constructing the structural consistency between the contrast generated images and the class 1 images based on semantic segmentation results; using the class 1 dataset and the class 2 dataset, and utilizing the objective function to optimize the parameters of the weakly supervised semantic segmentation model; using the weakly supervised semantic segmentation model optimized by parameters to segment the target image to be detected, and obtaining semantic segmentation annotations at the pixel level of the target image.
2. The weakly supervised semantic segmentation method based on the commonality-specificity supervision mechanism according to claim 1, wherein the embedding layer comprises a sequentially connected boundary filling layer, a two-dimensional convolutional layer, an instance regularization layer, and a linear rectification activation layer; the embedded representations Embedding1 of the class 1 images and the embedded representations Embedding2 of the class 2 images are obtained by spatially mapping the class 1 images and the class 2 images through the embedding layer.
3. The weakly supervised semantic segmentation method based on the commonality-specificity supervision mechanism according to claim 1, wherein the contrastive convolution module comprises a dual channel mode, wherein the first channel comprises a sequentially connected two-dimensional convolutional layer and a linear rectification activation layer, extracting the corresponding standard local representation S_Embedding1 based on the embedded representations Embedding1 of the class 1 images, and extracting the corresponding standard local representation S_Embedding2 based on the embedded representations Embedding2 of the class 2 images; the second channel comprises contrastive convolution, the contrastive convolution comprises an extended convolution layer and a two-dimensional convolution layer, extracting the corresponding difference representation D_Embedding1 based on the embedded representations Embedding1 of the class 1 images, and extracting the corresponding difference representation D_Embedding2 based on the embedded representations Embedding2 of the class 2 images; the contrastive convolution module also comprises a class activation maps calculation operation and an enhanced representation calculation operation, specifically: utilizing the difference representations D_Embedding1 and the difference representations D_Embedding2 to calculate class activation maps M1_ca corresponding to the class 1 images and class activation maps M2_ca corresponding to the class 2 images, respectively; taking the dot product of the class activation maps M1_ca and the standard local representation S_Embedding1 to obtain the enhanced distribution representations E_Embedding1 of the class 1 images, and taking the dot product of the class activation maps M2_ca and the standard local representation S_Embedding2 to obtain the enhanced distribution representations E_Embedding2.
4. The weakly supervised semantic segmentation method based on the commonality-specificity supervision mechanism according to claim 1, wherein, in the commonality-specificity supervision module, constructing the specificity supervision maps based on the enhanced distribution representations by using the commonality supervision mechanism comprises: projecting the enhanced distribution representations E_Embedding1 onto a Reshape layer for size adjustment to obtain a reorganized distribution E_Embedding1_re of the class 1 images; projecting the enhanced distribution representations E_Embedding2 onto an average buffer, where the enhanced distribution representations E_Embedding2 are arranged in order and the average value is calculated to represent a mean enhanced distribution representation E_Embedding2_ave of the class 2 images; projecting the mean enhanced distribution representation E_Embedding2_ave onto a SE layer to extract key structural features; calculating an element correlation matrix R for the class 1 images and the class 2 images based on the E_Embedding2_ave of the class 2 images and the reorganized distribution E_Embedding1_re of the class 1 images; calculating the specificity supervision maps M_c based on the element correlation matrix R and the reorganized distribution E_Embedding1_re.
5. The weakly supervised semantic segmentation method based on the commonality-specificity supervision mechanism according to claim 1, wherein, in the commonality-specificity supervision module, the specificity supervision maps based on the commonality supervision maps are constructed by using the specificity supervision, comprising: reverse mapping the specificity supervision maps M_c to obtain reverse mapping maps M_c′, the calculation process is: M_c′ = { 1, softmax(M_c(i, j)) <
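The layer order recited in claims 2 and 3 can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not the patentee's implementation: "boundary filling" is read as reflect padding, "instance regularization" as instance normalization, "linear rectification" as ReLU, "extended convolution" as dilated convolution, and "dot product" as an element-wise product. The claims do not say how the class activation map is computed from the difference representation, so the sigmoid of the channel mean used here is purely a placeholder, and all function names, kernel sizes, and weights are hypothetical.

```python
import numpy as np


def conv2d(x, w, dilation=1):
    """Same-size 2-D cross-correlation with reflect padding.

    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k.
    """
    c_out, _, k, _ = w.shape
    _, h, wd = x.shape
    pad = dilation * (k // 2)
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)), mode="reflect")
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i * dilation:i * dilation + h,
                       j * dilation:j * dilation + wd]
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


def relu(x):
    return np.maximum(x, 0.0)


def instance_norm(x, eps=1e-5):
    """Normalize each channel of a single image over its spatial positions."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def embedding_layer(image, w_embed):
    # Claim 2 order: padding -> 2-D convolution -> instance norm -> ReLU.
    return relu(instance_norm(conv2d(image, w_embed)))


def contrastive_conv_module(embedding, w_std, w_dilated, w_post, dilation=2):
    # First channel: 2-D convolution followed by ReLU (S_Embedding).
    s_embedding = relu(conv2d(embedding, w_std))
    # Second channel: dilated convolution, then 2-D convolution (D_Embedding).
    d_embedding = conv2d(conv2d(embedding, w_dilated, dilation), w_post)
    # ASSUMPTION: placeholder class activation map, not taken from the claims.
    cam = 1.0 / (1.0 + np.exp(-d_embedding.mean(axis=0, keepdims=True)))
    # ASSUMPTION: "dot product" interpreted as element-wise weighting.
    e_embedding = s_embedding * cam
    return s_embedding, d_embedding, cam, e_embedding


# Random stand-ins for one class 1 image and for learned weights.
rng = np.random.default_rng(0)
image1 = rng.normal(size=(3, 8, 8))
w_embed = rng.normal(size=(4, 3, 3, 3)) * 0.1
w_std = rng.normal(size=(4, 4, 3, 3)) * 0.1
w_dilated = rng.normal(size=(4, 4, 3, 3)) * 0.1
w_post = rng.normal(size=(4, 4, 3, 3)) * 0.1

embedding1 = embedding_layer(image1, w_embed)
s1, d1, m1_ca, e1 = contrastive_conv_module(embedding1, w_std, w_dilated, w_post)
```

The same two functions would be applied to a class 2 image to obtain Embedding2, S_Embedding2, D_Embedding2, M2_ca and E_Embedding2. Because the placeholder map lies between 0 and 1 and S_Embedding1 is non-negative after ReLU, E_Embedding1 here is simply a spatially attenuated copy of S_Embedding1.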
using classification, e.g. of video objects · CPC title
Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion · CPC title
Labelling scene content, e.g. deriving syntactic or semantic representations · CPC title
using neural networks · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title