Method and apparatus for training semantic segmentation model, computer device, and storage medium

US11398034B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11398034-B2
Application number: US-201816759383-A
Country: US
Kind code: B2
Filing date: Jul 13, 2018
Priority date: Apr 20, 2018
Publication date: Jul 26, 2022
Grant date: Jul 26, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and apparatus for training a semantic segmentation model, a computer device, and a storage medium are described herein. The method includes: constructing a training sample set; inputting the training sample set into a deep network model for training; inputting the training sample set into a weight transfer function for training to obtain a bounding box prediction mask parameter; and constructing a semantic segmentation model.
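
The first step of the abstract, constructing the training sample set, is specified in more detail in the first claim: first-category objects carry bounding boxes and segmentation masks, second-category objects carry bounding boxes only, and there are more second-category categories than first-category ones. The following is a minimal sketch of that partition, not the patent's implementation. The `Annotation` class, the `build_training_set` function, the `(x, y, width, height)` box layout, and the example categories are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """One labelled object. All field names here are illustrative assumptions."""
    category: str
    bbox: tuple                   # assumed layout: (x, y, width, height)
    mask: Optional[list] = None   # None when only a bounding box is available


def build_training_set(annotations, mask_categories):
    """Split annotations into first-category (box + mask) and
    second-category (box only) objects, as described in claim 1."""
    first, second = [], []
    for ann in annotations:
        if ann.category in mask_categories:
            if ann.mask is None:
                raise ValueError(f"{ann.category!r} needs a segmentation mask")
            first.append(ann)
        else:
            # Second-category objects are marked with bounding boxes only.
            second.append(Annotation(ann.category, ann.bbox, None))

    n_first = len({a.category for a in first})
    n_second = len({a.category for a in second})
    # Claim 1: the number of second-category categories exceeds the first.
    if n_second <= n_first:
        raise ValueError("expected more box-only categories than masked ones")
    return first, second


annotations = [
    Annotation("person", (10, 20, 50, 80), mask=[[0, 1], [1, 1]]),
    Annotation("car", (0, 0, 30, 15)),
    Annotation("dog", (5, 5, 12, 9)),
]
first, second = build_training_set(annotations, mask_categories={"person"})
```

Here one category has masks and two have boxes only, so the category-count condition from the claim holds and the split succeeds.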

First claim

Opening claim text (preview).

What is claimed is:

1. A method for training a semantic segmentation model, comprising: constructing a training sample set, wherein the training sample set comprises a plurality of first-category objects and a plurality of second-category objects, wherein the first-category objects are marked with bounding boxes and segmentation masks, and the second-category objects are marked with bounding boxes; inputting the training sample set into a deep network model for training to obtain first bounding box parameters and first mask parameters of the first-category objects and second bounding box parameters of the second-category objects; and inputting the first bounding box parameters and the first mask parameters into a weight transfer function for training to obtain a bounding box prediction mask parameter; and inputting the first bounding box parameters, the first mask parameters, the second bounding box parameters, and the bounding box prediction mask parameter into the deep network model and the weight transfer function to construct a semantic segmentation model; wherein a category number of the second-category objects is greater than that of the first-category objects; wherein the deep network model is a Mask-RCNN network model; wherein an expression of the weight transfer function is: $\omega_{seg}^{c} = \tau(\omega_{det}^{c}; \theta)$, $\omega_{det}^{c} = [\omega_{cls}^{c}, \omega_{box}^{c}]$, wherein $\tau$ denotes a transfer function, $\omega_{cls}$ denotes a weight of a category, $\omega_{box}$ denotes a weight of a bounding box, $\omega_{det}$ denotes a merged vector, $\theta$ denotes a learning parameter of an unknown category, and $\omega_{seg}$ denotes the bounding box prediction mask parameter.

2. The method for training a semantic segmentation model according to claim 1, wherein after the step of inputting the first bounding box parameters, the first mask parameters, the second bounding box parameters, and the bounding box prediction mask parameter into the deep network model and the weight transfer function to construct a semantic segmentation model, the method comprises: inputting an image to be segmented into the semantic segmentation model to output a semantic segmentation result of the image to be segmented.

3. The method for training a semantic segmentation model according to claim 2, wherein the step of inputting an image to be segmented into the semantic segmentation model to output a semantic segmentation result of the image to be segmented comprises: inputting the image to be segmented into the semantic segmentation model, predicting bounding boxes of the first-category objects in the image to be segmented by using the first bounding box parameters, and predicting bounding boxes of the second-category objects in the image to be segmented by using the second bounding box parameters; predicting mask parameters of the first-category objects in the image to be segmented by using the bounding boxes of the first-category objects and the bounding box prediction mask parameter, and predicting mask parameters of the second-category objects in the image to be segmented by using the bounding boxes of the second-category objects and the bounding box prediction mask parameter; and performing semantic segmentation on the first-category objects and the second-category objects in the image to be segmented by using the mask parameters of the first-category objects and the mask parameters of the second-category objects in the image to be segmented.

4. The method for training a semantic segmentation model according to claim 1, wherein the weight transfer function is a two-layer fully connected neural network, wherein the two fully connected layers have 5120 neurons and 256 neurons, respectively, and an activation function used is LeakyReLU.

5. A computer device, comprising a memory storing computer readable instructions and a processor, wherein a method for training a semantic segmentation model is implemented when the processor executes the computer readable instructions, and the method comprises: constructing a training sample set, wherein the training sample set comprises first-category objects and second-category objects, wherein the first-category objects are marked with bounding boxes and segmentation masks, and the second-category objects are marked with bounding boxes; inputting the training sample set into a deep network model for training to obtain first bounding box parameters and first mask parameters of the first-category objects and second bounding box parameters of the second-category objects; and inputting the first bounding box parameters and the first mask parameters into a weight transfer function for training to obtain a bounding box prediction mask parameter; and inputting the first bounding box parameters, the first mask parameters, the second bounding box parameters, and the bounding box prediction mask parameter into the deep network model and the weight transfer function to construct a semantic segmentation model; wherein a category number of the second-category objects is greater than that of the first-category objects; wherein the deep network model is a Mask-RCNN network model; wherein an expression of the weight transfer function is: $\omega_{seg}^{c} = \tau(\omega_{det}^{c}; \theta)$, $\omega_{det}^{c} = [\omega_{cls}^{c}, \omega_{box}^{c}]$, wherein $\tau$ denotes a transfer function, $\omega_{cls}$ denotes a weight of a category, $\omega_{box}$ denotes a weight of a bounding box, $\omega_{det}$ denotes a merged vector, $\theta$ denotes a learning parameter of an unknown category, and $\omega_{seg}$ denotes the bounding box prediction mask parameter.

6. The computer device according to claim 5, wherein after the step of inputting, by the processor, the first bounding box parameters, the first mask parameters, the second bounding box parameters, and the bounding box prediction mask parameter into the deep network model and the weight transfer function to construct a semantic segmentation model, the method comprises: inputting an image to be segmented into the semantic segmentation model to output a semantic segmentation result of the image to be segmented.

7. The computer device according to claim 6, wherein the step of inputting, by the processor, an image to be segmented into the semantic segmentation model to output a semantic segmentation result of the image to be segmented comprises: inputting the image to be segmented into the semantic segmentation model, predicting bounding boxes of the first-category objects in the image to be segmented by using the first bounding box parameters, and predicting bounding boxes of the second-category objects in the image to be segmented by using the second bounding box parameters; predicting mask parameters of the first-category objects in the image to be segmented by using the bounding boxes of the first-category objects and the bounding box prediction mask parameter, and predicting mask parameters of the second-category objects in the image to be segmented by using the bounding boxes of the second-category objects and the bounding box prediction mask parameter; and performing semantic segmentation on the first-category objects and the second-category objects in the image to be segmented by using the mask parameters of the first-category objects and the mask parameters of the second-category objects in the image to be segmented.

8. The computer device according to claim 5, wherein the weight transfer function is a two-layer fully connected neural network, wherein the two fully connected layers have 5120 neurons and 256 neurons, respectively, and an activation function used is LeakyReLU.

9. A non-transitory computer readable storage medium storing computer readable instructions, wherein a method for training a semantic segmentation model is implemented when the computer readable instructions are execu
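
The weight transfer function recited in claims 1 and 4 maps a category's merged detection weights (category weight concatenated with bounding box weight) to a mask weight through a two-layer fully connected network with LeakyReLU. The sketch below illustrates that shape in plain Python under several assumptions that the claim text does not settle: that LeakyReLU is applied after the first layer only, that the negative slope is 0.01, that biases are present, and that weights are randomly initialised rather than trained. The names `make_transfer_function`, `dense`, and `init_layer` are hypothetical, and the demo uses small layer sizes in place of the claimed 5120 and 256 neurons so that it runs quickly.

```python
import random


def leaky_relu(x, slope=0.01):
    # Negative slope of 0.01 is an assumed default, not stated in the claims.
    return x if x > 0 else slope * x


def dense(vec, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_layer(n_in, n_out, rng):
    """Random untrained weights; stands in for the learned parameter theta."""
    scale = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def make_transfer_function(det_dim, hidden_dim, seg_dim, seed=0):
    """Return tau(w_cls, w_box) -> w_seg as a two-layer network."""
    rng = random.Random(seed)
    layer1 = init_layer(det_dim, hidden_dim, rng)
    layer2 = init_layer(hidden_dim, seg_dim, rng)

    def tau(w_cls, w_box):
        # w_det = [w_cls, w_box]: the merged vector from claim 1.
        w_det = list(w_cls) + list(w_box)
        if len(w_det) != det_dim:
            raise ValueError(f"expected {det_dim} inputs, got {len(w_det)}")
        hidden = [leaky_relu(h) for h in dense(w_det, *layer1)]
        return dense(hidden, *layer2)  # w_seg for this category

    return tau


# Scaled-down demo sizes; claim 4 recites layers of 5120 and 256 neurons.
tau = make_transfer_function(det_dim=10, hidden_dim=16, seg_dim=4)
w_seg = tau(w_cls=[0.1] * 2, w_box=[0.2] * 8)
```

Because the function takes only detection weights as input, the same `tau` can be evaluated for a category that has bounding box annotations but no masks, which is the situation the claims describe for second-category objects.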

Assignees

Inventors

Classifications

  • Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion

  • Classification techniques

  • using neural networks

  • G06T7/11 (Primary): Region-based segmentation

  • G06T7/10 (Primary): Segmentation; Edge detection (motion-based segmentation G06T7/215)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11398034B2 cover?
A method and apparatus for training a semantic segmentation model, a computer device, and a storage medium are described herein. The method includes: constructing a training sample set; inputting the training sample set into a deep network model for training; inputting the training sample set into a weight transfer function for training to obtain a bounding box prediction mask parameter; and co…
Who is the assignee on this patent?
Ping An Tech Shenzhen Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06T7/11. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 26, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).