Model training, image processing method, device, storage medium, and program product

US11928563B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11928563-B2
Application number: US-202117355347-A
Country: US
Kind code: B2
Filing date: Jun 23, 2021
Priority date: Dec 18, 2020
Publication date: Mar 12, 2024
Grant date: Mar 12, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present application provides a model training method, an image processing method, a device, a storage medium, and a program product relating to deep learning technology. Auxiliary image data is screened using the target image data for learning a target task, and the target image data and the auxiliary image data are then fused, so that a built, to-be-trained model is trained with the fused image data. This implementation increases the amount of data available for training the model, and because the training data is determined based on the target image data, it is suitable for learning the target task. Therefore, the solution provided by the present application can train an accurate target model even if the amount of target image data is insufficient.
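The screening step mentioned in the abstract (and detailed in claims 8 and 9) can be illustrated with a small sketch. It assumes the features have already been produced by the feature extraction network of the preset model. The patent text shown here does not name a similarity measure or a selection rule, so the cosine similarity, the `top_k` cutoff, and the function names `cosine_similarity` and `select_auxiliary` are all illustrative assumptions rather than the claimed method.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: the patent only says "similarity"; cosine is one common choice.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_auxiliary(target_features, source_features, top_k=1):
    """Return sorted indices of source items to keep as auxiliary data.

    Hypothetical rule: a source item is kept if it is among the top_k most
    similar source items for at least one target item.
    """
    selected = set()
    for target in target_features:
        ranked = sorted(
            range(len(source_features)),
            key=lambda i: cosine_similarity(target, source_features[i]),
            reverse=True,
        )
        selected.update(ranked[:top_k])
    return sorted(selected)


# Toy features standing in for the output of the feature extraction network.
target_features = [[1.0, 0.0], [0.9, 0.1]]
source_features = [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]]

auxiliary_indices = select_auxiliary(target_features, source_features, top_k=1)
print(auxiliary_indices)
```

In this toy input only the second source item points in roughly the same direction as the target features, so it is the only one selected. A threshold on the similarity score would be an equally plausible selection rule.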

First claim

Opening claim text (preview).

What is claimed is:

1. A model training method, comprising: acquiring target image data for learning a target task, and source image data for learning a preset task; determining auxiliary image data in the source image data according to the target image data; determining a to-be-trained model according to a trained preset model corresponding to the preset task and a preset classification network; determining fused image data according to the target image data and the auxiliary image data, and training the to-be-trained model with the fused image data, to obtain a target model for executing the target task.

2. The method according to claim 1, wherein the determining fused image data according to the target image data and the auxiliary image data comprises: acquiring a preset number of the target image data and the auxiliary image data; determining a plurality of data combinations, each data combination comprises one target image data and one auxiliary image data; and determining the fused image data according to a data combination.

3. The method according to claim 2, wherein the determining a plurality of data combinations comprises: treating the target image data and the auxiliary image data in a same acquisition order as one data combination.

4. The method according to claim 2, wherein the determining the fused image data according to the data combination comprises: fusing the target image and the auxiliary image belonging to a same data combination to determine a fused image; fusing a label of the target image and a label of the auxiliary image to determine a label of the fused image.

5. The method according to claim 4, further comprising: acquiring random weight value; the fusing the target image and the auxiliary image belonging to a same data combination to determine a fused image comprises: fusing the target image and the auxiliary image with the random weight value to obtain the fused image; the fusing a label of the target image and a label of the auxiliary image to determine a label of the fused image comprises: fusing the label of the target image and the label of the auxiliary image with the random weight value to obtain the label of the fused image.

6. The method according to claim 5, wherein the fused image is M = α*S + (1−α)*T; and the label of the fused image is LM = α*LS + (1−α)*LT; wherein α is the random weight value, T is the target image, S is the auxiliary image, LT is the label of the target image, and LS is the label of the auxiliary image.

7. The method according to claim 5, wherein the acquiring random weight value comprises: determining the random weight value based on β distribution.

8. The method according to claim 1, wherein the determining auxiliary image data in the source image data according to the target image data comprises: inputting a target image of each target image data into a feature extraction network of the preset model to obtain a target feature corresponding to each target image; inputting a source image of each source image data into the feature extraction network of the preset model to obtain a source feature corresponding to each source image; determining the auxiliary image data in the source image data according to each target feature and each source feature.

9. The method according to claim 8, wherein the determining the auxiliary image data in the source image data according to each target feature and each source feature comprises: determining a similarity between each target image and each source image according to each target feature and each source feature; determining source image data to which the source image similar to the target image belongs as the auxiliary image data according to the similarity.

10. The method according to claim 1, wherein the determining a to-be-trained model according to a trained preset model corresponding to the preset task and a preset classification network comprises: determining the to-be-trained network according to a feature extraction network in the preset model and the preset classification network.

11. The method according to claim 10, wherein an input dimension of the preset classification network is the same as an output dimension of the feature extraction network; and output dimension of the preset classification network is the same as the total number of labels comprised in the target image data and the auxiliary image data.

12. The method according to claim 1, wherein the determining fused image data according to the target image data and the auxiliary image data, and training the to-be-trained model with the fused image data comprises: cyclically executing the following steps for preset times to obtain the target model: determining the fused image data according to the target image data and the auxiliary image data; training the to-be-trained model with the fused image data.

13. An image processing method, comprising: acquiring a to-be-processed image; recognizing the to-be-processed image according to a target model to determine a classification result corresponding to the to-be-processed image; wherein the target model is the target model trained by the model training method according to claim 1.

14. An image processing apparatus, comprising: at least one processor; and a memory communicatively connected with the at least one processor; wherein, the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor is configured to: acquire a to-be-processed image; recognize the to-be-processed image according to a target model to determine a classification result corresponding to the to-be-processed image; wherein the target model is the target model trained by the model training method according to claim 1.

15. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the method according to claim 1.

16. A model training apparatus, comprising: at least one processor; and a memory communicatively connected with the at least one processor; wherein, the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor is configured to: acquire target image data for learning a target task, and source image data for learning a preset task; determine auxiliary image data in the source image data according to the target image data; determine a to-be-trained model according to a trained preset model corresponding to the preset task and a preset classification network; determine fused image data according to the target image data and the auxiliary image data; train the to-be-trained model with the fused image data to obtain a target model for executing the target task.

17. The apparatus according to claim 16, wherein the at least one processor is further configured to: acquire a preset number of the target image data and the auxiliary image data; determine a plurality of data combinations, each of which comprises one target image data and one auxiliary image data; determine the fused image data according to the data combination.

18. The apparatus according to claim 17, wherein the at least one processor is further configured to: treat the target image data and the auxiliary image data in a same acquisition order as one data combination.

19. The apparatus according to claim 17, wherein the at least on
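The fusion formulas in claims 5 to 7, M = α*S + (1−α)*T for images and LM = α*LS + (1−α)*LT for labels, can be sketched as follows. This is a minimal illustration under stated assumptions: images are flat lists of floats, labels are one-hot vectors over the combined label set, one α is drawn per pair, and the Beta parameters default to 1.0. None of those details are fixed by the claim text shown here, and the names `fuse_pair` and `fuse_batch` are hypothetical.

```python
import random


def fuse_pair(target_image, target_label, aux_image, aux_label, alpha):
    """Fuse one data combination with weight alpha.

    Image: M  = alpha * S  + (1 - alpha) * T
    Label: LM = alpha * LS + (1 - alpha) * LT
    where T/LT are the target image/label and S/LS the auxiliary image/label.
    """
    fused_image = [alpha * s + (1.0 - alpha) * t
                   for s, t in zip(aux_image, target_image)]
    fused_label = [alpha * ls + (1.0 - alpha) * lt
                   for ls, lt in zip(aux_label, target_label)]
    return fused_image, fused_label


def fuse_batch(targets, auxiliaries, rng, beta_a=1.0, beta_b=1.0):
    """Pair items in the same acquisition order (claim 3) and fuse each pair.

    Assumptions: a fresh alpha is drawn per pair from Beta(beta_a, beta_b);
    the patent only states that the weight is based on a beta distribution.
    """
    fused = []
    for (t_img, t_lbl), (s_img, s_lbl) in zip(targets, auxiliaries):
        alpha = rng.betavariate(beta_a, beta_b)
        fused.append(fuse_pair(t_img, t_lbl, s_img, s_lbl, alpha))
    return fused


# Toy data: 2-pixel "images" and one-hot labels over two classes.
targets = [([0.0, 0.0], [1.0, 0.0]), ([0.2, 0.4], [1.0, 0.0])]
auxiliaries = [([1.0, 1.0], [0.0, 1.0]), ([0.6, 0.8], [0.0, 1.0])]

fixed_image, fixed_label = fuse_pair([0.0, 0.0], [1.0, 0.0],
                                     [1.0, 1.0], [0.0, 1.0], alpha=0.25)
fused_batch = fuse_batch(targets, auxiliaries, random.Random(0))
print(fixed_image, fixed_label)
```

Because the image and the label share the same α, each fused label remains a valid probability vector whenever the input labels are. Per claim 12, a training loop would regenerate the fused data on each cycle, so the model sees differently weighted combinations over time.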

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Transfer learning · CPC title

  • G06N20/00 (Primary)

    Machine learning · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11928563B2 cover?
The present application provides a model training, image processing method, device, storage medium, and program product relating to deep learning technology, which are able to screen auxiliary image data with image data for learning a target task, and further fuse the target image data and the auxiliary image data, so as to train a built and to-be-trained model with the fusion-processed fused i…
Who is the assignee on this patent?
Beijing Baidu Netcom Sci & Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 12, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).