Who is the assignee on this patent?

Beijing Baidu Netcom Sci & Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06V10/82. Mapped technology areas include Physics.

When was this patent published?

Publication date Aug 5, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Lightweight model training method, image processing method, electronic device, and storage medium

US12380328B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12380328-B2
Application number	US-202318108956-A
Country	US
Kind code	B2
Filing date	Feb 13, 2023
Priority date	Aug 30, 2022
Publication date	Aug 5, 2025
Grant date	Aug 5, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Provided is a lightweight model training method, an image processing method, a device and a medium. The lightweight model training method includes: acquiring first and second augmentation probabilities and a target weight adopted in an e-th iteration; performing data augmentation on a data set based on the first and second augmentation probabilities respectively, to obtain first and second data sets; obtaining a first output value of a student model and a second output value of a teacher model based on the first data set; obtaining a third output value and a fourth output value based on the second data set; determining a distillation loss function, a truth-value loss function and a target loss function; training the student model based on the target loss function; and determining a first augmentation probability or target weight to be adopted in an (e+1)-th iteration in a case of e is less than E.
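The augmentation step in the abstract can be illustrated with a short Python sketch. It assumes that an "augmentation probability" means each sample is augmented independently with that probability; the abstract does not spell this out. The names `augment_dataset` and `make_two_data_sets` and the toy `augment` callable are hypothetical, not taken from the patent.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def augment_dataset(samples: Sequence[T], prob: float,
                    augment: Callable[[T], T], rng: random.Random) -> List[T]:
    """Apply `augment` to each sample independently with probability `prob`.

    Assumption: this per-sample reading of "augmentation probability" is
    an interpretation, not something stated on this page.
    """
    # rng.random() is in [0.0, 1.0), so prob=0.0 never augments
    # and prob=1.0 always augments.
    return [augment(s) if rng.random() < prob else s for s in samples]


def make_two_data_sets(samples: Sequence[T], p1: float, p2: float,
                       augment: Callable[[T], T],
                       seed: int = 0) -> Tuple[List[T], List[T]]:
    """Build the first and second data sets from the same source data,
    using the first (p1) and second (p2) augmentation probabilities."""
    rng = random.Random(seed)
    first = augment_dataset(samples, p1, augment, rng)
    second = augment_dataset(samples, p2, augment, rng)
    return first, second


if __name__ == "__main__":
    data = [1, 2, 3]
    # Toy augmentation: negate the value (stands in for an image transform).
    first, second = make_two_data_sets(data, 0.0, 1.0, lambda x: -x)
    print(first, second)
```

In the method as described, both the student and the teacher would then be run on each of the two data sets, giving the four output values named in the abstract.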

First claim

Opening claim text (preview).

What is claimed is:

1. A lightweight model training method, comprising: acquiring a first augmentation probability, a second augmentation probability and a target weight adopted in an e-th iteration, the target weight being a weight of a distillation loss value, e being a positive integer not greater than E, and E being a maximum quantity of iterations and being a positive integer greater than 1; performing data augmentation on a data set based on the first augmentation probability and the second augmentation probability respectively, to obtain a first data set and a second data set; obtaining a first output value of a student model and a second output value of a teacher model based on the first data set; obtaining a third output value of the student model and a fourth output value of the teacher model based on the second data set, and the student model being a lightweight model; determining a distillation loss function based on the first output value and the second output value; determining a truth-value loss function based on the third output value and the fourth output value; determining a target loss function based on the distillation loss function and the truth-value loss function; training the student model based on the target loss function; and determining a first augmentation probability or target weight to be adopted in an (e+1)-th iteration in a case of e is less than E.

2. The method of claim 1, further comprising: acquiring a maximum augmentation probability; and determining the second augmentation probability based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability.

3. The method of claim 2, wherein determining the first augmentation probability to be adopted in the (e+1)-th iteration, comprises: determining the first augmentation probability to be adopted in the (e+1)-th iteration based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability of the e-th iteration.

4. The method of claim 1, further comprising: acquiring a maximum target weight; wherein determining the target weight to be adopted in the (e+1)-th iteration, comprises: determining the target weight to be adopted in the (e+1)-th iteration based on the maximum target weight, the maximum quantity of iterations, and the target weight of the e-th iteration.

5. The method of claim 1, wherein determining the target loss function based on the distillation loss function and the truth-value loss function, comprises: determining the distillation loss function as the target loss function in a case of the target weight is not less than the maximum target weight or the distillation loss function is not less than the truth-value loss function; and determining the truth-value loss function as the target loss function in other cases.

6. The method of claim 1, wherein determining the distillation loss function based on the first output value and the second output value, comprises: determining the distillation loss function according to a formula as follow: $l_1 = (a + a_{dft} \times 2/E) \times L_{dist}(o_{1s}, o_{1t}) + (1 - a - a_{dft} \times 2/E) \times L_{gt}(o_{1s}, gt)$; wherein $l_1$ represents the distillation loss function, $L_{dist}(o_{1s}, o_{1t})$ represents a distillation loss value determined according to the first output value and the second output value, $L_{gt}(o_{1s}, gt)$ represents a truth-value loss value determined according to the first output value and a truth-value, $a$ represents the target weight, $a_{dft}$ represents a maximum target weight, $E$ represents the maximum quantity of iterations, $gt$ represents the truth-value, $o_{1s}$ represents the first output value, and $o_{1t}$ represents the second output value.

7. The method of claim 1, wherein determining the truth-value loss function based on the third output value and the fourth output value, comprises: determining the truth-value loss function according to a formula as follow: $l_2 = a \times L_{dist}(o_{2s}, o_{2t}) + (1 - a) \times L_{gt}(o_{2s}, gt)$; wherein $l_2$ represents the truth-value loss function, $L_{dist}(o_{2s}, o_{2t})$ represents a distillation loss value determined according to the third output value and the fourth output value, $L_{gt}(o_{2s}, gt)$ represents a truth-value loss value determined according to the third output value and a truth-value, $a$ represents the target weight, $gt$ represents the truth-value, $o_{2s}$ represents the third output value, and $o_{2t}$ represents the fourth output value.

8. An image processing method, comprising: receiving an image to be processed in a target scene; and inputting the image to be processed into a student model, to acquire a processed result of the image to be processed output by the student model; wherein the student model is obtained by adopting the lightweight model training method of claim 1.

9. The method of claim 8, wherein receiving the image to be processed in the target scene, comprises at least one of: acquiring an image to be processed in an image classification scene; acquiring an image to be processed in an image recognition scene; or acquiring an image to be processed in a target detection scene.

10. An electronic device, comprising: at least one processor; and a memory connected in communication with the at least one processor; wherein the memory stores an instruction executable by the at least one processor, and the instruction, when executed by the at least one processor, enables the at least one processor to execute operations, comprising: acquiring a first augmentation probability, a second augmentation probability and a target weight adopted in an e-th iteration, the target weight being a weight of a distillation loss value, e being a positive integer not greater than E, and E being a maximum quantity of iterations and being a positive integer greater than 1; performing data augmentation on a data set based on the first augmentation probability and the second augmentation probability respectively, to obtain a first data set and a second data set; obtaining a first output value of a student model and a second output value of a teacher model based on the first data set; obtaining a third output value of the student model and a fourth output value of the teacher model based on the second data set, and the student model being a lightweight model; determining a distillation loss function based on the first output value and the second output value; determining a truth-value loss function based on the third output value and the fourth output value; determining a target loss function based on the distillation loss function and the truth-value loss function; training the student model based on the target loss function; and determining a first augmentation probability or target weight to be adopted in an (e+1)-th iteration in a case of e is less than E.

11. The electronic device of claim 10, wherein the operations further comprise: acquiring a maximum augmentation probability; and determining the second augmentation probability based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability.

12. The electronic device of claim 11, wherein determining the first augmentation probability to be adopted in the (e+1)-th iteration, comprises: determining the first augmentation probability to be adopted in the (e+1)-th iteration based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability of the e-th iteration.

13. The electronic device of claim 10, wherein the operations further comprise: acquiring a maximum target weight; wherein de
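As a reading aid, the loss formulas in claims 5 to 7 can be sketched in Python. This is a minimal illustration under stated assumptions, not the patentee's implementation: the function names are hypothetical, and the distillation and truth-value loss values are passed in as precomputed floats because the claims shown here do not say how those values are computed.

```python
def distillation_loss(l_dist_1: float, l_gt_1: float,
                      a: float, a_dft: float, E: int) -> float:
    """Claim 6: l1 = (a + a_dft*2/E) * L_dist(o1s, o1t)
                   + (1 - a - a_dft*2/E) * L_gt(o1s, gt).

    l_dist_1 and l_gt_1 are the loss values on the first data set.
    """
    w = a + a_dft * 2 / E  # weight on the distillation term
    return w * l_dist_1 + (1 - w) * l_gt_1


def truth_value_loss(l_dist_2: float, l_gt_2: float, a: float) -> float:
    """Claim 7: l2 = a * L_dist(o2s, o2t) + (1 - a) * L_gt(o2s, gt).

    l_dist_2 and l_gt_2 are the loss values on the second data set.
    """
    return a * l_dist_2 + (1 - a) * l_gt_2


def target_loss(l1: float, l2: float, a: float, a_dft: float) -> float:
    """Claim 5: use l1 when the target weight a is not less than the
    maximum target weight a_dft, or when l1 is not less than l2;
    use l2 in all other cases."""
    if a >= a_dft or l1 >= l2:
        return l1
    return l2


if __name__ == "__main__":
    # Illustrative numbers only; they do not come from the patent.
    l1 = distillation_loss(l_dist_1=2.0, l_gt_1=1.0, a=0.5, a_dft=1.0, E=100)
    l2 = truth_value_loss(l_dist_2=1.0, l_gt_2=3.0, a=0.5)
    print(l1, l2, target_loss(l1, l2, a=0.5, a_dft=1.0))
```

Note that the two weights in claim 6 sum to 1, so `l1` is a convex combination of the two loss values whenever `a + a_dft*2/E` lies between 0 and 1.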

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

G06V10/82 · Primary
using neural networks · CPC title
G06N3/04
Architecture, e.g. interconnection topology · CPC title
G06N3/08 · Primary
Learning methods · CPC title
G06N3/082 · Primary
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
G06V10/7792
the supervisor being an automated module, e.g. "intelligent oracle" · CPC title

Patent family

Related publications grouped by family.

View patent family 84300516

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12380328B2 cover?: Provided is a lightweight model training method, an image processing method, a device and a medium. The lightweight model training method includes: acquiring first and second augmentation probabilities and a target weight adopted in an e-th iteration; performing data augmentation on a data set based on the first and second augmentation probabilities respectively, to obtain first and second data…

Related patents

Cross-task distillation to improve depth estimation

Efficient video processing via dynamic knowledge propagation

Mobile ai

Mobile ai

Device and method for compressing machine learning model
