Cross-task distillation to improve depth estimation
US-2023005165-A1 · Jan 5, 2023 · US
US12380328B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12380328-B2 |
| Application number | US-202318108956-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 13, 2023 |
| Priority date | Aug 30, 2022 |
| Publication date | Aug 5, 2025 |
| Grant date | Aug 5, 2025 |
Official abstract text for this publication.
Provided is a lightweight model training method, an image processing method, a device and a medium. The lightweight model training method includes: acquiring first and second augmentation probabilities and a target weight adopted in an e-th iteration; performing data augmentation on a data set based on the first and second augmentation probabilities respectively, to obtain first and second data sets; obtaining a first output value of a student model and a second output value of a teacher model based on the first data set; obtaining a third output value and a fourth output value based on the second data set; determining a distillation loss function, a truth-value loss function and a target loss function; training the student model based on the target loss function; and determining a first augmentation probability or target weight to be adopted in an (e+1)-th iteration in a case of e is less than E.
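The data flow in the abstract (two augmented copies of the data set, each passed through both the student and the teacher) can be sketched as below. This is a minimal illustration, not the patented implementation: the scalar samples, the jitter-based `augment` function, and the names `forward_both`, `student`, and `teacher` are all assumptions introduced for the example, and the abstract does not say which augmentation operations are used.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in: a "model" is any function from a sample to an output.
Model = Callable[[float], float]


def augment(data: Sequence[float], prob: float, rng: random.Random) -> List[float]:
    """Toy augmentation (assumption): jitter each sample with probability `prob`."""
    return [x + rng.uniform(-0.1, 0.1) if rng.random() < prob else x for x in data]


def forward_both(
    student: Model,
    teacher: Model,
    data: Sequence[float],
    p1: float,
    p2: float,
    rng: random.Random,
) -> Tuple[Tuple[List[float], List[float]], Tuple[List[float], List[float]]]:
    """Return ((o1s, o1t), (o2s, o2t)) for one iteration.

    p1 and p2 are the first and second augmentation probabilities.
    """
    d1 = augment(data, p1, rng)  # first data set
    d2 = augment(data, p2, rng)  # second data set
    o1s = [student(x) for x in d1]  # first output value (student)
    o1t = [teacher(x) for x in d1]  # second output value (teacher)
    o2s = [student(x) for x in d2]  # third output value (student)
    o2t = [teacher(x) for x in d2]  # fourth output value (teacher)
    return (o1s, o1t), (o2s, o2t)
```

In this sketch both models see the same augmented samples within each data set, so the teacher's outputs remain comparable to the student's.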
Opening claim text (preview).
What is claimed is:
1. A lightweight model training method, comprising: acquiring a first augmentation probability, a second augmentation probability and a target weight adopted in an e-th iteration, the target weight being a weight of a distillation loss value, e being a positive integer not greater than E, and E being a maximum quantity of iterations and being a positive integer greater than 1; performing data augmentation on a data set based on the first augmentation probability and the second augmentation probability respectively, to obtain a first data set and a second data set; obtaining a first output value of a student model and a second output value of a teacher model based on the first data set; obtaining a third output value of the student model and a fourth output value of the teacher model based on the second data set, and the student model being a lightweight model; determining a distillation loss function based on the first output value and the second output value; determining a truth-value loss function based on the third output value and the fourth output value; determining a target loss function based on the distillation loss function and the truth-value loss function; training the student model based on the target loss function; and determining a first augmentation probability or target weight to be adopted in an (e+1)-th iteration in a case of e is less than E.
2. The method of claim 1, further comprising: acquiring a maximum augmentation probability; and determining the second augmentation probability based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability.
3. The method of claim 2, wherein determining the first augmentation probability to be adopted in the (e+1)-th iteration, comprises: determining the first augmentation probability to be adopted in the (e+1)-th iteration based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability of the e-th iteration.
4. The method of claim 1, further comprising: acquiring a maximum target weight; wherein determining the target weight to be adopted in the (e+1)-th iteration, comprises: determining the target weight to be adopted in the (e+1)-th iteration based on the maximum target weight, the maximum quantity of iterations, and the target weight of the e-th iteration.
5. The method of claim 1, wherein determining the target loss function based on the distillation loss function and the truth-value loss function, comprises: determining the distillation loss function as the target loss function in a case of the target weight is not less than the maximum target weight or the distillation loss function is not less than the truth-value loss function; and determining the truth-value loss function as the target loss function in other cases.
6. The method of claim 1, wherein determining the distillation loss function based on the first output value and the second output value, comprises: determining the distillation loss function according to a formula as follow: $l_1 = (a + a_{dft} \times 2/E) \times L_{dist}(o_{1s}, o_{1t}) + (1 - a - a_{dft} \times 2/E) \times L_{gt}(o_{1s}, gt)$; wherein $l_1$ represents the distillation loss function, $L_{dist}(o_{1s}, o_{1t})$ represents a distillation loss value determined according to the first output value and the second output value, $L_{gt}(o_{1s}, gt)$ represents a truth-value loss value determined according to the first output value and a truth-value, $a$ represents the target weight, $a_{dft}$ represents a maximum target weight, $E$ represents the maximum quantity of iterations, $gt$ represents the truth-value, $o_{1s}$ represents the first output value, and $o_{1t}$ represents the second output value.
7. The method of claim 1, wherein determining the truth-value loss function based on the third output value and the fourth output value, comprises: determining the truth-value loss function according to a formula as follow: $l_2 = a \times L_{dist}(o_{2s}, o_{2t}) + (1 - a) \times L_{gt}(o_{2s}, gt)$; wherein $l_2$ represents the truth-value loss function, $L_{dist}(o_{2s}, o_{2t})$ represents a distillation loss value determined according to the third output value and the fourth output value, $L_{gt}(o_{2s}, gt)$ represents a truth-value loss value determined according to the third output value and a truth-value, $a$ represents the target weight, $gt$ represents the truth-value, $o_{2s}$ represents the third output value, and $o_{2t}$ represents the fourth output value.
8. An image processing method, comprising: receiving an image to be processed in a target scene; and inputting the image to be processed into a student model, to acquire a processed result of the image to be processed output by the student model; wherein the student model is obtained by adopting the lightweight model training method of claim 1.
9. The method of claim 8, wherein receiving the image to be processed in the target scene, comprises at least one of: acquiring an image to be processed in an image classification scene; acquiring an image to be processed in an image recognition scene; or acquiring an image to be processed in a target detection scene.
10. An electronic device, comprising: at least one processor; and a memory connected in communication with the at least one processor; wherein the memory stores an instruction executable by the at least one processor, and the instruction, when executed by the at least one processor, enables the at least one processor to execute operations, comprising: acquiring a first augmentation probability, a second augmentation probability and a target weight adopted in an e-th iteration, the target weight being a weight of a distillation loss value, e being a positive integer not greater than E, and E being a maximum quantity of iterations and being a positive integer greater than 1; performing data augmentation on a data set based on the first augmentation probability and the second augmentation probability respectively, to obtain a first data set and a second data set; obtaining a first output value of a student model and a second output value of a teacher model based on the first data set; obtaining a third output value of the student model and a fourth output value of the teacher model based on the second data set, and the student model being a lightweight model; determining a distillation loss function based on the first output value and the second output value; determining a truth-value loss function based on the third output value and the fourth output value; determining a target loss function based on the distillation loss function and the truth-value loss function; training the student model based on the target loss function; and determining a first augmentation probability or target weight to be adopted in an (e+1)-th iteration in a case of e is less than E.
11. The electronic device of claim 10, wherein the operations further comprise: acquiring a maximum augmentation probability; and determining the second augmentation probability based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability.
12. The electronic device of claim 11, wherein determining the first augmentation probability to be adopted in the (e+1)-th iteration, comprises: determining the first augmentation probability to be adopted in the (e+1)-th iteration based on the maximum augmentation probability, the maximum quantity of iterations and the first augmentation probability of the e-th iteration.
13. The electronic device of claim 10, wherein the operations further comprise: acquiring a maximum target weight; wherein de
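The formulas in claims 6 and 7 and the selection rule in claim 5 reduce to simple arithmetic once the individual loss values are known. The sketch below assumes those values (the distillation loss value and the truth-value loss value for each data set) have already been computed as plain floats; the claims do not specify how they are computed, and the function and parameter names here are hypothetical.

```python
def distillation_loss_fn(l_dist_1: float, l_gt_1: float, a: float, a_dft: float, E: int) -> float:
    """Claim 6: l1 = (a + a_dft*2/E) * L_dist(o1s, o1t) + (1 - a - a_dft*2/E) * L_gt(o1s, gt)."""
    w = a + a_dft * 2 / E  # weight applied to the distillation loss value
    return w * l_dist_1 + (1 - w) * l_gt_1


def truth_value_loss_fn(l_dist_2: float, l_gt_2: float, a: float) -> float:
    """Claim 7: l2 = a * L_dist(o2s, o2t) + (1 - a) * L_gt(o2s, gt)."""
    return a * l_dist_2 + (1 - a) * l_gt_2


def target_loss(l1: float, l2: float, a: float, a_dft: float) -> float:
    """Claim 5: pick l1 if a >= a_dft or l1 >= l2; otherwise pick l2."""
    if a >= a_dft or l1 >= l2:
        return l1
    return l2


# Example with made-up numbers (assumptions, not values from the patent).
l1 = distillation_loss_fn(l_dist_1=1.0, l_gt_1=2.0, a=0.2, a_dft=0.5, E=10)
l2 = truth_value_loss_fn(l_dist_2=1.0, l_gt_2=2.0, a=0.2)
chosen = target_loss(l1, l2, a=0.2, a_dft=0.5)
```

With these example numbers the distillation weight in claim 6 is 0.2 + 0.5 × 2/10 = 0.3, so `l1` is about 1.7 and `l2` is about 1.8; since the target weight is below its maximum and `l1` is smaller than `l2`, the rule of claim 5 selects `l2`.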
using neural networks · CPC title
Architecture, e.g. interconnection topology · CPC title
Learning methods · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
the supervisor being an automated module, e.g. "intelligent oracle" · CPC title