Neural network model training method, image processing method, and apparatus

US12555362B2 · US · B2

Patent metadata
Publication number: US-12555362-B2
Application number: US-202318316365-A
Country: US
Kind code: B2
Filing date: May 12, 2023
Priority date: Nov 13, 2020
Publication date: Feb 17, 2026
Grant date: Feb 17, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The technology of this application relates to a neural network model training method, an image processing method, and an apparatus in the field of artificial intelligence. In the training method, each of at least one first accelerator trains a neural network model based on at least one training sample. Before forward computation at an i-th layer is performed, different parameters of the i-th layer are obtained locally and from another accelerator, to obtain a complete model parameter of the i-th layer. According to the method in this application, storage pressure on the first accelerator can be reduced.
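The gather-then-compute idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `Accelerator` class, the helper names, the contiguous sharding scheme, the toy dense layer with ReLU, and the parameter-count bookkeeping. It also releases the gathered parameters after each layer, which the abstract does not state (claim 2 does).

```python
from dataclasses import dataclass, field


@dataclass
class Accelerator:
    """Hypothetical stand-in for one device; it stores one shard of every layer."""
    rank: int
    shards: dict = field(default_factory=dict)  # layer index -> list of floats


def shard_layer(weights, world_size):
    """Split a flat weight list into contiguous shards, one per accelerator."""
    step = (len(weights) + world_size - 1) // world_size
    return [weights[r * step:(r + 1) * step] for r in range(world_size)]


def gather_layer(layer, accelerators):
    """Build the complete parameter of a layer from all shards, in rank order."""
    full = []
    for acc in sorted(accelerators, key=lambda a: a.rank):
        full.extend(acc.shards[layer])
    return full


def forward(x, num_layers, me, accelerators):
    """Toy forward pass on accelerator `me`; also tracks peak resident parameters."""
    d = len(x)
    resident = sum(len(s) for s in me.shards.values())  # local shards only
    peak = resident
    for i in range(num_layers):
        full = gather_layer(i, accelerators)  # local shard plus shards from peers
        peak = max(peak, resident + len(full) - len(me.shards[i]))
        # Dense layer (row-major d x d weights) followed by ReLU.
        x = [max(0.0, sum(full[r * d + c] * x[c] for c in range(d)))
             for r in range(d)]
        del full  # drop the gathered copy before moving to the next layer
    return x, peak


# Two accelerators, three layers, each layer a 2 x 2 identity matrix.
world = [Accelerator(rank=0), Accelerator(rank=1)]
for layer in range(3):
    for acc, shard in zip(world, shard_layer([1.0, 0.0, 0.0, 1.0], len(world))):
        acc.shards[layer] = shard

out, peak = forward([1.0, 2.0], 3, world[0], world)
```

In this toy setup accelerator 0 never holds more than 8 parameters at once (its 6 sharded values plus the 2 borrowed for the current layer), whereas a fully replicated copy of the model would need 12.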

First claim

Opening claim text (preview).

What is claimed is:

1. A neural network model training method, comprising: obtaining, by at least one first accelerator, at least one training sample; obtaining a forward computation result by performing, by the at least one first accelerator, forward computation of a neural network model on the at least one training sample, wherein before performing the forward computation at an i-th layer in the neural network model, the at least one first accelerator obtains a complete model parameter of the i-th layer by obtaining different parameters of the i-th layer locally and from another accelerator, wherein i is a positive integer; obtaining a first parameter gradient of the neural network model by performing, by the at least one first accelerator, backward computation based on the forward computation result; and updating, by the at least one first accelerator, a parameter of the neural network model based on the first parameter gradient of the neural network model.

2. The method according to claim 1, further comprising: after performing the forward computation at the i-th layer in the neural network model, releasing, by the at least one first accelerator, a parameter of the i-th layer obtained from the another accelerator.

3. The method according to claim 1, wherein before performing the backward computation at a j-th layer in the neural network model, the at least one first accelerator obtains a complete model parameter of the i-th layer by obtaining different parameters of the j-th layer locally and from another first accelerator, wherein j is a positive integer.

4. The method according to claim 1, wherein in a time period in which the at least one first accelerator performs the forward computation at any one or more layers before the i-th layer in the neural network model, the at least one first accelerator obtains the complete model parameter of the i-th layer by obtaining the different parameters of the i-th layer locally and from the another accelerator.

5. The method according to claim 1, wherein the at least one first accelerator is located in a first server.

6. The method according to claim 1, further comprising: sending, by the at least one first accelerator, the first parameter gradient to the another accelerator.

7. The method according to claim 6, wherein the at least one first accelerator sends a parameter gradient of a k-th layer in the first parameter gradient to the another accelerator in a time period in which the at least one first accelerator performs the backward computation at any one or more layers before the k-th layer in the neural network model, wherein k is a positive integer.

8. The method according to claim 1, further comprising: receiving, by the at least one first accelerator, a second parameter gradient of the neural network model sent by the another accelerator; and updating, by the at least one first accelerator, the parameter of the neural network model based on the first parameter gradient of the neural network model comprises: updating, by the at least one first accelerator, the parameter of the neural network model based on the first parameter gradient of the neural network model and the second parameter gradient of the neural network model.

9. An image processing method, comprising: obtaining, by a second accelerator, a to-be-processed image; and obtaining a processing result of the to-be-processed image by performing, by the second accelerator, forward computation of a target neural network model on the to-be-processed image, wherein before performing the forward computation at a p-th layer in the target neural network model, the second accelerator obtains a complete model parameter of the p-th layer by obtaining different parameters of the p-th layer locally and from another accelerator, wherein p is a positive integer.

10. The method according to claim 9, wherein after performing the forward computation at the p-th layer in the target neural network model, the second accelerator releases a parameter of the p-th layer obtained from the another accelerator.

11. The method according to claim 9, wherein in a time period in which the second accelerator performs the forward computation at any one or more layers before the p-th layer in the target neural network model, the second accelerator obtains the complete model parameter of the p-th layer by obtaining the different parameters of the p-th layer locally and from the another accelerator.

12. The method according to claim 9, further comprising: obtaining a parameter of the target neural network model by updating a parameter of a neural network model by at least one first accelerator based on a first parameter gradient of the neural network model; obtaining the first parameter gradient of the neural network model by performing backward computation by the at least one first accelerator based on a forward computation result; obtaining the forward computation result by performing the forward computation of the neural network model on at least one training sample by the at least one first accelerator; and obtaining a complete model parameter of an i-th layer in the neural network model by obtaining different parameters of the i-th layer locally and from the another accelerator.

13. The method according to claim 12, further comprising: when the at least one first accelerator performs the backward computation at a j-th layer in the neural network model, obtaining a complete model parameter of the j-th layer in the neural network model by obtaining different parameters of the j-th layer locally and from the another accelerator.

14. The method according to claim 13, further comprising obtaining the complete model parameter of the j-th layer in a time period in which the at least one first accelerator performs the backward computation at any one or more layers after the j-th layer in the neural network model.

15. The method according to claim 12, wherein the parameter of the target neural network model being obtained by updating the parameter of the neural network model by the at least one first accelerator based on the first parameter gradient of the neural network model comprises: obtaining the parameter of the target neural network model by updating the parameter of the neural network model by the at least one first accelerator based on the first parameter gradient of the neural network model and a second parameter gradient of the neural network model, wherein the second parameter gradient of the neural network model comprises a parameter gradient sent by the another accelerator and received by the at least one first accelerator.

16. A neural network model training apparatus, comprising: a processor; and a memory configured to store computer readable instructions that, when executed by the processor, cause the apparatus to: obtain at least one training sample; obtain a forward computation result by performing forward computation of a neural network model on the at least one training sample, wherein before performing the forward computation at an i-th layer in the neural network model, a complete model parameter of the i-th layer is obtained by obtaining different parameters of the i-th layer locally and from another accelerator, wherein i is a positive integer; obtain a first parameter gradient of the neural network model by performing backward computation based on the forward computation result; and update a parameter of the neural network model based on the first parameter gradient of the neural n
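The training step of claims 1, 6, and 8 (forward, backward, gradient exchange, update) can be sketched for a toy case. The sketch below is illustrative only and rests on several assumptions that do not come from the claims: the function names, the single-layer linear model with squared error, plain gradient descent with a learning rate `lr`, and combining the first and second parameter gradients by averaging. The claims only say that the update is "based on" both gradients.

```python
def gather(shards):
    """Complete parameter: every accelerator's shard concatenated in rank order."""
    return [w for shard in shards for w in shard]


def local_gradient(w, x, target):
    """Forward and backward pass of a toy linear model y = w . x, squared error."""
    y = sum(wi * xi for wi, xi in zip(w, x))  # forward computation
    err = y - target
    return [err * xi for xi in x]  # gradient of 0.5 * err**2 with respect to w


def training_step(shards, samples, lr):
    """One step: each accelerator computes a gradient, exchanges it, updates its shard."""
    # Each accelerator gathers the full parameter and uses its own training sample.
    grads = [local_gradient(gather(shards), x, t) for x, t in samples]
    # Exchange (assumed here to be an average over all accelerators' gradients).
    n = len(grads)
    avg = [sum(g[k] for g in grads) / n for k in range(len(grads[0]))]
    # Each accelerator applies only the gradient slice that matches its own shard.
    new_shards, offset = [], 0
    for shard in shards:
        piece = avg[offset:offset + len(shard)]
        new_shards.append([w - lr * g for w, g in zip(shard, piece)])
        offset += len(shard)
    return new_shards


# Two accelerators, each holding one of the two weights and one training sample.
shards = [[0.0], [0.0]]
samples = [([1.0, 0.0], 2.0), ([0.0, 1.0], 4.0)]
new_shards = training_step(shards, samples, lr=0.5)
```

With these inputs the updated shards are `[[0.5], [1.0]]`: each weight moves toward its target by half of the averaged gradient, and no accelerator stores more than its own slice between steps.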

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • Backpropagation, e.g. using gradient descent · CPC title

  • using neural networks · CPC title

  • using electronic means · CPC title

  • Supervised learning · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12555362B2 cover?
The technology of this application relates to a neural network model training method, an image processing method, and an apparatus in the field of artificial intelligence. In the training method, each of at least one first accelerator trains a neural network model based on at least one training sample. Before forward computation at an i-th layer is performed, different parameters of the i-th la…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06V10/774. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 17, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).