Neural network method and apparatus

US12511528B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12511528-B2
Application number: US-201916249279-A
Country: US
Kind code: B2
Filing date: Jan 16, 2019
Priority date: Jul 4, 2018
Publication date: Dec 30, 2025
Grant date: Dec 30, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network method and apparatus are provided. A processor implemented neural network includes calculating respective individual gradient values for updating a weight of a neural network, calculating a residual gradient value based on an accumulated gradient value obtained by accumulating the individual gradient values and a bit digit representing the weight, tuning the respective individual gradient values to correspond to a bit digit of the residual gradient value, summing the tuned respective individual gradient values, the residual gradient value, and the weight, and updating the weight and the residual gradient value based on a result of the summing to train the neural network.
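The update loop summarized in the abstract can be illustrated with a small fixed-point sketch. This is not the patent's implementation: the bit widths (`W_FRAC_BITS`, `RES_BITS`), the function names, and the choice to pass in update values that are already signed and scaled by a learning rate are all assumptions made for illustration. Every quantity is held as an integer counted in units of the residual's least significant bit, so the weight is always a multiple of `WEIGHT_STEP`.

```python
# Sketch only: bit widths and names are illustrative assumptions, not from the patent.
W_FRAC_BITS = 4                         # assumed: weight LSB is 2**-4
RES_BITS = 8                            # assumed: residual keeps 8 finer bits
SCALE = 1 << (W_FRAC_BITS + RES_BITS)   # integer units per 1.0 on the residual grid
WEIGHT_STEP = 1 << RES_BITS             # integer units per weight LSB


def tune(update_value: float) -> int:
    """Quantize onto the residual grid, dropping anything below its LSB."""
    return int(update_value * SCALE)    # int() truncates toward zero


def split_effective(intermediate: int) -> tuple[int, int]:
    """Split a sum into a part divisible by the weight LSB and the remainder."""
    sign = -1 if intermediate < 0 else 1
    effective = sign * (abs(intermediate) // WEIGHT_STEP) * WEIGHT_STEP
    return effective, intermediate - effective


def update(weight: int, residual: int, update_values: list[float]) -> tuple[int, int]:
    """One training step: accumulate, move whole weight steps, keep the rest."""
    intermediate = residual + sum(tune(v) for v in update_values)
    effective, new_residual = split_effective(intermediate)
    return weight + effective, new_residual


weight, residual = SCALE, 0             # weight = 1.0, empty accumulation buffer
for step in range(7):
    weight, residual = update(weight, residual, [0.01])
print(weight / SCALE, residual)         # 1.0625 24
```

In this sketch each update of 0.01 quantizes to 40 units, which is smaller than one weight step (256 units). Rounding each update directly onto the weight grid would discard it every time; keeping the residual lets the seventh update carry one full step into the weight while 24 units stay in the buffer.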

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented neural network method, the method comprising: generating one or more respective individual gradient values for training a neural network by updating a weight of the neural network; tuning the one or more respective individual gradient values to correspond to bit digits of a residual gradient value, where each respective individual gradient value has respective bit digits; generating an intermediate summation value using the tuned one or more respective individual gradient values and the residual gradient value; summing the weight and the intermediate summation value; generating an updated residual gradient value by updating the residual gradient value to be a portion of the intermediate summation value not overlapping bit digits of the weight and storing the updated residual gradient value in an accumulation buffer; generating an updated weight by updating the weight with a portion of the intermediate summation value overlapping the bit digits of the weight, in response to the intermediate summation value overlapping bit digits of the weight, to train the neural network; and generating a trained neural network by training the neural network using the updated residual gradient value and updated weight, wherein the residual gradient value is dependent on an accumulating of one or more previous individual gradient values for updating the weight in a previous time.

2. The method of claim 1, wherein the updating of the residual gradient value comprises: determining an effective gradient value dependent on the result of the summing, where the effective gradient value has a value divisible by the least significant bit digit of the weight; and updating the residual gradient value by subtracting the effective gradient value from the result of the summing.

3. The method of claim 1, wherein the tuning of the one or more respective individual gradient values comprises: quantizing each of the one or more respective individual gradient values, including omitting respective values of the one or more respective individual gradient values that are less than a least significant bit digit of the residual gradient value; and padding each of the quantized one or more respective individual gradient values, wherein a value up to a bit digit corresponding to a most significant bit digit of the residual gradient value is present in each padded quantized one or more respective individual gradient values.

4. The method of claim 1, wherein the summing comprises: mapping the tuned one or more respective individual gradient values and the residual gradient value based on a set bit number, and calculating the intermediate summation value based on the mapped tuned one or more respective individual gradient values and the mapped residual gradient value; and mapping the weight based on the set bit number and summing the intermediate summation value and the weight.

5. The method of claim 1, wherein the summing comprises: padding the tuned one or more respective individual gradient values, the residual gradient value, and the weight; and summing the padded weight and the padded intermediate summation value of the padded tuned one or more respective individual gradient values and the padded residual gradient value.

6. The method of claim 1, wherein the updating of the weight comprises updating a bit digit value of a portion of the result of the summing, corresponding to the bit digit representing the weight, to the updated weight, and wherein the updating of the residual gradient value comprises updating a bit digit value of a remaining portion of the result of the summing, not corresponding to the bit digit representing the weight, to the residual gradient value.

7. The method of claim 1, further comprising: obtaining a sign bit that is a Most Significant Bit of the result of the summing; and adding the obtained sign bit such that the obtained sign bit is a Most Significant Bit of the updated weight and/or the updated residual gradient value.

8. A non-transitory computer-readable recording medium having recorded thereon computer readable instructions, which, when executed by one or more processors, performs the method of claim 1.

9. A processor-implemented neural network method, the method comprising: generating one or more respective individual gradient values for training a neural network by updating a weight of the neural network; tuning the one or more respective individual gradient values to correspond to bit digits of a residual gradient value, where each respective bit individual gradient value has respective bit digits; generate an intermediate concatenation value by concatenating a remaining value of the residual gradient value, excluding a sign bit, to the weight; summing the tuned one or more respective individual gradient values and the intermediate concatenation value; generate an updated residual gradient value by updating the residual gradient value to be a portion of the intermediate concatenation value not overlapping bit digits of the weight; generate an updated weight by updating the weight with a portion of the intermediate concatenation value overlapping the bit digits of the weight, in response to a summation of the tuned one or more respective individual gradient values overlapping bit digits of the weight, to train the neural network; and generating a trained neural network by training the neural network using the updated residual gradient value and updated weight, wherein the residual gradient value is dependent on an accumulating of one or more previous individual gradient values for updating the weight in a previous time.

10. The method of claim 9, wherein the updating of the residual gradient value comprises: determining an effective gradient value dependent on the result of the summing, where the effective gradient value has a value divisible by the least significant bit digit of the weight; and updating the residual gradient value by subtracting the effective gradient value from the result of the summing.

11. The method of claim 9, wherein the tuning of the one or more respective individual gradient values comprises: quantizing each of the one or more respective individual gradient values, including omitting respective values of the one or more respective individual gradient values that are less than a least significant bit digit of the residual gradient value; and padding each of the quantized one or more respective individual gradient values, wherein a value up to a bit digit corresponding to a most significant bit digit of the residual gradient value is present in each padded quantized one or more respective individual gradient values.

12. The method of claim 9, wherein the summing comprises: mapping the tuned one or more respective individual gradient values and the intermediate concatenation value based on a set bit number, and summing the mapped tuned one or more respective individual gradient values and the mapped intermediate concatenation value.

13. A non-transitory computer-readable recording medium having recorded thereon computer readable instructions, which, when executed by one or more processors, causes the one or more processors to perform the method of claim 9.

14. The method of claim 9, wherein the summing comprises: padding the tuned one or more respective individual gradient values and the intermediate concatenation value; and summing the padded tuned one or more respective individual gradient values and the padded intermediate concatenation value.

15.
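The tuning step of claims 3 and 11 and the concatenation variant of claim 9 can be sketched at the bit level as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the 8-bit residual width (`RES_BITS`), the helper names, and the use of two's-complement arithmetic (so the low residual bits are always a non-negative fraction) are all choices made here. The claims handle the sign bit separately, which this sketch does not model.

```python
# Sketch only: widths, names, and two's-complement handling are assumptions.
RES_BITS = 8                        # assumed width of the residual (sign excluded)
RES_MASK = (1 << RES_BITS) - 1


def tune_to_residual_grid(value: float, frac_bits: int) -> int:
    """Quantize by dropping magnitude below the residual LSB (2**-frac_bits)."""
    sign = -1 if value < 0 else 1
    return sign * int(abs(value) * (1 << frac_bits))


def pad_bits(value: int, width: int) -> str:
    """Two's-complement bit string of `value`, extended to `width` bits."""
    return format(value & ((1 << width) - 1), f"0{width}b")


def concat_update(weight_code: int, residual_bits: int, tuned: list[int]) -> tuple[int, int]:
    """Concatenate weight and residual bits, add tuned values, split again."""
    concatenated = (weight_code << RES_BITS) | residual_bits
    total = concatenated + sum(tuned)
    # Upper bits overlap the weight; lower RES_BITS bits become the residual.
    return total >> RES_BITS, total & RES_MASK


tuned = tune_to_residual_grid(0.01, 12)     # 40 units on a 2**-12 grid
print(pad_bits(tuned, 12))                  # 000000101000
print(tune_to_residual_grid(0.0001, 12))    # 0 (below the residual LSB, omitted)
print(concat_update(16, 240, [tuned]))      # (17, 24)
```

In the last call the weight code 16 with residual 240 receives a 40-unit update; the carry out of the residual bits increments the weight code to 17 and leaves 24 in the residual, so a single addition over the concatenated value updates both fields at once.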

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Architecture, e.g. interconnection topology

  • Adding; Subtracting (G06F7/483 - G06F7/491, G06F7/544 - G06F7/556 take precedence)

  • Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)}

  • G06N3/08 (Primary): Learning methods

  • G06N3/0495 (Primary): Quantised networks; Sparse networks; Compressed networks

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12511528B2 cover?
A neural network method and apparatus are provided. A processor implemented neural network includes calculating respective individual gradient values for updating a weight of a neural network, calculating a residual gradient value based on an accumulated gradient value obtained by accumulating the individual gradient values and a bit digit representing the weight, tuning the respective individu…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 30, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).