Method and apparatus for processing data, and related product

US2022121908A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022121908-A1
Application number  US-202117565008-A
Country             US
Kind code           A1
Filing date         Dec 29, 2021
Priority date       Aug 28, 2019
Publication date    Apr 21, 2022
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present disclosure relate to a method and an apparatus for processing data, and related products. The embodiments of the present disclosure relate to a board card including a storage component, an interface apparatus, a control component, and an artificial intelligence chip, where the artificial intelligence chip is connected to the storage component, the control component and the interface apparatus respectively. The storage component is used to store data; the interface apparatus is used to realize data transmission between the artificial intelligence chip and the external device. The control component is used to monitor a state of the artificial intelligence chip. The board card may be used to perform artificial intelligence computations.

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing data, comprising: obtaining a group of data to be quantized for a machine learning model; quantizing the group of data to be quantized respectively through using a plurality of pairs of truncation thresholds to determine a plurality of groups of quantized data, wherein each pair of truncation thresholds in the plurality of pairs of truncation thresholds includes a truncation positive value and a truncation negative value that are symmetrical; and selecting a pair of truncation thresholds from the plurality of pairs of truncation thresholds based on a difference between a mean value of an absolute value of each group of quantized data in the plurality of groups of quantized data and a mean value of an absolute value of the group of data to be quantized to quantize the group of data to be quantized.

2. The method of claim 1, wherein determining the plurality of groups of quantized data includes: determining a maximum absolute value of all data in the group of data to be quantized; and determining the plurality of pairs of truncation thresholds based on the maximum absolute value.

3. The method of claim 2, wherein determining the plurality of groups of quantized data includes: determining a first truncation positive value based on the maximum absolute value, a predetermined total number of searches, and a current search order; quantizing the group of data to be quantized through using a first pair of truncation thresholds to determine a first group of quantized data, wherein the first pair of truncation thresholds includes a first truncation positive value and a first truncation negative value that is opposite to the first positive value; and determining a first difference between a mean value of an absolute value of the first group of quantized data and the mean value of the absolute value of the group of data to be quantized.

4. The method of claim 3, wherein determining the plurality of groups of quantized data includes: incrementing the current search order; determining a second truncation positive value based on the maximum absolute value, the predetermined total number of searches, and the current search order; quantizing the group of data to be quantized through using a second pair of truncation thresholds to determine a second group of quantized data, wherein the second pair of truncation thresholds includes a second truncation positive value and a second truncation negative value that is opposite to the second truncation positive value; and determining a second difference between a mean value of an absolute value of the second group of quantized data and the mean value of the absolute value of the group of data to be quantized.

5. The method of claim 1, wherein selecting the pair of truncation thresholds from the plurality of pairs of truncation threshold includes: determining, from the plurality of groups of quantized data, a group of quantized data that has a smallest difference with the group of data to be quantized in terms of mean value of absolute value; and selecting a pair of truncation thresholds corresponding to the group of quantized data from the plurality of pairs of truncation thresholds.

6. The method of claim 5, further comprising: determining a truncation search range associated with the selected pair of truncation thresholds; determining a plurality of new pairs of truncation thresholds within the truncation search range; quantizing the group of data to be quantized respectively through using the plurality of new pairs of truncation thresholds to determine a plurality of new groups of quantized data; and selecting a new pair of truncation thresholds from the plurality of new pairs of truncation thresholds based on a difference between the mean value of the absolute value of the group of data to be quantized and a mean value of an absolute value of each group of the plurality of new groups of quantized data.

7. The method of claim 1, wherein quantizing the group of data to be quantized respectively through using the plurality of pairs of truncation thresholds to determine the plurality of groups of quantized data includes: determining a maximum absolute value of all data in the group of data to be quantized; determining three pairs of truncation thresholds based on the maximum absolute value, wherein among the three pairs of truncation thresholds, a first pair of truncation thresholds includes a half of the maximum absolute value and an opposite of the half, and a second pair of truncation thresholds includes three-quarters of the maximum absolute value and an opposite of the three-quarters, and a third pair of truncation thresholds includes the maximum absolute value and an opposite of the maximum absolute value; and quantizing the group of data to be quantized respectively through using the three pairs of truncation thresholds to determine three groups of quantized data.

8. The method of claim 7, wherein selecting the pair of truncation thresholds from the plurality of pairs of truncation thresholds includes: executing the following actions iteratively until a stop condition is met: selecting the pair of truncation thresholds from the three pairs of truncation thresholds; determining whether a difference corresponding to the selected pair of truncation thresholds is less than a predetermined threshold; stopping the iterative execution of the actions in response to the difference being less than the predetermined threshold; and redetermining the three pairs of truncation thresholds in response to the difference being greater than the predetermined threshold based on the selected pair of truncation thresholds.

9. The method of claim 1, wherein the group of data to be quantized is a group of floating-point numbers in a neural network model, and the method further includes: quantize the group of data to be quantized using the selected pair of truncation thresholds to obtain quantized data, wherein the group of data to be quantized includes: setting a value that is greater than the truncation positive value in the group of data to be quantized as the truncation positive value, and setting a value that is less than the truncation negative value in the group of data to be quantized as the truncation negative value; and inputting the obtained quantized data to the neural network model for processing.

10. An apparatus for data processing, comprising: a data to be quantized obtaining unit configured to obtain a group of data to be quantized for a machine learning model; a quantized data determining unit configured to quantize the group of data to be quantized to be quantized respectively by using a plurality of pairs of truncation thresholds to determine a plurality of groups of quantized data, wherein each pair of truncation thresholds in the plurality of pairs of truncation thresholds includes a truncation positive value and a truncation negative value that are symmetrical; and a truncation threshold selection unit configured to select a pair of truncation thresholds from the plurality of pairs of truncation thresholds based on a difference between a mean value of an absolute value of each group of quantized data and a mean value of an absolute value of the group of data to be quantized to quantize the group of data to be quantized.

11. A computer readable storage medium, on which a computer program is stored, and when the program is executed, the method of claim 1 is realized.

12. A computer readable storage medium, on which a computer program is stored, and when the program is executed, the method of claim 2 is realized.
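As an informal illustration of claims 1 to 5 and the clamping step of claim 9, the search over symmetric truncation thresholds might be sketched as follows. This is not the patent's implementation. The names `mean_abs`, `fake_quantize`, and `search_threshold` are hypothetical, as are the 8-bit grid and the linear candidate rule `max_abs * order / total_searches`; the claims only say that each candidate depends on the maximum absolute value, the total number of searches, and the current search order.

```python
from typing import List, Sequence, Tuple


def mean_abs(values: Sequence[float]) -> float:
    """Mean of the absolute values of a group of data."""
    return sum(abs(v) for v in values) / len(values)


def fake_quantize(values: Sequence[float], threshold: float, bits: int = 8) -> List[float]:
    """Clamp to [-threshold, +threshold], snap to a signed integer grid,
    and map back to floats so the result is comparable with the input.
    The bit width and rounding rule are assumptions, not claim language."""
    levels = 2 ** (bits - 1) - 1      # 127 for 8 bits
    scale = threshold / levels
    quantized = []
    for v in values:
        clipped = max(-threshold, min(threshold, v))   # clamping as in claim 9
        quantized.append(round(clipped / scale) * scale)
    return quantized


def search_threshold(values: Sequence[float],
                     total_searches: int = 10,
                     bits: int = 8) -> Tuple[float, float]:
    """Try several symmetric threshold pairs (-T, +T) and return the T whose
    quantized data has a mean absolute value closest to that of the input,
    together with that difference."""
    if not values:
        raise ValueError("values must not be empty")
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        raise ValueError("all values are zero; no threshold to search")
    target = mean_abs(values)

    best_threshold, best_diff = max_abs, float("inf")
    for order in range(1, total_searches + 1):
        # Assumed candidate rule: evenly spaced fractions of max_abs.
        threshold = max_abs * order / total_searches
        diff = abs(mean_abs(fake_quantize(values, threshold, bits)) - target)
        if diff < best_diff:          # keep the smallest difference (claim 5)
            best_threshold, best_diff = threshold, diff
    return best_threshold, best_diff


if __name__ == "__main__":
    data = [0.1, -0.2, 0.3, -0.4, 8.0]
    threshold, diff = search_threshold(data, total_searches=10)
    print("selected pair:", (-threshold, threshold), "difference:", diff)
```

The refinement of claim 6 could reuse the same loop over a narrower range around the selected threshold; how that range is chosen is not stated in the claim text shown here.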

Assignees

Shanghai Cambricon Inf Tech Co Ltd

Inventors

Classifications

  • Activation functions · CPC title

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • characterised by the process organisation or structure, e.g. boosting cascade · CPC title

  • Distances to prototypes · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022121908A1 cover?
Embodiments of the present disclosure relate to a method and an apparatus for processing data, and related products. The embodiments of the present disclosure relate to a board card including a storage component, an interface apparatus, a control component, and an artificial intelligence chip, where the artificial intelligence chip is connected to the storage component, the control component an…
Who is the assignee on this patent?
Shanghai Cambricon Inf Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 21, 2022 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).