Integrated circuit chip apparatus

US12136029B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12136029-B2
Application number  US-202218085332-A
Country             US
Kind code           B2
Filing date         Dec 20, 2022
Priority date       Dec 14, 2017
Publication date    Nov 5, 2024
Grant date          Nov 5, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An integrated circuit chip apparatus and a processing method performed by an integrated circuit chip apparatus are disclosed. The disclosed integrated circuit chip apparatus and processing method are used for executing a multiplication operation, a convolution operation, or a training operation of a neural network. The present technical solution has the advantages of a reduced computational cost and low power consumption.

First claim

Opening claim text (preview).

What is claimed is:

1. An integrated circuit chip apparatus, comprising: a main processing circuit and a plurality of basic processing circuits wherein the main processing circuit is configured to: receive an input data block, a weight data block, and a multiplication instruction; designate the input data block as a distribution data block and the weight data block as a broadcasting data block according to the multiplication instruction; partition the distribution data block to obtain a plurality of basic data blocks; distribute the plurality of basic data blocks respectively to at least one of the plurality of basic processing circuits; and broadcast the entire broadcasting data block to each of the plurality of basic processing circuits, wherein the at least one of the plurality of basic processing circuits is configured to perform computations on the same broadcasting data block and the respective received basic data blocks to obtain computation results, and transfer the computation results to the main processing circuit, wherein the main processing circuit is configured to process the computation results to obtain an instruction result of the multiplication instruction.

2. The integrated circuit chip apparatus of claim 1, wherein the main processing circuit or at least one of the plurality of basic processing circuits includes a data type conversion circuit configured to convert data between a floating point type and a fixed point type.

3. The integrated circuit chip apparatus of claim 2, wherein the main processing circuit is further configured to: convert the input data block and the weight data block to an input data block of the fixed point type and a weight data block of the fixed point type, respectively, using the data type conversion circuit.

4. The integrated circuit chip apparatus of claim 3, wherein the at least one of the plurality of basic processing circuits is configured to perform the computations on the broadcasting data block and the received basic data blocks according to the fixed point type to obtain the computation results in fixed point type.

5. The integrated circuit chip apparatus of claim 4, wherein the main processing circuit is configured to: convert the computation results of the fixed point type to the floating point type using the data type conversion circuit; accumulate the computation results of the floating point type to obtain accumulation results; and sort the accumulation results to obtain the instruction result.

6. The integrated circuit chip apparatus of claim 1, wherein: the at least one of the plurality of basic processing circuits is configured to perform inner product computations on the broadcasting data block and the received basic data blocks to obtain inner products, and transfer the inner products as computation results to the main processing circuit, and the main processing circuit is configured to sort the inner products to obtain the instruction result.

7. The integrated circuit chip apparatus of claim 1, wherein the basic processing circuits are further configured to: convert the basic data blocks and the broadcasting data block into data blocks of a fixed point type; and perform the computations on the basic data blocks and the broadcasting data block in the fixed point type to obtain fixed point computation results.

8. The integrated circuit chip apparatus of claim 7, wherein the basic processing circuits are further configured to: convert the computation results from the fixed point type to a floating point type; and transfer the computation results in the floating point type to the main processing circuit.

9. The integrated circuit chip apparatus of claim 7, wherein the basic processing circuits are further configured to: transfer the computation results in fixed point type to the main processing circuit, wherein the main processing circuit is further configured to: convert the computation results of the fixed point type to a floating point type; accumulate the computation results of the floating point type to obtain accumulation results; and sort the accumulation results to obtain the instruction result.

10. The integrated circuit chip apparatus of claim 1, wherein the main processing circuit is configured to broadcast the broadcasting data block as a whole to the plurality of basic processing circuits.

11. The integrated circuit chip apparatus of claim 1, wherein the main processing circuit is further configured to partition the broadcasting data block into a plurality of partial broadcasting data blocks, and sequentially broadcast the plurality of partial broadcasting data blocks to the plurality of basic processing circuits.

12. The integrated circuit chip apparatus of claim 1, wherein the at least one of the plurality of basic processing circuits is configured to reuse each partial broadcasting data block for n times to perform the computations on the partial broadcasting data blocks and n basic data blocks respectively to obtain n partial processing results, and transfer the n partial processing results to the main processing circuit, wherein n is an integer greater than or equal to 2.

13. The integrated circuit chip apparatus of claim 1, wherein the multiplication instruction is for performing a matrix-multiply-vector computation, and the main processing circuit is further configured to transfer data of at least one row of a matrix to a basic processing circuit at a time.

14. The integrated circuit chip apparatus of claim 1, further comprising: a branch processing circuit, wherein the branch processing circuit is located between the main processing circuit and at least one basic processing circuit, wherein the branch processing circuit is configured to forward data between the main processing circuit and at least one basic processing circuit.

15. A neural network computation device, comprising one or more integrated circuit chip apparatuses, each integrated circuit chip apparatus comprising: a main processing circuit and a plurality of basic processing circuits, wherein the main processing circuit is configured to: receive an input data block, a weight data block, and a multiplication instruction; designate the input data block as a distribution data block and the weight data block as a broadcasting data block according to the multiplication instruction; partition the distribution data block to obtain a plurality of basic data blocks; distribute the plurality of basic data blocks respectively to at least one of the plurality of basic processing circuits; and broadcast the entire broadcasting data block to each of the plurality of basic processing circuits, wherein the at least one of the plurality of basic processing circuits is configured to perform computations on the same broadcasting data block and the respective received basic data blocks to obtain computation results, and transfer the computation results to the main processing circuit, wherein the main processing circuit is configured to process the computation results to obtain an instruction result of the multiplication instruction.

16. The neural network computation device of claim 15, wherein the main processing circuit or at least one of the plurality of basic processing circuits includes a data type conversion circuit configured to convert data between a floating point data type and a fixed point data type.

17. A method for performing neural network operations using an integrated circuit chip apparatus comprising a main processing circuit, and a plurality of basic processing circuits, the method c
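
The data flow recited in claims 1, 6, and 13 can be illustrated in software. The following is a minimal sketch only, not the patented hardware: it assumes a matrix-multiply-vector instruction, treats each basic processing circuit as a plain function call, and uses hypothetical names (`partition_rows`, `basic_circuit`, `main_circuit`) that do not appear in the patent.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def partition_rows(block: Matrix, num_circuits: int) -> List[Matrix]:
    """Split the distribution data block into contiguous row groups.

    Each group plays the role of one "basic data block" (assumption:
    rows are the unit of partitioning, as in claim 13).
    """
    chunk = max(1, -(-len(block) // num_circuits))  # ceiling division
    return [block[i:i + chunk] for i in range(0, len(block), chunk)]


def basic_circuit(basic_block: Matrix, broadcast_block: Vector) -> Vector:
    """Compute one inner product per received row against the broadcast block."""
    return [sum(a * w for a, w in zip(row, broadcast_block)) for row in basic_block]


def main_circuit(input_block: Matrix, weight_block: Vector, num_circuits: int) -> Vector:
    """Distribute row groups, broadcast the weights, and collect the results."""
    basic_blocks = partition_rows(input_block, num_circuits)
    # Every basic circuit receives the same (entire) broadcasting data block.
    partial_results = [basic_circuit(b, weight_block) for b in basic_blocks]
    # "Process the computation results": here, concatenate them in row order.
    result: Vector = []
    for part in partial_results:
        result.extend(part)
    return result


if __name__ == "__main__":
    matrix = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    weights = [10.0, 1.0]
    print(main_circuit(matrix, weights, num_circuits=2))
```

In this sketch the loop over `basic_blocks` runs sequentially; in the claimed apparatus the basic processing circuits are separate hardware units, so the same work would proceed in parallel.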

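Claims 2 through 9 add conversion between floating point and fixed point types around the inner product computation. The sketch below shows one way that round trip could look. It assumes a simple signed fixed-point encoding with 8 fractional bits and no saturation or overflow handling; the patent text shown here does not specify the format, and the names `to_fixed`, `to_float`, and `fixed_inner_product` are hypothetical.

```python
from typing import List

FRAC_BITS = 8  # assumed number of fractional bits; not taken from the patent


def to_fixed(x: float, frac_bits: int = FRAC_BITS) -> int:
    """Convert a float to a fixed-point integer by scaling and rounding."""
    return round(x * (1 << frac_bits))


def to_float(q: int, frac_bits: int = FRAC_BITS) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_inner_product(a: List[float], b: List[float],
                        frac_bits: int = FRAC_BITS) -> float:
    """Inner product computed on integers, converted to float at the end.

    Multiplying two values that each carry frac_bits fractional bits gives
    a product with 2 * frac_bits fractional bits, so the final conversion
    uses the doubled scale.
    """
    qa = [to_fixed(x, frac_bits) for x in a]
    qb = [to_fixed(x, frac_bits) for x in b]
    acc = sum(x * y for x, y in zip(qa, qb))  # integer-only accumulation
    return to_float(acc, 2 * frac_bits)


if __name__ == "__main__":
    print(fixed_inner_product([1.5, -2.25], [0.5, 4.0]))
```

Integer multiply-accumulate is generally cheaper in hardware than floating point arithmetic, which is consistent with the abstract's stated advantages of reduced computational cost and low power consumption. Values that are not exact multiples of 2^-8 would pick up a rounding error in this sketch.
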
Assignees

Cambricon Tech Corp Ltd

Inventors

Classifications

  • Package configurations

  • Convolutional networks [CNN, ConvNet]

  • Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)}

  • Multidimensional correlation or convolution

  • Learning methods

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12136029B2 cover?
An integrated circuit chip apparatus and a processing method performed by an integrated circuit chip apparatus are disclosed. The disclosed integrated circuit chip apparatus and processing method are used for executing a multiplication operation, a convolution operation, or a training operation of a neural network. The present technical solution has the advantages of a reduced computational cos…
Who is the assignee on this patent?
Cambricon Tech Corp Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 5, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).