Processing device and related products

US11775311B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11775311-B2
Application number  US-201916663174-A
Country             US
Kind code           B2
Filing date         Oct 24, 2019
Priority date       Aug 31, 2017
Publication date    Oct 3, 2023
Grant date          Oct 3, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A convolution operation method and a processing device for performing the same are provided. The method is performed by a processing device. The processing device includes a main processing circuit and a plurality of basic processing circuits. The basic processing circuits are configured to perform convolution operation in parallel. The technical solutions disclosed by the present disclosure can provide short operation time and low energy consumption.
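As a rough illustration of the arrangement the abstract describes, the sketch below models one main processing circuit that broadcasts a single input window to several basic processing circuits, each holding a different convolution kernel. It is a software analogy only: the function names `basic_circuit` and `main_circuit` are hypothetical, and worker threads stand in for hardware circuits, which is an assumption and not something the patent specifies.

```python
from concurrent.futures import ThreadPoolExecutor


def basic_circuit(kernel, window):
    """Multiply matching elements of one kernel and one window, then sum them."""
    return sum(
        k * x
        for k_row, x_row in zip(kernel, window)
        for k, x in zip(k_row, x_row)
    )


def main_circuit(window, kernels):
    """Send the same window to one worker per kernel; results come back in kernel order."""
    # Threads are only a stand-in for parallel hardware circuits (assumption).
    with ThreadPoolExecutor(max_workers=len(kernels)) as pool:
        return list(pool.map(lambda k: basic_circuit(k, window), kernels))


window = [[1, 2], [3, 4]]
kernels = [
    [[1, 0], [0, 1]],  # kernel 0
    [[1, 1], [1, 1]],  # kernel 1
]
results = main_circuit(window, kernels)
print(results)  # [5, 10]
```

Each worker receives the same window but different weights, which mirrors the broadcast-plus-distribution split named in the abstract.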

First claim

Opening claim text (preview).

What is claimed is:

1. A convolution operation method, performed by a processing device comprising a main processing circuit and a plurality of basic processing circuits, the convolution operation method comprising: receiving, by the main processing circuit, an input data block and a weight data block, wherein: the input data block comprises input data; and the weight data block comprises weight data arranged as a plurality of convolution kernels; dividing, by the main processing circuit, the weight data block into a plurality of basic data blocks each including a portion of the weight data belonging to one of the plurality of convolution kernels; distributing, by the main processing circuit, the plurality of basic data blocks to the plurality of basic processing circuits; wherein each of the plurality of basic data blocks is distributed to one of the plurality of the basic processing circuits and at least two basic processing circuits receive different basic data blocks that include portions of the weight data belonging to different convolution kernels; broadcasting, by the main processing circuit, at least a portion of the input data block to the plurality of basic processing circuits, wherein each of the plurality of basic processing circuits receives the same portion of the input data block; performing, by each of the plurality of basic processing circuits, operations on the portion of the input data block broadcast to that basic processing circuit and one or more basic data blocks distributed to that basic processing circuit to obtain an operation result; providing, by the plurality of basic processing circuits, the respective operation results to the main processing circuit; and calculating, by the main processing circuit, a convolution operation result according to the operation results provided by the plurality of basic processing circuits, wherein each basic data block has a same size and broadcasting at least a portion of the input data block to the plurality of basic processing circuits includes: sliding, by the main processing circuit, an operation window that has a same size as each basic data block in the input data block; and extracting, by the main processing circuit, the portion of the input data block within the operation window at each sliding position for broadcasting to the plurality of basic processing circuits.

2. The convolution operation method of claim 1, wherein: the input data in the input data block are arranged as a first four-dimensional data block, with H number of data in a first dimension, W number of data in a second dimension, C number of data in a third dimension, and N number of data in a fourth dimension of the first four-dimensional data block; and the weight data in the weight data block are arranged as a second four-dimensional data block, with KH number of data in a first dimension, KW number of data in a second dimension, C number of data in a third dimension, and M number of data in a fourth dimension of the second four-dimensional data block.

3. The convolution operation method of claim 1, wherein: performing, by each of the plurality of basic processing circuits, the operations on the portion of the input data block broadcast to that basic processing circuit and one or more basic data blocks distributed to that basic processing circuit to obtain the operation result further includes: performing, by each of the plurality of basic processing circuits, multiplication operations on element values of the portion of the input data block and element values at corresponding positions of the one or more basic data blocks to obtain a plurality of multiplication results; and providing, by each of the basic processing circuits, the plurality of multiplication results to the main processing circuit; and calculating, by the main processing circuit, the convolution operation result includes: accumulating, by the main processing circuit, the plurality of multiplication results provided by each of the basic processing circuits to obtain a convolution result for each basic processing circuit; and sorting, by the main processing circuit, a plurality of convolution results to obtain the convolution operation result.

4. The convolution operation method of claim 1, wherein: performing, by each of the plurality of basic processing circuits, the operations on the portion of the input data block broadcast to that basic processing circuit and one or more basic data blocks distributed to that basic processing circuit to obtain the operation result further includes: performing, by each of the plurality of basic processing circuits, multiplication operations on element values of the portion of the input data block and element values at corresponding positions of the one or more basic data blocks to obtain a plurality of multiplication results; accumulating, by each of the basic processing circuits, the plurality of multiplication results to obtain a convolution result; and sending, by each of the basic processing circuits; the convolution result to the main processing circuit; and calculating, by the main processing circuit, the convolution operation result further includes: sorting, by the main processing circuit, a plurality of convolution results provided by the plurality of basic processing circuits to obtain the convolution operation result.

5. The convolution operation method of claim 1, wherein the processing device further includes branch processing circuits configured to connect the main processing circuit to the plurality of basic processing circuits, and the method further includes: transmitting, by the branch processing circuits, data among the main processing circuit and the plurality of basic processing circuits.

6. The convolution operation method of claim 2, wherein a quantity of the plurality of convolution kernels is equal to M, and dividing the weight data block into a plurality of basic data blocks includes: dividing, by the main processing circuit, the weight data block into M basic data blocks each comprising a convolution kernel.

7. The convolution operation method of claim 2, wherein a quantity of the plurality of convolution kernels is equal to M, each convolution kernel including C weight matrices, wherein dividing the weight data block into a plurality of basic data blocks includes: dividing, by the main processing circuit, the weight data block into a first quantity of basic data blocks each comprising a weight matrix, wherein the first quantity is equal to M multiplied by C.

8. The convolution operation method of claim 6, wherein distributing the plurality of basic data blocks to the plurality of basic processing circuits includes: distributing, by the main processing circuit, multiple convolution kernels to at least one basic processing circuit when a number of the basic processing circuits is less than M; and distributing, by the main processing circuit, each of the convolution kernels to a separate basic processing circuit when the number of the basic processing circuits is equal to or larger than M.

9. A processing device comprising a main processing circuit and a plurality of basic processing circuits, wherein: the main processing circuit is configured to: receive an input data block and a weight data block for performing a convolution operation thereof wherein the input data block comprises input data and the weight data block comprises weight data arranged as a plurality of convolution kernels; divide the weight data block into a plurality of basic data blocks each including a portion of the weight data belonging to one of the plurality of convolution kernels; distribute the plurality of basic data blocks to the plurality of basic
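The method steps in claims 1, 4, 6 and 8 can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the patented hardware: it uses a single-channel 2D input (C = 1, N = 1), a stride of 1 and no padding, none of which the claim text fixes, and it assigns kernels to circuits round-robin, which is only one way to satisfy claim 8. All function names are hypothetical.

```python
def split_kernels(weight_block):
    """Claim 6 style split: one basic data block per kernel, tagged with its index."""
    return list(enumerate(weight_block))


def distribute(blocks, num_circuits):
    """Claim 8 style distribution. Round-robin assignment is an assumption."""
    if num_circuits < 1:
        raise ValueError("need at least one basic processing circuit")
    used = min(num_circuits, len(blocks))  # extra circuits stay idle
    assignment = [[] for _ in range(used)]
    for i, block in enumerate(blocks):
        assignment[i % used].append(block)
    return assignment


def windows(inp, kh, kw):
    """Slide a kh x kw operation window over a 2D input (stride 1, no padding)."""
    for r in range(len(inp) - kh + 1):
        for c in range(len(inp[0]) - kw + 1):
            yield (r, c), [row[c:c + kw] for row in inp[r:r + kh]]


def basic_circuit(blocks, window):
    """Multiply matching elements and accumulate locally (the claim 4 variant)."""
    return [
        (idx, sum(k * x for k_row, x_row in zip(kernel, window)
                  for k, x in zip(k_row, x_row)))
        for idx, kernel in blocks
    ]


def convolve(inp, weight_block, num_circuits):
    """Main-circuit role: split, distribute, broadcast windows, collect results."""
    kh, kw = len(weight_block[0]), len(weight_block[0][0])
    assignment = distribute(split_kernels(weight_block), num_circuits)
    out_h, out_w = len(inp) - kh + 1, len(inp[0]) - kw + 1
    out = [[[0] * out_w for _ in range(out_h)] for _ in weight_block]
    for (r, c), window in windows(inp, kh, kw):
        for blocks in assignment:  # every circuit receives the same window
            for idx, value in basic_circuit(blocks, window):
                out[idx][r][c] = value  # "sorting": place result by kernel index
    return out


inp = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
weight_block = [
    [[1, 0], [0, 1]],  # kernel 0
    [[0, 1], [0, 0]],  # kernel 1
    [[1, 1], [1, 1]],  # kernel 2
]
# Three kernels on two circuits: circuit 0 holds kernels 0 and 2, circuit 1 holds kernel 1.
assignment = distribute(split_kernels(weight_block), 2)
result = convolve(inp, weight_block, 2)
print(result)
```

With M = 3 kernels and two circuits, one circuit ends up holding two kernels, which is the "fewer circuits than M" branch of claim 8. Passing three or more circuits gives each kernel its own circuit and produces the same output.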

Assignees

  • Cambricon Tech Corp Ltd

Classifications

  • Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons

  • Convolutional networks [CNN, ConvNet]

  • Feedforward networks

  • G06F9/3885 (Primary): using a plurality of independent parallel functional units

  • Parallel decoding, e.g. parallel decode units

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11775311B2 cover?
A convolution operation method and a processing device for performing the same are provided. The method is performed by a processing device. The processing device includes a main processing circuit and a plurality of basic processing circuits. The basic processing circuits are configured to perform convolution operation in parallel. The technical solutions disclosed by the present disclosure ca…
Who is the assignee on this patent?
Cambricon Tech Corp Ltd
What technology area does this patent fall under?
Primary CPC classification G06F9/3885. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 3, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).