Vector reductions using shared scratchpad memory

US11934826B2 · US · B2

Patent metadata
Publication number: US-11934826-B2
Application number: US-202117530869-A
Country: US
Kind code: B2
Filing date: Nov 19, 2021
Priority date: Feb 26, 2020
Publication date: Mar 19, 2024
Grant date: Mar 19, 2024

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer-readable media, are described for performing vector reductions using a shared scratchpad memory of a hardware circuit having processor cores that communicate with the shared memory. For each of the processor cores, a respective vector of values is generated based on computations performed at the processor core. The shared memory receives the respective vectors of values from respective resources of the processor cores using a direct memory access (DMA) data path of the shared memory. The shared memory performs an accumulation operation on the respective vectors of values using an operator unit coupled to the shared memory. The operator unit is configured to accumulate values based on arithmetic operations encoded at the operator unit. A result vector is generated based on performing the accumulation operation using the respective vectors of values.
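The data flow in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: `OperatorUnit`, `SharedScratchpad`, `dma_accumulate`, `core_compute`, and `reduce_across_cores` are hypothetical names introduced here, the per-core computation is a stand-in, and the DMA transfer is modeled as an ordinary method call.

```python
import operator
from typing import Callable, List, Sequence

Vector = List[float]


class OperatorUnit:
    """Hypothetical model of the operator unit: it encodes one arithmetic
    operation and applies it elementwise to two vectors."""

    def __init__(self, op: Callable[[float, float], float]) -> None:
        self.op = op

    def accumulate(self, stored: Sequence[float], incoming: Sequence[float]) -> Vector:
        if len(stored) != len(incoming):
            raise ValueError("vector lengths must match")
        return [self.op(a, b) for a, b in zip(stored, incoming)]


class SharedScratchpad:
    """Hypothetical model of the shared memory with its coupled operator unit."""

    def __init__(self, length: int, unit: OperatorUnit, initial: float = 0.0) -> None:
        self.cell: Vector = [initial] * length
        self.unit = unit

    def dma_accumulate(self, vector: Sequence[float]) -> None:
        # Stands in for a vector arriving over the DMA data path and being
        # combined with the stored vector by the operator unit.
        self.cell = self.unit.accumulate(self.cell, vector)


def core_compute(core_id: int, length: int) -> Vector:
    # Placeholder for whatever computation a core performs (an assumption).
    return [float(core_id + 1) * (i + 1) for i in range(length)]


def reduce_across_cores(
    num_cores: int,
    length: int,
    op: Callable[[float, float], float] = operator.add,
    initial: float = 0.0,
) -> Vector:
    memory = SharedScratchpad(length, OperatorUnit(op), initial)
    for core_id in range(num_cores):
        memory.dma_accumulate(core_compute(core_id, length))
    return memory.cell  # the result vector


if __name__ == "__main__":
    print(reduce_across_cores(3, 4))  # [6.0, 12.0, 18.0, 24.0]
```

Passing a different `op` (for example `max` with `initial=float("-inf")`) models an operator unit encoded with a different arithmetic operation; the cores themselves are unchanged.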

First claim

Opening claim text (preview).

What is claimed is:

1. A method performed using an integrated circuit for a hardware machine-learning accelerator that includes a plurality of cores and a shared memory that communicates with each of the plurality of cores, the method comprising: generating, by each of the plurality of cores, a respective vector of values; performing, across the plurality of cores and into a shared memory cell in the shared memory, a plurality of atomic vector reductions using each of the respective vectors and an operator unit of the shared memory without synchronization; and generating a result vector based on the plurality of atomic vector reductions.

2. The method of claim 1, wherein performing the plurality of atomic vector reductions comprises: accumulating a first vector stored in the shared memory cell with a respective second vector generated by one or more of the plurality of cores.

3. The method of claim 1, wherein: each of the plurality of cores comprises a respective vector-processing unit; and generating a respective vector of values comprises: generating, by each of the vector-processing units, a respective vector of values.

4. The method of claim 3, wherein each of the operator unit and the shared memory is external to the respective vector-processing unit in each of the plurality of cores.

5. An integrated circuit for a hardware machine-learning accelerator, the integrated circuit comprising: a plurality of cores; a shared memory that communicates with each of the plurality of cores; and a non-transitory machine-readable storage device for storing instructions that are executable by a processor to cause performance of operations comprising: generating, by each of the plurality of cores, a respective vector of values; performing, across the plurality of cores and into a shared memory cell in the shared memory, a plurality of atomic vector reductions using each of the respective vectors and an operator unit of the shared memory without synchronization; and generating a result vector based on the plurality of atomic vector reductions.

6. The integrated circuit of claim 5, wherein performing the plurality of atomic vector reductions comprises: accumulating a first vector stored in the shared memory cell with a respective second vector generated by one or more of the plurality of cores.

7. The integrated circuit of claim 5, wherein: each of the plurality of cores comprises a respective vector-processing unit; and generating a respective vector of values comprises: generating, by each of the vector-processing units, a respective vector of values.

8. The integrated circuit of claim 7, wherein each of the operator unit and the shared memory is external to the respective vector-processing unit in each of the plurality of cores.
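One way to picture "atomic vector reductions ... without synchronization" in claim 1 is a concurrent software analogy, sketched below under explicit assumptions. Threads stand in for cores, and a lock inside the hypothetical `SharedMemoryCell` stands in for the atomicity that the claim attributes to the shared memory and its operator unit. The cores never coordinate with one another; the names `SharedMemoryCell`, `atomic_add`, and `run_cores` are illustrative and do not come from the patent.

```python
import threading
from typing import List, Sequence


class SharedMemoryCell:
    """Hypothetical shared memory cell. The internal lock models atomicity
    provided on the memory side; it is not visible to the cores."""

    def __init__(self, length: int) -> None:
        self._values: List[float] = [0.0] * length
        self._atomic = threading.Lock()

    def atomic_add(self, vector: Sequence[float]) -> None:
        # Each call is one atomic vector reduction into the cell.
        with self._atomic:
            for i, value in enumerate(vector):
                self._values[i] += value

    def read(self) -> List[float]:
        with self._atomic:
            return list(self._values)


def run_cores(num_cores: int, length: int, rounds: int) -> List[float]:
    cell = SharedMemoryCell(length)

    def core(core_id: int) -> None:
        # A core generates its vector and issues the reduction. It uses no
        # barrier or handshake with any other core.
        for _ in range(rounds):
            vector = [float(core_id + 1)] * length
            cell.atomic_add(vector)

    threads = [threading.Thread(target=core, args=(c,)) for c in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # host-side wait for completion, not inter-core coordination
    return cell.read()


if __name__ == "__main__":
    print(run_cores(num_cores=4, length=8, rounds=100))
```

Because every update is applied atomically to the stored vector, the result does not depend on the order in which the cores' vectors arrive. In this sketch each element ends at (1 + 2 + 3 + 4) × 100 = 1000.0.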

Assignees

Google LLC

Inventors

Classifications

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations

  • Convolutional networks [CNN, ConvNet]

  • using a mask

  • Arithmetic instructions

  • to perform operations on memory

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11934826B2 cover?
Methods, systems, and apparatus, including computer-readable media, are described for performing vector reductions using a shared scratchpad memory of a hardware circuit having processor cores that communicate with the shared memory. For each of the processor cores, a respective vector of values is generated based on computations performed at the processor core. The shared memory receives the r…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F9/30036. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 19, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).