What technology area does this patent fall under?

Primary CPC classification G06F9/3001. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 6, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Vector processing unit

US11520581B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11520581-B2
Application number	US-202117327957-A
Country	US
Kind code	B2
Filing date	May 24, 2021
Priority date	Mar 9, 2017
Publication date	Dec 6, 2022
Grant date	Dec 6, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A vector processing unit is described, and includes processor units that each include multiple processing resources. The processor units are each configured to perform arithmetic operations associated with vectorized computations. The vector processing unit includes a vector memory in data communication with each of the processor units and their respective processing resources. The vector memory includes memory banks configured to store data used by each of the processor units to perform the arithmetic operations. The processor units and the vector memory are tightly coupled within an area of the vector processing unit such that data communications are exchanged at a high bandwidth based on the placement of respective processor units relative to one another, and based on the placement of the vector memory relative to each processor unit.
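As an illustration of the arrangement the abstract describes, the following is a minimal Python sketch of processor units that read operands from, and write results to, a banked vector memory. It is a toy software model, not the patented hardware: the class names, the one-bank-per-processor-unit mapping, the address layout, and the sequential loop standing in for parallel execution are all assumptions made for the example.

```python
import operator
from dataclasses import dataclass, field

MASK_32 = 0xFFFFFFFF  # keep results within 32 bits (illustrative choice)


@dataclass
class VectorMemory:
    """Toy banked memory: a list of banks, each a list of 32-bit words."""
    num_banks: int
    words_per_bank: int
    banks: list = field(init=False)

    def __post_init__(self):
        self.banks = [[0] * self.words_per_bank for _ in range(self.num_banks)]

    def read(self, bank, addr):
        return self.banks[bank][addr]

    def write(self, bank, addr, value):
        self.banks[bank][addr] = value & MASK_32


class ProcessorUnit:
    """Toy processor unit with a single ALU, bound to one memory bank.

    The one-unit-per-bank mapping is an assumption made for simplicity.
    """

    def __init__(self, memory, bank):
        self.memory = memory
        self.bank = bank

    def execute(self, op, addr_a, addr_b, addr_out):
        a = self.memory.read(self.bank, addr_a)
        b = self.memory.read(self.bank, addr_b)
        self.memory.write(self.bank, addr_out, op(a, b))


def run_lane(units, op, addr_a, addr_b, addr_out):
    """Apply the same operation in every unit.

    Hardware would do this concurrently; the loop here is sequential.
    """
    for unit in units:
        unit.execute(op, addr_a, addr_b, addr_out)


# Usage: four units each add the two operands held in their own bank.
mem = VectorMemory(num_banks=4, words_per_bank=3)
for bank in range(4):
    mem.write(bank, 0, bank + 1)         # first operand: 1, 2, 3, 4
    mem.write(bank, 1, 10 * (bank + 1))  # second operand: 10, 20, 30, 40
units = [ProcessorUnit(mem, bank) for bank in range(4)]
run_lane(units, operator.add, 0, 1, 2)
result = [mem.read(bank, 2) for bank in range(4)]
print(result)
```

The sketch only shows the data flow (operands in memory banks, one operation applied across all units); it says nothing about the physical placement or bandwidth that the abstract emphasizes.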

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a plurality of vector processing units; and a plurality of matrix units coupled to the plurality of vector processing units such that data communications can be exchanged, each matrix unit being configured to perform multiplications between weights of a neural network and activation inputs to generate accumulated values, wherein each vector processing unit is arranged in a corresponding vector processing unit (VPU) lane, and wherein each vector processing unit comprises: a plurality of processor units arranged across multiple sub-lanes of the VPU lane, wherein each processor unit comprises an arithmetic logic unit (ALU) configured to perform arithmetic operations associated with vectorized computations for a multi-dimensional data array; and a corresponding vector memory in data communication with the plurality of processor units, wherein the vector memory includes memory banks configured to store data used by the plurality of processor units to perform the arithmetic operations, wherein the plurality of processor units and the corresponding vector memory are tightly coupled within an area of the vector processing unit such that data communications can be exchanged at a high bandwidth based on the placement of respective processor units relative to one another and based on the placement of the vector memory relative to each processor unit.
2. The system of claim 1, wherein each processor unit of the plurality of processor units comprises at least one ALU.
3. The system of claim 1, wherein each vector processor unit comprises 16 ALUs.
4. The system of claim 1, wherein the vector memory comprises static random access memory (SRAM).
5. The system of claim 1, wherein the system is configured to allow transfer of 32 bytes between the vector memory and the plurality of processor units during a single clock cycle.
6. The system of claim 1, wherein the vector processing unit is configured to perform vector computations based on concurrent use of two or more of the ALUs.
7. The system of claim 1, wherein each ALU is configured to perform a 32-bit arithmetic operation between streams of vector data that represent operands for the arithmetic operation.
8. The system of claim 1, wherein the plurality of matrix units and the plurality of vector processing units represent a processor core of an integrated circuit chip; and the processor core is configured to process a single instruction stream at least across the multiple sub-lanes.
9. The system of claim 1, wherein: units of the system are configured to operate on streams of data; a first stream of data progresses in a first direction toward the plurality of matrix units; and a second, different stream of data progresses in a second direction away from the plurality of matrix units.
10. The system of claim 1, wherein at least one processor unit comprises a plurality of ALUs, and wherein multiple ALUs within a single processor unit are configured to execute arithmetic operations simultaneously during a single processor clock cycle.
11. The system of claim 1, wherein multiple ALUs within a single processor unit are configured to execute arithmetic operations simultaneously during a single processor clock cycle.
12. The system of claim 1, wherein the system is configured to perform at least 2048 operations in a single clock cycle.
13. The system of claim 1, wherein each operation includes a 32-bit word.
14. The system of claim 1, wherein a VPU lane is configured to move 8 vectors from a corresponding memory unit of the VPU lane to 8 sub-lanes of the VPU lane within a single clock cycle.
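The figures in the dependent claims can be cross-checked with simple arithmetic. The short Python sketch below does this under stated assumptions that are not in the claims themselves: that each sub-lane receives one 32-bit word per cycle, that the 16 ALUs of claim 3 belong to one VPU lane and are spread evenly over its 8 sub-lanes, and that each ALU completes one operation per cycle. The function names are hypothetical, and the derived lane count is an inference from those assumptions, not a number stated in the claims.

```python
# Figures taken from the claims.
WORD_BYTES = 4            # claims 7 and 13: 32-bit operations and words
SUB_LANES_PER_LANE = 8    # claim 14: 8 sub-lanes per VPU lane
ALUS_PER_VPU = 16         # claim 3: 16 ALUs (read here as per VPU lane)
MIN_OPS_PER_CYCLE = 2048  # claim 12: at least 2048 operations per cycle


def bytes_moved_per_cycle(sub_lanes=SUB_LANES_PER_LANE, word_bytes=WORD_BYTES):
    """Bytes delivered per cycle, assuming one word per sub-lane."""
    return sub_lanes * word_bytes


def alus_per_sub_lane(alus=ALUS_PER_VPU, sub_lanes=SUB_LANES_PER_LANE):
    """ALUs in each sub-lane, assuming an even distribution."""
    return alus // sub_lanes


def lanes_needed(min_ops=MIN_OPS_PER_CYCLE, alus=ALUS_PER_VPU):
    """Smallest lane count reaching min_ops, assuming one op per ALU per cycle."""
    return -(-min_ops // alus)  # ceiling division


print(bytes_moved_per_cycle())  # 8 sub-lanes * 4 bytes
print(alus_per_sub_lane())      # 16 ALUs / 8 sub-lanes
print(lanes_needed())           # 2048 ops / 16 ALUs per lane
```

Under these assumptions, 8 sub-lanes times 4 bytes gives the 32 bytes per cycle of claim 5, and 2048 operations per cycle would correspond to 128 lanes of 16 ALUs each.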

Assignees

Google LLC

Inventors

Classifications

G06F13/4068 · Electrical coupling
G06F9/3891 · organised in groups of units sharing resources, e.g. clusters
G06N20/00 · Machine learning
G06N3/063 · using electronic means
G06F9/30087 · Synchronisation or serialisation instructions

Patent family

Related publications grouped by family.

View patent family 60201400

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11520581B2 cover?: A vector processing unit is described, and includes processor units that each include multiple processing resources. The processor units are each configured to perform arithmetic operations associated with vectorized computations. The vector processing unit includes a vector memory in data communication with each of the processor units and their respective processing resources. The vector memor…
Who is the assignee on this patent?: Google LLC