Depth concatenation using a matrix computation unit

US9691019B1 · US · B1

Patent metadata
Field	Value
Publication number	US-9691019-B1
Application number	US-201715452624-A
Country	US
Kind code	B1
Filing date	Mar 7, 2017
Priority date	Mar 7, 2017
Publication date	Jun 27, 2017
Grant date	Jun 27, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for depth concatenation using a matrix computation unit. One of the methods includes: receiving a request to process network inputs to a neural network using an integrated circuit, the neural network comprising a depth concatenation neural network layer; and generating instructions that, when executed by the integrated circuit, cause the integrated circuit to perform operations comprising: for each spatial location in a first input tensor to the depth concatenation layer and a second input tensor to the depth concatenation layer: multiplying, using the matrix computation unit, a second depth vector for the spatial location by a shift weight matrix for the depth concatenation layer to generate a shifted second depth vector; and adding the shifted second depth vector and a first input depth vector for the spatial location to generate a concatenated depth vector.
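The multiply-then-add step in the abstract can be illustrated with a small pure-Python sketch. This is not the patented hardware method, only a numerical illustration under stated assumptions: vectors are treated as row vectors multiplied on the left of the matrix, the second depth vector is zero-padded to length z1 + z2 before the multiply, and the helper names (`shift_matrix`, `vec_mat`, `depth_concat`) are hypothetical.

```python
def shift_matrix(z1, z2):
    """Build a (z1+z2) x (z1+z2) matrix that is zero except for a
    diagonal of ones mapping input position i to output position z1 + i
    (assumed layout, chosen so the product matches the claimed result)."""
    n = z1 + z2
    return [[1 if (i < z2 and j == z1 + i) else 0 for j in range(n)]
            for i in range(n)]


def vec_mat(v, m):
    """Multiply row vector v by matrix m."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


def depth_concat(first, second):
    """Concatenate two depth vectors at one spatial location using only
    a matrix multiply and an elementwise add."""
    z1, z2 = len(first), len(second)
    padded_first = first + [0] * z2    # first z1 entries, then z2 zeroes
    padded_second = second + [0] * z1  # assumption: pad before multiplying
    shifted = vec_mat(padded_second, shift_matrix(z1, z2))
    return [a + b for a, b in zip(padded_first, shifted)]


print(depth_concat([1, 2, 3], [4, 5, 6, 7]))  # [1, 2, 3, 4, 5, 6, 7]
```

For a full tensor, the same function would be applied once per spatial location (x, y), which is why the operation maps onto hardware that only supports matrix multiplication and accumulation.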

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a request to process network inputs to a neural network using an integrated circuit that performs neural network computations in hardware using a matrix computation unit, the neural network comprising a depth concatenation neural network layer that specifies a concatenation of an input tensor having dimensions x₁ by y₁ by z₁ and an input tensor having dimensions x₁ by y₁ by z₂ along a depth dimension to generate an output tensor having dimensions x₁ by y₁ by (z₁+z₂); and generating instructions that, when executed by the integrated circuit, cause the integrated circuit to, during processing of a network input by the neural network, generate a layer output tensor that satisfies the specification of the depth concatenation neural network layer by performing operations comprising: for each spatial location in a first input tensor to the depth concatenation layer and a second input tensor to the depth concatenation layer: multiplying, using the matrix computation unit, a second depth vector for the spatial location in the second input tensor by a shift weight matrix for the depth concatenation layer to generate a shifted second depth vector that has zeroes as the first z₁ entries and entries of the second depth vector as the last z₂ entries; and adding the shifted second depth vector and a first input depth vector for the spatial location in the first input tensor to generate a concatenated depth vector, the first input depth vector having entries of the first input depth vector as the first z₁ entries of the first input depth vector and zeroes as the last z₂ entries of the first input depth vector.

2. The method of claim 1, the operations further comprising: moving the first input depth vector to a set of output sum-in registers of the matrix computation unit; and wherein adding the shifted second depth vector and the first input depth vector comprises: moving the shifted second depth vector into the set of output sum-in registers of the matrix computation unit while the first input depth vector is stored in the set of output sum-in registers of the matrix computation unit.

3. The method of claim 2, wherein moving the first input depth vector comprises: multiplying the first input depth vector by a modified identity weight matrix for the depth concatenation layer using the matrix computation unit.

4. The method of claim 3, further comprising: generating the modified identity weight matrix for the depth concatenation layer; and storing the modified identity weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.

5. The method of claim 1, further comprising: generating the shift weight matrix for the depth concatenation layer; and storing the shift weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.

6. The method of claim 5, further comprising: determining that the number of depth dimensions in the output tensor does not exceed a maximum vector length for the matrix computation unit; and generating the shift weight matrix for the depth concatenation in response to determining that the number of depth dimensions in the output tensor does not exceed the maximum vector length for the matrix computation unit.

7. The method of claim 1, wherein the shift weight matrix for the depth concatenation layer is a (z₁+z₂) by (z₁+z₂) matrix having all entries be zero except for a diagonal row of ones starting at the first entry of the z₂-th column of the matrix.

8. A system comprising one or more computers and one or more storage devices storing first instructions that when executed by the one or more computers cause the one or more computers to perform first operations comprising: receiving a request to process network inputs to a neural network using an integrated circuit that performs neural network computations in hardware using a matrix computation unit, the neural network comprising a depth concatenation neural network layer that specifies a concatenation of an input tensor having dimensions x₁ by y₁ by z₁ and an input tensor having dimensions x₁ by y₁ by z₂ along a depth dimension to generate an output tensor having dimensions x₁ by y₁ by (z₁+z₂); and generating second instructions that, when executed by the integrated circuit, cause the integrated circuit to, during processing of a network input by the neural network, generate a layer output tensor that satisfies the specification of the depth concatenation neural network layer by performing second operations comprising: for each spatial location in a first input tensor to the depth concatenation layer and a second input tensor to the depth concatenation layer: multiplying, using the matrix computation unit, a second depth vector for the spatial location in the second input tensor by a shift weight matrix for the depth concatenation layer to generate a shifted second depth vector that has zeroes as the first z₁ entries followed by entries of the second depth vector; and adding the shifted second depth vector and a first input depth vector for the spatial location in the first input tensor to generate a concatenated depth vector, the first input depth vector having entries of the first input depth vector as the first z₁ entries of the first input depth vector.

9. The system of claim 8, the second operations further comprising: moving the first input depth vector to a set of output sum-in registers of the matrix computation unit; and wherein adding the shifted second depth vector and the first input depth vector comprises: moving the shifted second depth vector into the set of output sum-in registers of the matrix computation unit while the first input depth vector is stored in the set of output sum-in registers of the matrix computation unit.

10. The system of claim 9, wherein moving the first input depth vector comprises: multiplying the first input depth vector by a modified identity weight matrix for the depth concatenation layer using the matrix computation unit.

11. The system of claim 10, the first operations further comprising: generating the modified identity weight matrix for the depth concatenation layer; and storing the modified identity weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.

12. The system of claim 8, the first operations further comprising: generating the shift weight matrix for the depth concatenation layer; and storing the shift weight matrix for the depth concatenation layer in a memory accessible to the special-purpose integrated circuit.

13. The system of claim 12, the first operations further comprising: determining that the number of depth dimensions in the output tensor does not exceed a maximum vector length for the matrix computation unit; and generating the shift weight matrix for the depth concatenation in response to determining that the number of depth dimensions in the output tensor does not exceed the maximum vector length for the matrix computation unit.

14. The system of claim 8, wherein the shift weight matrix for the depth concatenation layer is a matrix having all entries be zero except for a diagonal row of ones starting at the first entry of the z₂-th column of the matrix.

15. One or more non-transitory computer storage media encoded with first instructions that when executed by one or more computers cause the one or more computers to perform first operations comprising: receiving …
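Claims 2, 3 and 6 add three details: the first depth vector is moved into the output sum-in registers by multiplying it with a modified identity weight matrix, the shifted second vector is then moved into the same registers so the addition happens by accumulation, and the shift weight matrix is only generated when the output depth fits the unit's maximum vector length. The sketch below models this in plain Python. Several parts are assumptions made for illustration: the `SumInRegisters` class is a toy accumulator, `MAX_VECTOR_LENGTH` is a made-up limit, the modified identity matrix is assumed to have ones only in its first z1 diagonal entries, and raising an error when the limit is exceeded is a placeholder, since the claim text shown here does not say what happens in that case.

```python
MAX_VECTOR_LENGTH = 8  # hypothetical hardware limit, for illustration only


def modified_identity(z1, z2):
    """Assumed form: identity on the first z1 positions, zero elsewhere."""
    n = z1 + z2
    return [[1 if (i == j and i < z1) else 0 for j in range(n)]
            for i in range(n)]


def shift_matrix(z1, z2):
    """Zero matrix with a diagonal of ones mapping position i to z1 + i."""
    n = z1 + z2
    return [[1 if (i < z2 and j == z1 + i) else 0 for j in range(n)]
            for i in range(n)]


def vec_mat(v, m):
    """Multiply row vector v by matrix m."""
    return [sum(v[i] * m[i][j] for i in range(len(v)))
            for j in range(len(m[0]))]


class SumInRegisters:
    """Toy accumulator: every vector moved in is added to the stored one."""

    def __init__(self, n):
        self.values = [0] * n

    def move_in(self, vec):
        self.values = [a + b for a, b in zip(self.values, vec)]


def concat_with_accumulator(first, second):
    z1, z2 = len(first), len(second)
    n = z1 + z2
    # Claim 6: only build the shift matrix if the output depth fits.
    if n > MAX_VECTOR_LENGTH:
        raise ValueError("output depth exceeds the maximum vector length")
    regs = SumInRegisters(n)
    # Claim 3: move the first vector in via the modified identity matrix.
    regs.move_in(vec_mat(first + [0] * z2, modified_identity(z1, z2)))
    # Claim 2: moving the shifted vector in performs the addition.
    regs.move_in(vec_mat(second + [0] * z1, shift_matrix(z1, z2)))
    return regs.values


print(concat_with_accumulator([1, 2], [3, 4, 5]))  # [1, 2, 3, 4, 5]
```

The point of the accumulator model is that no separate adder is needed: both vectors pass through the matrix unit and the registers sum them.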

Assignees

Google Inc

Inventors

Classifications

G06F17/16 · Primary
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
G06N3/063 · Primary
using electronic means · CPC title
G06N3/04 · Primary
Architecture, e.g. interconnection topology · CPC title
G06N3/0635
Physics · mapped topic
G06F9/46
Multiprogramming arrangements · CPC title

Patent family

Related publications grouped by family.

View patent family 59069613

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9691019B1 cover?: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for depth concatenation using a matrix computation unit. One of the methods includes: receiving a request to process network inputs to a neural network using an integrated circuit, the neural network comprising a depth concatenation neural network layer; and generating instructions that, when execute…
Who is the assignee on this patent?: Google Inc
What technology area does this patent fall under?: Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?: Publication date Jun 27, 2017 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).