Neural network layer-by-layer debugging

US2020410354A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2020410354-A1
Application number	US-201916455329-A
Country	US
Kind code	A1
Filing date	Jun 27, 2019
Priority date	Jun 27, 2019
Publication date	Dec 31, 2020
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for debugging a neural network execution on a target processor. A reference processor may generate a plurality of first reference tensors for the neural network. The neural network may be repeatedly reduced to produce a plurality of lengths. For each of the lengths, a compiler converts the neural network into first machine instructions, the target processor executes the first machine instructions to generate a first device tensor, and the debugger program determines whether the first device tensor matches a first reference tensor. A shortest length is identified for which the first device tensor does not match the first reference tensor. Tensor output is enabled for a lower-level intermediate representation of the shortest neural network, and the neural network is converted into second machine instructions, which are executed by the target processor to generate a second device tensor.
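The search described in the abstract can be illustrated with a small sketch. This is not the patent's implementation: the compiler and target processor are simulated here by plain Python functions, and every name (`tensors_match`, `run_layers`, `shortest_failing_length`, the `tol` tolerance) is hypothetical. It assumes a "length" means the first N layers of the network and that tensors match when they agree element-wise within a tolerance.

```python
from typing import Callable, List, Optional, Sequence

Tensor = List[float]
Layer = Callable[[Tensor], Tensor]


def tensors_match(a: Tensor, b: Tensor, tol: float = 1e-6) -> bool:
    """Element-wise comparison within an assumed absolute tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= tol for x, y in zip(a, b))


def run_layers(layers: Sequence[Layer], x: Tensor) -> Tensor:
    """Stand-in for 'compile the network and execute it on a processor'."""
    for layer in layers:
        x = layer(x)
    return x


def shortest_failing_length(
    reference_layers: Sequence[Layer],
    device_layers: Sequence[Layer],
    sample_input: Tensor,
    tol: float = 1e-6,
) -> Optional[int]:
    """Return the smallest number of layers whose device output mismatches."""
    # Reference tensors: the output after each layer on the reference side.
    reference_tensors: List[Tensor] = []
    x = sample_input
    for layer in reference_layers:
        x = layer(x)
        reference_tensors.append(x)

    shortest = None
    # Repeatedly reduce the network: lengths n, n-1, ..., 1.
    for length in range(len(device_layers), 0, -1):
        device_tensor = run_layers(device_layers[:length], sample_input)
        if not tensors_match(device_tensor, reference_tensors[length - 1], tol):
            shortest = length  # descending loop, so the last hit is smallest
    return shortest


# Toy three-layer network; the simulated device has a faulty second layer.
reference_layers = [
    lambda t: [v + 1.0 for v in t],
    lambda t: [v * 2.0 for v in t],
    lambda t: [v - 3.0 for v in t],
]
device_layers = [
    reference_layers[0],
    lambda t: [v * 2.0 + 0.5 for v in t],  # injected bug
    reference_layers[2],
]

print(shortest_failing_length(reference_layers, device_layers, [1.0, 2.0]))  # prints 2
```

In this toy run the network matches at length 1 and first diverges at length 2, which points at the second layer as the place to inspect more closely.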

First claim

Opening claim text (preview).

What is claimed is:

1. A method of debugging a neural network execution on a target processor, the method comprising: receiving, by a debugger program operating on a host system, a request to debug an execution of a neural network on the target processor, the neural network comprising a plurality of layers; generating, using a reference processor on the host system and based on a first sample input, a plurality of first reference tensors for the neural network; repeatedly reducing the plurality of layers of the neural network to produce a plurality of lengths, and for each particular length of a plurality of lengths: converting, by a compiler operating on the host system, the neural network having the particular length into first machine instructions; executing, using the target processor and based on the first sample input or on one of the plurality of first reference tensors, the first machine instructions to generate a first device tensor; and determining, by the debugger program, whether the first device tensor matches a first reference tensor of the plurality of first reference tensors; identifying a shortest length of the plurality of lengths for which the first device tensor does not match the first reference tensor; generating, using the reference processor and based on a second sample input, a plurality of second reference tensors for a lower-level representation of the neural network having the shortest length; converting, by the compiler, the neural network having the shortest length into second machine instructions, wherein the second machine instructions includes additional instructions that enable tensor output for the lower-level representation; executing, using the target processor and based on the second sample input or on one of the plurality of second reference tensors, the second machine instructions to generate a second device tensor for the lower-level representation; and determining, by the debugger program, whether the second device tensor matches a second reference tensor of the plurality of second reference tensors.

2. The method of claim 1, wherein the additional instructions enable tensor output for multiple lower-level representations of the neural network.

3. The method of claim 2, wherein executing the second machine instructions further generates a third device tensor for a second lower-level representation of the neural network, wherein the lower-level representation is a first lower-level representation.

4. The method of claim 3, further comprising: determining, by the debugger program, whether the third device tensor matches a third reference tensor of the plurality of second reference tensors.

5. The method of claim 1, wherein the plurality of first reference tensors and the plurality of second reference tensors are generated by the debugger program.

6. A method of debugging a neural network execution on a target processor, the method comprising: receiving a plurality of first reference tensors for a neural network; repeatedly reducing a plurality of layers of the neural network to produce a plurality of lengths, and for each particular length of a plurality of lengths: converting, by a compiler, the neural network having the particular length into first machine instructions; executing, using the target processor, the first machine instructions to generate a first device tensor; and determining whether the first device tensor matches a first reference tensor of the plurality of first reference tensors; identifying a shortened length of the plurality of lengths for which the first device tensor does not match the first reference tensor; generating a plurality of second reference tensors for a lower-level representation of the neural network having the shortened length; converting, by the compiler, the neural network having the shortened length into second machine instructions; and executing, using the target processor, the second machine instructions to generate a second device tensor for the lower-level representation.

7. The method of claim 6, wherein the shortened length is a shortest length of the plurality of lengths.

8. The method of claim 6, further comprising: determining, by the debugger program, whether the second device tensor matches a second reference tensor of the plurality of second reference tensors.

9. The method of claim 6, wherein the second machine instructions include additional instructions that enable tensor output for the lower-level representation.

10. The method of claim 9, wherein the additional instructions enable tensor output for multiple lower-level representations of the neural network.

11. The method of claim 10, wherein executing the second machine instructions further generates a third device tensor for a second lower-level representation of the neural network, wherein the lower-level representation is a first lower-level representation.

12. The method of claim 11, further comprising: determining, by the debugger program, whether the third device tensor matches a third reference tensor of the plurality of second reference tensors.

13. The method of claim 6, wherein the plurality of first reference tensors and the plurality of second reference tensors are generated by the debugger program.

14. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors, cause the one or more processors to perform operations including: receiving a plurality of first reference tensors for a neural network; repeatedly reducing a plurality of layers of the neural network to produce a plurality of lengths, and for each particular length of a plurality of lengths: converting, by a compiler, the neural network having the particular length into first machine instructions; executing, using the target processor, the first machine instructions to generate a first device tensor; and determining whether the first device tensor matches a first reference tensor of the plurality of first reference tensors; identifying a shortened length of the plurality of lengths for which the first device tensor does not match the first reference tensor; generating a plurality of second reference tensors for a lower-level representation of the neural network having the shortened length; converting, by the compiler, the neural network having the shortened length into second machine instructions; and executing, using the target processor, the second machine instructions to generate a second device tensor for the lower-level representation.

15. The non-transitory computer-readable medium of claim 14, wherein the shortened length is a shortest length of the plurality of lengths.

16. The non-transitory computer-readable medium of claim 14, wherein the operations further comprise: determining, by the debugger program, whether the second device tensor matches a second reference tensor of the plurality of second reference tensors.

17. The non-transitory computer-readable medium of claim 14, wherein the second machine instructions include additional instructions that enable tensor output for the lower-level representation.

18. The non-transitory computer-readable medium of claim 17, wherein the additional instructions enable tensor output for multiple lower-level representations of the neural network.

19. The non-transitory computer-readable medium of claim 18, wherein executing the second machine instructions further generates a third device tensor for a s…
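The second stage recited in the claims (enabling tensor output for a lower-level representation and comparing against second reference tensors) might look roughly like the sketch below. It is an illustration under assumptions, not the claimed method: the lower-level representation is modeled as a named sequence of operations, "tensor output" is modeled as recording each operation's result, and the names `run_with_tensor_output` and `first_mismatching_op` are hypothetical. It also assumes the reference and device sides use the same operation names.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Tensor = List[float]
Op = Tuple[str, Callable[[Tensor], Tensor]]  # (operation name, function)


def run_with_tensor_output(ops: Sequence[Op], x: Tensor) -> Dict[str, Tensor]:
    """Execute ops in order and record every intermediate tensor by name."""
    outputs: Dict[str, Tensor] = {}
    for name, fn in ops:
        x = fn(x)
        outputs[name] = list(x)
    return outputs


def first_mismatching_op(
    reference_ops: Sequence[Op],
    device_ops: Sequence[Op],
    sample_input: Tensor,
    tol: float = 1e-6,
) -> Optional[str]:
    """Return the name of the first op whose device tensor differs."""
    ref = run_with_tensor_output(reference_ops, sample_input)
    dev = run_with_tensor_output(device_ops, sample_input)
    for name, _ in reference_ops:
        r, d = ref[name], dev[name]
        if len(r) != len(d) or any(abs(a - b) > tol for a, b in zip(r, d)):
            return name
    return None


# Toy lowering of one layer into three ops; the simulated device has a bad bias.
reference_ops = [
    ("scale", lambda t: [v * 2.0 for v in t]),
    ("bias", lambda t: [v + 1.0 for v in t]),
    ("relu", lambda t: [max(v, 0.0) for v in t]),
]
device_ops = [
    reference_ops[0],
    ("bias", lambda t: [v + 1.25 for v in t]),  # injected bug
    reference_ops[2],
]

print(first_mismatching_op(reference_ops, device_ops, [-1.0, 2.0]))  # prints bias
```

Recording every intermediate tensor is only done for the already-shortened network, which keeps the extra output small compared with instrumenting the full model.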

Assignees

Amazon Tech Inc

Inventors

Classifications

G06N3/048
Activation functions · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/063 (Primary)
using electronic means · CPC title
G06N3/08
Learning methods · CPC title
G06F9/3861
Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title

Patent family

Related publications grouped by family.

View patent family 71728910

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020410354A1 cover?: Techniques are disclosed for debugging a neural network execution on a target processor. A reference processor may generate a plurality of first reference tensors for the neural network. The neural network may be repeatedly reduced to produce a plurality of lengths. For each of the lengths, a compiler converts the neural network into first machine instructions, the target processor executes the…
Who is the assignee on this patent?: Amazon Tech Inc
What technology area does this patent fall under?: Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 31, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).