Neural network layer-by-layer debugging

US2020410354A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2020410354-A1
Application number: US-201916455329-A
Country: US
Kind code: A1
Filing date: Jun 27, 2019
Priority date: Jun 27, 2019
Publication date: Dec 31, 2020
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are disclosed for debugging a neural network execution on a target processor. A reference processor may generate a plurality of first reference tensors for the neural network. The neural network may be repeatedly reduced to produce a plurality of lengths. For each of the lengths, a compiler converts the neural network into first machine instructions, the target processor executes the first machine instructions to generate a first device tensor, and the debugger program determines whether the first device tensor matches a first reference tensor. A shortest length is identified for which the first device tensor does not match the first reference tensor. Tensor output is enabled for a lower-level intermediate representation of the shortest neural network, and the neural network is converted into second machine instructions, which are executed by the target processor to generate a second device tensor.
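
The search described in the abstract can be illustrated with a small sketch. This is not the patent's implementation. It assumes layers are plain Python functions, tensors are lists of floats, and "compile and execute on the target processor" is replaced by running a truncated list of layers. All names (`run_layers`, `shortest_failing_length`, `DEVICE_LAYERS`, and so on) are hypothetical.

```python
import math
from typing import Callable, List, Optional, Sequence

Tensor = List[float]
Layer = Callable[[Tensor], Tensor]


def run_layers(layers: Sequence[Layer], x: Tensor) -> List[Tensor]:
    """Run layers in order and return the output tensor of every layer."""
    outputs = []
    for layer in layers:
        x = layer(x)
        outputs.append(x)
    return outputs


def tensors_match(a: Tensor, b: Tensor, tol: float = 1e-6) -> bool:
    """Element-wise comparison with a tolerance (the tolerance is an assumption)."""
    return len(a) == len(b) and all(
        math.isclose(p, q, rel_tol=tol, abs_tol=tol) for p, q in zip(a, b)
    )


def shortest_failing_length(
    reference_layers: Sequence[Layer],
    device_layers: Sequence[Layer],
    sample_input: Tensor,
) -> Optional[int]:
    """Return the shortest network length whose device output mismatches."""
    # Reference tensors: one per layer, produced by the trusted implementation.
    reference_tensors = run_layers(reference_layers, sample_input)
    shortest = None
    # Repeatedly reduce the network, from full length down to a single layer.
    for length in range(len(reference_layers), 0, -1):
        # Stand-in for "compile the truncated network and run it on the target".
        device_tensor = run_layers(device_layers[:length], sample_input)[-1]
        if not tensors_match(device_tensor, reference_tensors[length - 1]):
            shortest = length  # lengths only decrease, so the last hit is the shortest
    return shortest


def scale2(x: Tensor) -> Tensor:
    return [2.0 * v for v in x]


def add1(x: Tensor) -> Tensor:
    return [v + 1.0 for v in x]


def relu(x: Tensor) -> Tensor:
    return [max(0.0, v) for v in x]


def buggy_relu(x: Tensor) -> Tensor:
    # Simulated device fault: absolute value instead of ReLU.
    return [abs(v) for v in x]


def half(x: Tensor) -> Tensor:
    return [0.5 * v for v in x]


REFERENCE_LAYERS = [scale2, add1, relu, half]
DEVICE_LAYERS = [scale2, add1, buggy_relu, half]

if __name__ == "__main__":
    print(shortest_failing_length(REFERENCE_LAYERS, DEVICE_LAYERS, [-1.0, 2.0]))
```

In this toy setup the networks of length 1 and 2 match the reference and those of length 3 and 4 do not, so the search points at the third layer as the first place where the device diverges.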

First claim

Opening claim text (preview).

What is claimed is:

1. A method of debugging a neural network execution on a target processor, the method comprising: receiving, by a debugger program operating on a host system, a request to debug an execution of a neural network on the target processor, the neural network comprising a plurality of layers; generating, using a reference processor on the host system and based on a first sample input, a plurality of first reference tensors for the neural network; repeatedly reducing the plurality of layers of the neural network to produce a plurality of lengths, and for each particular length of a plurality of lengths: converting, by a compiler operating on the host system, the neural network having the particular length into first machine instructions; executing, using the target processor and based on the first sample input or on one of the plurality of first reference tensors, the first machine instructions to generate a first device tensor; and determining, by the debugger program, whether the first device tensor matches a first reference tensor of the plurality of first reference tensors; identifying a shortest length of the plurality of lengths for which the first device tensor does not match the first reference tensor; generating, using the reference processor and based on a second sample input, a plurality of second reference tensors for a lower-level representation of the neural network having the shortest length; converting, by the compiler, the neural network having the shortest length into second machine instructions, wherein the second machine instructions includes additional instructions that enable tensor output for the lower-level representation; executing, using the target processor and based on the second sample input or on one of the plurality of second reference tensors, the second machine instructions to generate a second device tensor for the lower-level representation; and determining, by the debugger program, whether the second device tensor matches a second reference tensor of the plurality of second reference tensors.

2. The method of claim 1, wherein the additional instructions enable tensor output for multiple lower-level representations of the neural network.

3. The method of claim 2, wherein executing the second machine instructions further generates a third device tensor for a second lower-level representation of the neural network, wherein the lower-level representation is a first lower-level representation.

4. The method of claim 3, further comprising: determining, by the debugger program, whether the third device tensor matches a third reference tensor of the plurality of second reference tensors.

5. The method of claim 1, wherein the plurality of first reference tensors and the plurality of second reference tensors are generated by the debugger program.

6. A method of debugging a neural network execution on a target processor, the method comprising: receiving a plurality of first reference tensors for a neural network; repeatedly reducing a plurality of layers of the neural network to produce a plurality of lengths, and for each particular length of a plurality of lengths: converting, by a compiler, the neural network having the particular length into first machine instructions; executing, using the target processor, the first machine instructions to generate a first device tensor; and determining whether the first device tensor matches a first reference tensor of the plurality of first reference tensors; identifying a shortened length of the plurality of lengths for which the first device tensor does not match the first reference tensor; generating a plurality of second reference tensors for a lower-level representation of the neural network having the shortened length; converting, by the compiler, the neural network having the shortened length into second machine instructions; and executing, using the target processor, the second machine instructions to generate a second device tensor for the lower-level representation.

7. The method of claim 6, wherein the shortened length is a shortest length of the plurality of lengths.

8. The method of claim 6, further comprising: determining, by the debugger program, whether the second device tensor matches a second reference tensor of the plurality of second reference tensors.

9. The method of claim 6, wherein the second machine instructions include additional instructions that enable tensor output for the lower-level representation.

10. The method of claim 9, wherein the additional instructions enable tensor output for multiple lower-level representations of the neural network.

11. The method of claim 10, wherein executing the second machine instructions further generates a third device tensor for a second lower-level representation of the neural network, wherein the lower-level representation is a first lower-level representation.

12. The method of claim 11, further comprising: determining, by the debugger program, whether the third device tensor matches a third reference tensor of the plurality of second reference tensors.

13. The method of claim 6, wherein the plurality of first reference tensors and the plurality of second reference tensors are generated by the debugger program.

14. A non-transitory computer-readable medium comprising instructions that, when executed by one or more processors, cause the one or more processors to perform operations including: receiving a plurality of first reference tensors for a neural network; repeatedly reducing a plurality of layers of the neural network to produce a plurality of lengths, and for each particular length of a plurality of lengths: converting, by a compiler, the neural network having the particular length into first machine instructions; executing, using the target processor, the first machine instructions to generate a first device tensor; and determining whether the first device tensor matches a first reference tensor of the plurality of first reference tensors; identifying a shortened length of the plurality of lengths for which the first device tensor does not match the first reference tensor; generating a plurality of second reference tensors for a lower-level representation of the neural network having the shortened length; converting, by the compiler, the neural network having the shortened length into second machine instructions; and executing, using the target processor, the second machine instructions to generate a second device tensor for the lower-level representation.

15. The non-transitory computer-readable medium of claim 14, wherein the shortened length is a shortest length of the plurality of lengths.

16. The non-transitory computer-readable medium of claim 14, wherein the operations further comprise: determining, by the debugger program, whether the second device tensor matches a second reference tensor of the plurality of second reference tensors.

17. The non-transitory computer-readable medium of claim 14, wherein the second machine instructions include additional instructions that enable tensor output for the lower-level representation.

18. The non-transitory computer-readable medium of claim 17, wherein the additional instructions enable tensor output for multiple lower-level representations of the neural network.

19. The non-transitory computer-readable medium of claim 18, wherein executing the second machine instructions further generates a third device tensor for a s

Assignees

Inventors

Classifications

  • Activation functions · CPC title

  • Combinations of networks · CPC title

  • G06N3/063 (Primary)

    using electronic means · CPC title

  • Learning methods · CPC title

  • Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020410354A1 cover?
Techniques are disclosed for debugging a neural network execution on a target processor. A reference processor may generate a plurality of first reference tensors for the neural network. The neural network may be repeatedly reduced to produce a plurality of lengths. For each of the lengths, a compiler converts the neural network into first machine instructions, the target processor executes the…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 31, 2020 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).