Hardware platform specific operator fusion in machine learning

US2021182036A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2021182036-A1
Application number: US-201916712449-A
Country: US
Kind code: A1
Filing date: Dec 12, 2019
Priority date: Dec 12, 2019
Publication date: Jun 17, 2021
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and associated apparatus for generating a neural network computation graph. The method includes receiving, by a compiler, a computation graph representing a neural network. The computation graph includes a plurality of nodes, each node associated with an operator of the neural network. The compiler receives a list of fusion patterns associated with a target hardware execution device, and analyzes the computation graph using the list of fusion patterns. The compiler generates one or more fused operators based on the analysis, each fused operator including at least two operators of the plurality of operators which can be fused. The compiler generates a new computation graph representing the neural network that includes at least a first fused operator of the generated one or more fused operators.
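The flow in the abstract (graph in, pattern list in, fused graph out) can be illustrated with a minimal sketch. It is not the patent's implementation: the `Node` class, the `fuse_graph` and `match_chain` helpers, the operator names, and the restriction to straight-line chains with a single consumer are all assumptions made for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    op: str
    inputs: list = field(default_factory=list)  # names of producer nodes


def match_chain(head, pattern, by_name, consumers):
    """Return the nodes matching `pattern` as a chain starting at `head`, or None."""
    if head.op != pattern[0]:
        return None
    chain = [head]
    for op in pattern[1:]:
        users = consumers[chain[-1].name]
        if len(users) != 1:  # intermediate result is needed elsewhere: do not fuse
            return None
        nxt = by_name[users[0]]
        if nxt.op != op or nxt.inputs != [chain[-1].name]:
            return None
        chain.append(nxt)
    return chain


def fuse_graph(nodes, patterns):
    """nodes: topologically ordered list; patterns: tuples of op names, tried in list order."""
    by_name = {n.name: n for n in nodes}
    consumers = defaultdict(list)
    for n in nodes:
        for src in n.inputs:
            consumers[src].append(n.name)

    renamed = {}    # original node name -> name of the fused node that absorbed it
    new_nodes = []
    for n in nodes:
        if n.name in renamed:  # already absorbed into an earlier fused operator
            continue
        chain = None
        for pattern in patterns:
            chain = match_chain(n, pattern, by_name, consumers)
            if chain:
                break
        inputs = [renamed.get(src, src) for src in n.inputs]
        if chain:
            fused = Node("+".join(m.name for m in chain),
                         "fused_" + "_".join(m.op for m in chain),
                         inputs)
            for m in chain:
                renamed[m.name] = fused.name
            new_nodes.append(fused)
        else:
            new_nodes.append(Node(n.name, n.op, inputs))
    return new_nodes


# Hypothetical four-operator network and a pattern list for an imagined target device.
graph = [
    Node("c1", "conv", ["x"]),
    Node("b1", "bn", ["c1"]),
    Node("r1", "relu", ["b1"]),
    Node("p1", "pool", ["r1"]),
]
patterns = [("conv", "bn", "relu"), ("conv", "bn")]
new_graph = fuse_graph(graph, patterns)
for node in new_graph:
    print(node.name, node.op, node.inputs)
```

Because the pattern list is passed in as data, the same pass could be reused with a different list for a different target device, which is the hardware-specific aspect the abstract describes.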

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, by a compiler, a computation graph representing a neural network, the computation graph comprising a plurality of nodes, each node associated with an operator of the neural network; receiving, by the compiler, a list of fusion patterns associated with a target hardware execution device; analyzing, by the compiler, the computation graph using the list of fusion patterns; generating one or more fused operators based on the analysis, each fused operator comprising at least two operators of the plurality of operators which can be fused; and generating, by the compiler, a new computation graph representing the neural network that includes at least a first fused operator of the generated one or more fused operators.

2. The method of claim 1, further comprising determining, based on a cost model associated with the target hardware execution device, a computation cost associated with the generating of each of the one or more fused operators, and wherein the analyzing is based on the computation cost associated with the generating of each of the one or more fused operators.

3. The method of claim 1 wherein each fusion pattern in the list of fusion patterns is associated with a condition for generating a fused operator.

4. The method of claim 3, wherein the condition relates to at least one of a memory allocation requirement associated with the fused operator, a size of a feature map input to a layer of the neural network, and a size of a filter of a layer of the neural network.

5. The method of claim 4, wherein the neural network includes a convolution layer and the condition specifies a constraint on at least one of a shape of a kernel of the convolution layer, a size of the kernel of convolution layer, and a data type of an execution kernel associated with the fused operator.

6. The method of claim 1, wherein each of the generated one or more fused operators specify a dataflow of computations which are equivalent to the dataflow of computations of the plurality of nodes of the computation graph representing the neural network.

7. The method of claim 6, further comprising outputting the generated one or more fused operators to the target hardware execution device for execution.

8. The method of claim 7, further comprising assigning priorities to each fusion pattern in the list of fusion patterns based on a cost model.

9. The method of claim 8, wherein the generated one or more fused operators are output to the target hardware execution device for execution in accordance with the priorities assigned to each fusion pattern in the list of fusion patterns.

10. A non-transitory computer readable medium storing instructions executable in one or more processors, the instructions when executed in the one or more processors causing operations comprising: receiving, by a compiler, a computation graph representing a neural network, the computation graph comprising a plurality of operators of the neural network; receiving, by the compiler, a list of fusion patterns associated with a target hardware execution device; analyzing, by the compiler, the computation graph using the list of fusion patterns and generating one or more fused operators based on the analysis, each fused operator comprising at least two operators of the plurality of operators which can be fused; and generating, by the compiler, a new computation graph representing the neural network that includes at least a first fused operator of the generated one or more fused operators.

11. The non-transitory computer readable medium of claim 10, wherein the instructions are executable to cause operations comprising assigning priorities to each fusion pattern in the list of fusion patterns based on a cost model.

12. The non-transitory computer readable medium of claim 10, further comprising determining, based at least in accordance with the cost model, a computation cost associated with the generating of the one or more fused operators.

13. The non-transitory computer readable medium of claim 12, wherein, in accordance with the cost model, the computation cost is determined based on generating the one or more fused operators at the target hardware execution device.

14. The non-transitory computer readable medium of claim 10, wherein the list of fusion patterns specifies a condition for generating a fused operator based on the plurality of operators.

15. The non-transitory computer readable medium of claim 14, wherein the condition relates to at least one of a memory allocation requirement associated with the fused operator, an input feature relating to supported operator fusion patterns for the target hardware execution device, and a filter size in accordance with a neural network layer of the neural network.

16. The non-transitory computer readable medium of claim 14, wherein the condition specifies a constraint on at least one of a kernel shape, a kernel size and a data type of an underlying execution kernel associated with the fused operator.

17. The non-transitory computer readable medium of claim 10, wherein the generated one or more fused operators specify a flow of computations in accordance with a plurality of nodes of the neural network.

18. The non-transitory computer readable medium of claim 17, the instructions being executable to cause operations comprising providing the generated one or more fused operators to the target hardware execution device associated with a set of hardware platform specific patterns.

19. The non-transitory computer readable medium of claim 17, wherein the generated one or more fused operators are in accordance with a set of priorities assigned to each the set of hardware platform specific patterns as provided to the compiler.

20. An apparatus comprising: a processor; and a memory storing instructions that when executed by the processor cause the apparatus to: receive a computation graph representing a neural network, the computation graph comprising a plurality of nodes, each node associated with an operator of the neural network; receive a list of fusion patterns associated with a target hardware execution device; analyze the computation graph using the list of fusion patterns; generate one or more fused operators based on the analysis, each fused operator comprising at least two operators of the plurality of operators which can be fused; and generate a new computation graph representing the neural network that includes at least a first fused operator of the generated one or more fused operators.
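The dependent claims add two ideas on top of plain pattern matching: a per-pattern condition (for example on kernel size or data type) and priorities derived from a cost model. The sketch below shows one way these could fit together. The `FusionPattern` class, the `toy_cost` formula, the `prioritize` helper, and every attribute name are hypothetical. The claims do not specify a cost formula, so the one here is invented purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FusionPattern:
    ops: tuple
    condition: Callable[[dict], bool]  # layer attributes -> is fusion allowed?


def toy_cost(ops, attrs, fused):
    """Invented cost model: one unit of compute per feature-map element per operator,
    plus a write and a read of the feature map at every unfused operator boundary."""
    elems = attrs["height"] * attrs["width"] * attrs["channels"]
    compute = len(ops) * elems
    boundaries = 0 if fused else len(ops) - 1
    return compute + 2 * elems * boundaries


def prioritize(patterns, attrs):
    """Drop patterns whose condition fails, then order the rest by estimated saving."""
    candidates = []
    for p in patterns:
        if not p.condition(attrs):
            continue
        saving = toy_cost(p.ops, attrs, fused=False) - toy_cost(p.ops, attrs, fused=True)
        candidates.append((saving, p))
    candidates.sort(key=lambda c: c[0], reverse=True)  # largest saving first
    return [p for _, p in candidates]


# Hypothetical layer attributes and device-specific pattern list.
attrs = {"height": 8, "width": 8, "channels": 4, "kernel_size": 3, "dtype": "int8"}
patterns = [
    FusionPattern(("conv", "relu"), lambda a: True),
    FusionPattern(("conv", "pool"), lambda a: a["kernel_size"] == 5),
    FusionPattern(("conv", "bn", "relu"),
                  lambda a: a["kernel_size"] in (1, 3) and a["dtype"] == "int8"),
]
ordered = prioritize(patterns, attrs)
for p in ordered:
    print(p.ops)
```

With these made-up numbers the three-operator pattern ranks first because it removes two intermediate memory round trips instead of one, and the `("conv", "pool")` pattern is dropped because its kernel-size condition is not met.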

Classifications

  • Combinations of networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Knowledge engineering; Knowledge acquisition · CPC title

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • using electronic means · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021182036A1 cover?
A method and associated apparatus for generating a neural network computation graph. The method includes receiving, by a compiler, a computation graph representing a neural network. The computation graph includes a plurality of nodes, each node associated with an operator of the neural network. The compiler receives a list of fusion patterns associated with a target hardware execution device, a…
Who is the assignee on this patent?
Shafiq Farhan, Tian Ye, Elhoushi Mostafa, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F8/447. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 17, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).