Processing computational graphs

US10534997B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10534997-B2
Application numberUS-201815965742-A
CountryUS
Kind codeB2
Filing dateApr 27, 2018
Priority dateOct 28, 2015
Publication dateJan 14, 2020
Grant dateJan 14, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a request from a client to process a computational graph; obtaining data representing the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node; identifying a plurality of available devices for performing the requested operation; partitioning the computational graph into a plurality of subgraphs, each subgraph comprising one or more nodes in the computational graph; and assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices for operation.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: receiving a request from a client to process a computational graph; obtaining data representing the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective inference or training operation for a neural network, and wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node; identifying a plurality of available devices for processing the computational graph; and processing the computational graph using the plurality of available devices, comprising: analyzing the computational graph to identify groups of nodes that operate on shared data flowing on directed edges to the group of nodes, partitioning the computational graph into a plurality of subgraphs, each subgraph comprising one or more nodes in the computational graph, comprising generating, for each identified group of nodes, a respective subgraph including the identified group of nodes, assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices for processing, and, processing the computational graph by causing each device to perform the operations assigned to the device. 2. The method of claim 1 , wherein the request specifies one or more particular outputs from one or more respective nodes, further comprising: receiving, from a device to which the one or more respective nodes are assigned, the one or more particular outputs; and providing the one or more particular outputs to the client. 3. The method of claim 1 , wherein the request comprises labels partitioning the computational graph into a plurality of predetermined subgraphs, and wherein partitioning the computational graph comprises partitioning the computational graph into the plurality of predetermined subgraphs. 4. The method of claim 1 , wherein each device in the plurality of available devices is a hardware resource that performs operations independent of other devices in the plurality of available devices. 5. The method of claim 1 , wherein assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective device comprises assigning the operations to a device having a computational capability necessary to perform the operations represented by the nodes in the subgraph. 6. The method of claim 1 , further comprising: analyzing the computational graph to identify groups of nodes arranged in a chain structure, and wherein the partitioning comprises generating, for each identified group of nodes arranged in a chain structure, a respective subgraph including the identified group of nodes arranged in a chain structure. 7. The method of claim 1 , further comprising: prior to assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices, determining an initial assignment of operations represented by one or more initial nodes in subgraphs to available devices, wherein the one or more initial nodes are nodes from which data starts flowing in the respective subgraph for the one or more initial nodes; monitoring the devices to determine statistics; adjusting the initial assignment using the statistics; and reassigning the operations of the subgraphs to the devices based on the adjusted initial assignment. 8. The method of claim 7 , further comprising repeating the monitoring, adjusting, and reassigning until a threshold amount of improvement has been achieved. 9. The method of claim 7 , wherein the statistics comprise a respective operation time or a respective idle time for each subgraph. 10. The method of claim 1 , wherein the plurality of available devices comprises: a device having a central processing unit (CPU), and at least one device having a processing unit that is not a CPU. 11. The method of claim 10 , wherein assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices for processing comprises: determining, for each subgraph, that the operations represented by the one or more nodes in the subgraph can only be processed by a particular type of processing unit; and in response, assigning the subgraph to a respective available device in the plurality of available devices having the particular type of processing unit. 12. A system comprising: one or more computers; and computer-readable medium coupled to the one or more computers and having instructions stored thereon, which, when executed by the one or more computers, cause the one or more computers to perform operations comprising: obtaining data representing a computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective inference or training operation for a neural network, and wherein each directed edge connects a respective first node to a respective second node that represents an operation that receives, as input, an output of an operation represented by the respective first node; identifying a plurality of available devices for processing the computational graph; and processing the computational graph using the plurality of available devices, comprising: analyzing the computational graph to identify groups of nodes that operate on shared data flowing on directed edges to the group of nodes, partitioning the computational graph into a plurality of subgraphs, each subgraph comprising one or more nodes in the computational graph, comprising generating, for each identified group of nodes, a respective subgraph including the identified group of nodes, assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices for processing, and, processing the computational graph by causing each device to perform the operations assigned to the device. 13. The system of claim 12 , the operations further comprising: prior to assigning, for each subgraph, the operations represented by the one or more nodes in the subgraph to a respective available device in the plurality of available devices, determining an initial assignment of operations represented by one or more initial nodes in subgraphs to available devices, wherein the one or more initial nodes are nodes from which data starts flowing in the respective subgraph for the one or more initial nodes; monitoring the devices to determine statistics; adjusting the initial assignment using the statistics; and reassigning the operations of the subgraphs to the devices based on the adjusted initial assignment. 14. The system of claim 13 , the operations further comprising repeating the monitoring, adjusting, and reassigning until a threshold amount of improvement has been achieved. 15. The system of claim 13 , wherein the statistics comprise a respective operation time or a respective idle time for each subgraph. 16. The system of claim 12 , wherein the plurality of available devices comprises: a device having a central processing unit (CPU), and at least one device having a processing unit that is not a CPU. 17. The system of claim 16 , wherein assigning, for each subgraph, the operations represented by the one or m

Assignees

Inventors

Classifications

  • G06F9/5066Primary

    Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs (mappping at compile time, see G06F8/451) · CPC title

  • Distributed learning, e.g. federated learning · CPC title

  • Machine learning · CPC title

  • G06F9/5005Primary

    to service a request · CPC title

  • Inference or reasoning models · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10534997B2 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for receiving a request from a client to process a computational graph; obtaining data representing the computational graph, the computational graph comprising a plurality of nodes and directed edges, wherein each node represents a respective operation, wherein each directed edge connects a respectiv…
Who is the assignee on this patent?
Google Llc
What technology area does this patent fall under?
Primary CPC classification G06F9/5066. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 14 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).