Device placement optimization with reinforcement learning

US10692003B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10692003-B2
Application number: US-201916445330-A
Country: US
Kind code: B2
Filing date: Jun 19, 2019
Priority date: Mar 24, 2017
Publication date: Jun 23, 2020
Grant date: Jun 23, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective operations necessary to perform the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network to generate a network output that defines a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the multiple hardware devices by placing the operations on the multiple devices according to the placement defined by the network output.
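As an illustration of how a network output could define a placement, the following minimal sketch turns per-device scores into one device index per operation embedding. It is not taken from the patent: the function names, the example scores, and the use of a softmax to turn scores into probabilities are all assumptions made for this example. The two modes correspond loosely to picking the highest-scoring device and to sampling a device according to the scores.

```python
import math
import random


def softmax(scores):
    """Convert raw scores to probabilities (an assumed normalization)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_devices(score_sets, mode="greedy", rng=None):
    """Pick one device index per operation embedding.

    score_sets: one list of per-device scores for each operation embedding.
    mode: "greedy" takes the highest score; "sample" draws from softmax(scores).
    """
    rng = rng or random.Random()
    placement = []
    for scores in score_sets:
        devices = range(len(scores))
        if mode == "greedy":
            device = max(devices, key=lambda d: scores[d])
        elif mode == "sample":
            device = rng.choices(devices, weights=softmax(scores), k=1)[0]
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        placement.append(device)
    return placement


# Hypothetical scores for three operation embeddings over two devices.
example_scores = [[2.0, 0.5], [0.1, 1.3], [0.7, 0.2]]
greedy_placement = select_devices(example_scores)  # [0, 1, 0]
```

In this sketch the scores are hard-coded. In the method described above they would come from the placement recurrent neural network.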

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving data specifying a machine learning model to be placed for distributed processing on a plurality of hardware devices; generating, from the data specifying the machine learning model, a sequence of operation embeddings, wherein each operation embedding in the sequence characterizes one or more respective operations that are part of performing the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network, wherein the placement recurrent neural network is configured to process the sequence of operation embeddings in accordance with the first values to generate a network output defining a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the plurality of hardware devices by placing the operations on the plurality of devices according to the placement defined by the network output.

2. The method of claim 1, wherein the operations characterized by the operation embeddings are operations that are part of training the machine learning model.

3. The method of claim 1, wherein the operations characterized by the operation embeddings are operations that are part of performing an inference using the machine learning model.

4. The method of claim 1, wherein the data specifying the machine learning model is data representing a computational graph having vertices that represent operations and edges that represent data communicated between the operations.

5. The method of claim 4, wherein generating the sequence of operation embeddings comprises: determining that two or more of the operations represented by vertices in the computational graph are to be co-located on the same device; and in response, generating a single operation embedding that characterizes the two or more operations.

6. The method of claim 1, wherein generating an operation embedding characterizing a particular operation comprises: generating a type embedding of an operation type of the particular operation; generating an output size embedding that characterizes a size of outputs generated by the particular operation; generating an adjacency embedding that identifies operations that provide input to and receive output generated by the particular operation; and combining the type embedding, the output size embedding, and the adjacency embedding to generate the operation embedding characterizing the particular operation.

7. The method of claim 1, wherein the recurrent neural network is configured to generate, for each of the operation embeddings in the sequence, a set of scores that includes a respective score for each of the plurality of devices, and wherein processing the sequence of operation embeddings comprises selecting a device for each of the operations using the set of scores for the operation embedding characterizing the operation.

8. The method of claim 7, wherein selecting the device for each of the operations comprises: selecting the device having the highest score according to the set of scores for the operation embedding characterizing the operation.

9. The method of claim 7, wherein selecting the device for each of the operations comprises: sampling a device from the plurality of devices according to probabilities defined by the set of scores for the operation embedding characterizing the operation.

10. The method of claim 7, wherein the recurrent neural network comprises: an encoder recurrent neural network configured to process the sequence of operation embeddings to generate a respective encoder hidden state for each of the operation embeddings; and a decoder neural network configured to, for each of the operation embeddings: receive a decoder input; and process the decoder input and the encoder hidden states to generate the set of scores for the operation embedding.

11. The method of claim 10, wherein the decoder input for each of the operation embeddings after a first operation embedding in the sequence identifies a device selected for the one or more operations represented by the preceding operation embedding in the sequence.

12. The method of claim 1, further comprising: determining the first values of the network parameters from initial values of the network parameters by repeatedly performing the following: processing a current sequence of operation embeddings using a placement recurrent neural network in accordance with current values of a plurality of network parameters of the placement recurrent neural network to select one or more placements of the operations across the plurality of devices; for each selected placement: performing the processing of the machine learning model with the operations across the plurality of devices according to the placement, and determining a time required for the processing to complete; and adjusting the current values of the parameters using a reinforcement learning technique that uses a reward derived from the times required for the processing to complete for each of the selected placements.

13. The method of claim 12, wherein the reinforcement learning technique is a REINFORCE technique.

14. The method of claim 12, wherein the reinforcement learning technique includes a baseline that is a moving average of the required times.

15. The method of claim 12, wherein adjusting the current values of the parameters further comprises adjusting the operation embeddings in the current sequence as part of the reinforcement learning technique.

16. A system comprising one or more computers and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform operations comprising: receiving data specifying a machine learning model to be placed for distributed processing on a plurality of hardware devices; generating, from the data specifying the machine learning model, a sequence of operation embeddings, wherein each operation embedding in the sequence characterizes one or more respective operations that are part of performing the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network, wherein the placement recurrent neural network is configured to process the sequence of operation embeddings in accordance with the first values to generate a network output defining a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the plurality of hardware devices by placing the operations on the plurality of devices according to the placement defined by the network output.

17. One or more non-transitory computer storage media storing instructions that, when executed by one or more computers, cause the one or more computers to perform operations comprising: receiving data specifying a machine learning model to be placed for distributed processing on a plurality of hardware devices; generating, from the data specifying the machine learning model, a sequence of operation embeddings, wherein each operation embedding in the sequence characterizes one or more respective operations tha
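The training loop recited in claims 12 to 14 (sample placements, time them, then adjust parameters with a REINFORCE-style update and a moving-average baseline) can be sketched roughly as follows. This is a toy under loud assumptions, not the patented implementation: a table of independent per-operation logits stands in for the placement recurrent neural network, `simulated_runtime` is a hypothetical cost function that replaces actually running the model on hardware, and the reward is assumed to be the negative of the measured time. All names and hyperparameters are invented for the example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_runtime(placement):
    """Hypothetical stand-in for timing the model under a placement.

    Charges 1.0 per operation on the busiest device, plus 0.5 for every
    pair of consecutive operations placed on different devices.
    """
    load = max(placement.count(d) for d in set(placement))
    transfers = sum(1 for a, b in zip(placement, placement[1:]) if a != b)
    return load * 1.0 + transfers * 0.5


def train(num_ops=4, num_devices=2, steps=200, lr=0.1, decay=0.9, seed=0):
    """REINFORCE-style updates with a moving-average baseline of the times."""
    rng = random.Random(seed)
    # Assumption: a logit table replaces the placement recurrent neural network.
    logits = [[0.0] * num_devices for _ in range(num_ops)]
    baseline = None
    for _ in range(steps):
        probs = [softmax(row) for row in logits]
        placement = [
            rng.choices(range(num_devices), weights=p, k=1)[0] for p in probs
        ]
        elapsed = simulated_runtime(placement)
        if baseline is None:
            baseline = elapsed
        # With reward = -time, the advantage is (baseline time - measured time).
        advantage = baseline - elapsed
        for op, device in enumerate(placement):
            for d in range(num_devices):
                # Gradient of log-probability of the sampled device w.r.t. logit d.
                grad = (1.0 if d == device else 0.0) - probs[op][d]
                logits[op][d] += lr * advantage * grad
        # Exponential moving average of the measured times.
        baseline = decay * baseline + (1.0 - decay) * elapsed
    return logits


trained_logits = train()
```

Placements that finish faster than the running average get their sampled devices reinforced, and slower ones are discouraged. Whether this toy settles on a good placement depends on the assumed cost function and hyperparameters.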

Assignees

Google LLC

Inventors

Classifications

  • Combinations of networks · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Auto-encoder networks; Encoder-decoder networks · CPC title

  • G06N3/092 (Primary)

    Reinforcement learning · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10692003B2 cover?
A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective oper…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/092. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 23, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).