What technology area does this patent fall under?

Primary CPC classification G06N3/044. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 29, 2021 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Neural architecture search

US2021232929A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2021232929-A1
Application number	US-202117232803-A
Country	US
Kind code	A1
Filing date	Apr 16, 2021
Priority date	Oct 27, 2017
Publication date	Jul 29, 2021
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining neural network architectures. One of the methods includes generating, using a controller neural network, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network that should be active during the processing of inputs by the large neural network; for each output sequence in the batch: determining a performance metric of the large neural network on the particular neural network task (i) in accordance with current values of the large network parameters and (ii) with only the subset of components specified by the output sequences active; and using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters of the controller neural network.
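The loop described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the controller is reduced to one logit per component (independent on/off choices), `toy_metric` is a hypothetical stand-in for evaluating the large network with its parameters held fixed, and the batch-mean baseline, learning rate, and all function names are invented for the example. The update is a plain REINFORCE policy-gradient step, one of the techniques the claims name.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_mask(logits, rng):
    """Sample one output sequence: a 0/1 'active' flag per component."""
    return [1 if rng.random() < sigmoid(l) else 0 for l in logits]


def toy_metric(mask):
    """Hypothetical performance metric (assumption, not from the patent).

    Stands in for running the large network, with fixed parameters, using
    only the active components. Components 0 and 2 help; the others hurt.
    """
    useful = {0, 2}
    return sum((1.0 if i in useful else -0.5) * m for i, m in enumerate(mask))


def controller_step(logits, rng, batch_size=16, lr=0.1):
    """One REINFORCE update of the controller parameters from a batch."""
    masks = [sample_mask(logits, rng) for _ in range(batch_size)]
    rewards = [toy_metric(m) for m in masks]
    baseline = sum(rewards) / batch_size  # batch mean reduces variance
    probs = [sigmoid(l) for l in logits]
    new_logits = list(logits)
    for mask, reward in zip(masks, rewards):
        for i, m in enumerate(mask):
            # Gradient of log Bernoulli(m | sigmoid(logit_i)) w.r.t. logit_i
            # is (m - p_i); scale it by the baseline-adjusted reward.
            new_logits[i] += lr * (reward - baseline) * (m - probs[i]) / batch_size
    return new_logits


def train_controller(num_components=4, steps=300, seed=0):
    """Repeat controller updates; the toy metric never changes."""
    rng = random.Random(seed)
    logits = [0.0] * num_components
    for _ in range(steps):
        logits = controller_step(logits, rng)
    return logits
```

Calling `train_controller()` should push the logits for the helpful components up and the others down, so later samples activate the better subset more often. In the method as claimed, the metric would come from the large network itself rather than from a fixed scoring rule.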

First claim

Opening claim text (preview).

What is claimed is:
1. (canceled)
2. A method of determining an architecture for a neural network for performing a particular neural network task, the method comprising: generating, in accordance with current values of a plurality of controller parameters, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network that should be active during the processing of one or more inputs by the large neural network, wherein the large neural network has a plurality of large network parameters; for each output sequence in the batch: determining a performance metric of the large neural network on the particular neural network task (i) in accordance with current values of the large network parameters and (ii) with only the components specified by the output sequence active; and using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters.
3. The method of claim 2, further comprising: generating, in accordance with the adjusted values of the controller parameters, a new output sequence; and training the large neural network with only the components specified by the new output sequence active on training data to determine adjusted values of the large network parameters.
4. The method of claim 2, wherein using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters comprises: adjusting the current values of the controller parameters to cause generated output sequences to have increased performance metrics using a reinforcement learning technique.
5. The method of claim 4, wherein the reinforcement learning technique is a policy gradient technique.
6. The method of claim 5, wherein the reinforcement learning technique is a REINFORCE technique.
7. The method of claim 2, wherein the large neural network comprises a plurality of layers.
8. The method claim 2, wherein the current values of the large network parameters are fixed while determining the performance of the large neural network.
9. The method of claim 2, wherein each output sequence comprises respective outputs at each of a plurality of time steps, wherein each time step corresponds to a respective node in a directed acyclic graph (DAG) that represents the large neural network, wherein the DAG comprises a plurality of edges connecting nodes in the DAG, and wherein the output sequence defines, for each node, an input received by the node and a computation performed by the node.
10. The method of claim 9, wherein generating the batch of output sequences comprises: generating, for each particular node of a plurality of nodes in the DAG, at a first time step corresponding to the node, a probability distribution over nodes that are connected to the particular node by an incoming edge in the DAG.
11. The method of claim 9 wherein generating the batch of output sequences comprises: generating, for each particular node of a plurality of nodes in the DAG, at a first time step corresponding to the node, a respective independent probability for each node that is connected to the particular node by an incoming edge in the DAG that defines a likelihood that the edge will be designated as active.
12. The method of claim 10, for each particular node of the plurality of nodes in the DAG, at a second time step corresponding to the node, generating a probability distribution over possible computations performed by the particular node.
13. The method of claim 2, wherein the large neural network is a recurrent neural network.
14. The method of claim 2, wherein the large neural network is a convolutional neural network.
15. The method of claim 2, further comprising: generating, in accordance with the adjusted values of the controller parameters, a final output sequence that defines a final set of components.
16. The method of claim 15, performing the particular neural network task for received network inputs by processing the received network inputs with only the final set of components active.
17. A system comprising one or more computers and one or more storage devices storing instructions that when executed by the one or more computers cause the one or more computers to perform operations for determining an architecture for a neural network for performing a particular neural network task, the operations comprising: generating, in accordance with current values of a plurality of controller parameters, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network that should be active during the processing of one or more inputs by the large neural network, wherein the large neural network has a plurality of large network parameters; for each output sequence in the batch: determining a performance metric of the large neural network on the particular neural network task (i) in accordance with current values of the large network parameters and (ii) with only the components specified by the output sequence active; and using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters.
18. One or more non-transitory computer-readable storage media storing instructions that when executed by one or more computers cause the one or more computers to perform operations for determining an architecture for a neural network for performing a particular neural network task, the operations comprising: generating, in accordance with current values of a plurality of controller parameters, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network that should be active during the processing of one or more inputs by the large neural network, wherein the large neural network has a plurality of large network parameters; for each output sequence in the batch: determining a performance metric of the large neural network on the particular neural network task (i) in accordance with current values of the large network parameters and (ii) with only the components specified by the output sequence active; and using the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters.
19. The system of claim 17, the operations further comprising: generating, in accordance with the adjusted values of the controller parameters, a new output sequence; and training the large neural network with only the components specified by the new output sequence active on training data to determine adjusted values of the large network parameters.
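The node-by-node sampling recited in claims 9, 10, and 12 can be illustrated with a short Python sketch. It assumes a simplified setup that the claims do not specify: node 0 is the network input, each later node picks exactly one earlier node as its input, the candidate computations in `OPS` are hypothetical, and the controller is replaced by plain tables of logits (`input_logits`, `op_logits`) rather than a neural network that emits them step by step.

```python
import math
import random

# Hypothetical candidate computations; the claims do not enumerate any.
OPS = ["tanh", "relu", "sigmoid", "identity"]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Draw an index from a categorical distribution."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def sample_architecture(input_logits, op_logits, rng):
    """Sample an (node, input node, computation) triple for nodes 1..N-1.

    input_logits[k] holds one score per earlier node 0..k-1;
    op_logits[k] holds one score per entry in OPS.
    """
    arch = []
    for k in range(1, len(op_logits)):
        # First time step for node k: distribution over incoming edges.
        src = sample_index(softmax(input_logits[k][:k]), rng)
        # Second time step for node k: distribution over computations.
        op = OPS[sample_index(softmax(op_logits[k]), rng)]
        arch.append((k, src, op))
    return arch


if __name__ == "__main__":
    num_nodes = 4
    # All-zero logits give uniform distributions at every time step.
    input_logits = [[0.0] * k for k in range(num_nodes)]
    op_logits = [[0.0] * len(OPS) for _ in range(num_nodes)]
    for node, src, op in sample_architecture(input_logits, op_logits, random.Random(0)):
        print(f"node {node}: input from node {src}, computation {op}")
```

Because every node reads only from a lower-numbered node, the sampled edges always form a DAG. The variant in claim 11 would instead draw an independent on/off decision for each incoming edge, so a node could receive several inputs.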

Assignees

Google LLC

Inventors

Classifications

G06N3/044 · Primary
Recurrent networks, e.g. Hopfield networks · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/047
Probabilistic or stochastic networks · CPC title
G06N3/092
Reinforcement learning · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

View patent family 64427206
