Generating larger neural networks

US10699191B2 · US · B2

Patent metadata
Publication number: US-10699191-B2
Application number: US-201615349901-A
Country: US
Kind code: B2
Filing date: Nov 11, 2016
Priority date: Nov 12, 2015
Publication date: Jun 30, 2020
Grant date: Jun 30, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This specification describes methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a larger neural network from a smaller neural network. One of the described methods includes obtaining data specifying an original neural network and generating a larger neural network from the original neural network. The larger neural network has a larger neural network structure than the original neural network structure. The values of the parameters of the original neural network units and the additional neural network units are initialized so that the larger neural network generates the same outputs from the same inputs as the original neural network, and the larger neural network is trained to determine trained values of the parameters of the original neural network units and the additional neural network units from the initialized values.
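The initialization described in the abstract can be illustrated with a small sketch. It is not taken from the patent text: it assumes a two-layer fully connected network with a ReLU hidden layer, and it assumes one particular way of keeping the outputs unchanged, namely copying randomly chosen hidden units and dividing each outgoing weight by the number of copies of its unit. The names `dense`, `forward`, and `widen` are hypothetical.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def dense(weights, bias, x):
    # weights[j] holds the incoming weights of output unit j.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def forward(net, x):
    hidden = relu(dense(net["w1"], net["b1"], x))
    return dense(net["w2"], net["b2"], hidden)

def widen(net, extra_units, rng):
    """Add hidden units without changing the network's outputs (assumed scheme)."""
    n = len(net["w1"])
    # Each additional unit copies the parameters of a randomly selected original unit.
    sources = [rng.randrange(n) for _ in range(extra_units)]
    mapping = list(range(n)) + sources
    counts = [mapping.count(j) for j in range(n)]
    w1 = [list(net["w1"][j]) for j in mapping]
    b1 = [net["b1"][j] for j in mapping]
    # Split each outgoing weight across all copies of its unit so that the
    # next layer receives the same weighted sum as before.
    w2 = [[row[j] / counts[j] for j in mapping] for row in net["w2"]]
    return {"w1": w1, "b1": b1, "w2": w2, "b2": list(net["b2"])}

# Illustrative parameter values, chosen arbitrarily.
small = {
    "w1": [[0.5, -1.0], [1.5, 2.0], [-0.3, 0.8]],
    "b1": [0.1, -0.2, 0.0],
    "w2": [[1.0, -2.0, 0.5]],
    "b2": [0.3],
}
large = widen(small, extra_units=4, rng=random.Random(0))
```

In this sketch, `forward(small, x)` and `forward(large, x)` agree up to floating-point rounding for any input `x`, so training of `large` can start from the behaviour the smaller network already has.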

First claim

Opening claim text (preview).

What is claimed is:

1. A method of generating a larger neural network from a smaller neural network, the method comprising: obtaining data specifying an original neural network, the original neural network being configured to generate neural network outputs from neural network inputs, the original neural network having an original neural network structure comprising a plurality of original neural network units, each original neural network unit having respective parameters, and each of the parameters of each of the original neural network units having a respective original value; generating a larger neural network from the original neural network, the larger neural network having a larger neural network structure comprising: (i) the plurality of original neural network units, and (ii) a plurality of additional neural network units not in the original neural network structure, each additional neural network unit having respective parameters; initializing values of the parameters of the original neural network units and the additional neural network units by setting the values of the parameters of the original neural network units and the additional neural network units to values that result in the larger neural network generating, for any particular neural network input, the same neural network output for the particular neural network input as would be generated by the original neural network by processing the particular neural network input in accordance with the original parameter values for the original neural network units; and training the larger neural network to determine trained values of the parameters of the original neural network units and the additional neural network units from the initialized values.

2. The method of claim 1, further comprising: training the original neural network to determine the original values of the parameters of the original neural network.

3. The method of claim 2, wherein the original neural network structure comprises a first original neural network layer having a first number of original units, and wherein generating the larger neural network comprises: adding a plurality of additional neural network units to the first original neural network layer to generate a larger neural network layer.

4. The method of claim 3, wherein initializing values of the parameters of the original neural network units and the additional neural network units so that the larger neural network generates the same neural network outputs from the same neural network inputs as the original neural network comprises: initializing the values of the parameters of the original neural network units in the larger neural network layer to the respective original values for the parameters; and for each additional neural network unit in the larger neural network layer: selecting an original neural network unit in the original neural network layer, and initializing the values of the parameters of the additional neural network unit to be the same as the respective original values for the selected original neural network unit.

5. The method of claim 4, wherein selecting an original neural network unit in the larger neural network layer comprises: randomly selecting an original neural network unit from the original neural network units in the original neural network layer.

6. The method of claim 4, wherein: in the original neural network structure, a second original neural network layer is configured to receive as input outputs generated by the first original neural network layer; in the larger neural network structure, the second original neural network layer is configured to receive as input outputs generated by the larger neural network layer; and initializing values of the parameters of the original neural network units and the additional neural network units so that the larger neural network generates the same neural network outputs from the same neural network inputs as the original neural network comprises: initializing the values of the parameters of the original neural network units in the second original neural network layer so that, for a given neural network input, the second neural network layer generates the same output in both the original neural network structure and the larger neural network structure.

7. The method of claim 6, wherein the original neural network structure comprises a third original neural network layer configured to receive a third original layer input and generate a third original layer output from the third layer input, and wherein generating the larger neural network comprises: replacing the third original neural network layer with a first additional neural network layer having additional neural network units and a second additional neural network layer having additional neural network units, wherein: the first additional neural network layer is configured to receive the third original layer input and generate a first additional layer output from the third original layer input, and the second additional neural network layer is configured to receive the first additional layer output and generate a second additional layer output from the first additional layer output.

8. The method of claim 7, wherein initializing values of the parameters of the original neural network units and the additional neural network units so that the larger neural network generates the same neural network outputs from the same neural network inputs as the original neural network comprises: initializing the values of the parameters of the additional neural network units in the first additional neural network layer and in the second additional neural network layer so that, for the same neural network input, the second additional layer output is the same as the third original layer output.

9. The method of claim 7, wherein initializing values of the parameters of the original neural network units and the additional neural network units so that the larger neural network generates the same neural network outputs from the same neural network inputs as the original neural network comprises: initializing the values of the parameters of the additional neural network units in the first additional neural network layer using the respective original values for the parameters of the original neural network units in the third original neural network layer.

10. A system comprising one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining data specifying an original neural network, the original neural network being configured to generate neural network outputs from neural network inputs, the original neural network having an original neural network structure comprising a plurality of original neural network units, each original neural network unit having respective parameters, and each of the parameters of each of the original neural network units having a respective original value; generating a larger neural network from the original neural network, the larger neural network having a larger neural network structure comprising: (i) the plurality of original neural network units, and (ii) a plurality of additional neural network units not in the original neural network structure, each additional neural network unit having respective parameters; initializing values of the parameters of the original neural network units and the additional neural network units by setting the values of the parameters of the original neural network units and the additional neural network units to values that result in the larger neural network generat
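Claims 7 to 9 describe replacing one layer with two layers whose combined output matches the original layer. A minimal sketch of one way this could be done follows. It is an illustration under stated assumptions, not the patent's own procedure: the layer is assumed to be fully connected with a ReLU activation, the first new layer reuses the original weights (in the spirit of claim 9), and the second new layer is assumed to start as an identity matrix with zero bias. The names `layer` and `deepen` are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def layer(weights, bias, x):
    # Fully connected layer followed by ReLU; weights[j] feeds output unit j.
    return relu([sum(w * xi for w, xi in zip(row, x)) + b
                 for row, b in zip(weights, bias)])

def deepen(weights, bias):
    """Split one ReLU layer into two layers with the same overall output (assumed scheme)."""
    n = len(weights)
    # First new layer: a copy of the original layer's parameter values.
    first = ([list(row) for row in weights], list(bias))
    # Second new layer: identity weights and zero bias. ReLU outputs are
    # non-negative, so applying ReLU to them again leaves them unchanged.
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    second = (identity, [0.0] * n)
    return first, second

# Illustrative parameter values, chosen arbitrarily.
w = [[0.4, -0.7, 1.2], [-1.1, 0.3, 0.5]]
b = [0.05, -0.1]
first, second = deepen(w, b)

def original(x):
    return layer(w, b, x)

def deeper(x):
    return layer(second[0], second[1], layer(first[0], first[1], x))
```

Here `deeper(x)` returns the same values as `original(x)`. The identity trick depends on the assumed activation: it works for ReLU because applying it twice gives the same result as applying it once, and it would not carry over unchanged to an activation such as the sigmoid.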

Assignees

Inventors

Classifications

  • Feedforward networks · CPC title

  • Supervised learning · CPC title

  • G06N3/082 · Primary

    modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10699191B2 cover?
This specification describes methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a larger neural network from a smaller neural network. One of the described methods includes obtaining data specifying an original neural network and generating a larger neural network from the original neural network. The larger neural network has a larg…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/082. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 30, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).