Evolution of architectures for multitask neural networks

US11030529B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11030529-B2
Application number  US-201816219286-A
Country             US
Kind code           B2
Filing date         Dec 13, 2018
Priority date       Dec 13, 2017
Publication date    Jun 8, 2021
Grant date          Jun 8, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Evolution and coevolution of neural networks via multitask learning is described. The foundation is (1) the original soft ordering, which uses a fixed architecture for the modules and a fixed routing (i.e. network topology) that is shared among all tasks. This architecture is then extended in two ways with CoDeepNEAT: (2) by coevolving the module architectures (CM), and (3) by coevolving both the module architectures and a single shared routing for all tasks using (CMSR). An alternative evolutionary process (4) keeps the module architecture fixed, but evolves a separate routing for each task during training (CTR). Finally, approaches (2) and (4) are combined into (5), where both modules and task routing are coevolved (CMTR).
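
The five approaches named in the abstract differ only in two respects: whether the module architectures are evolved, and how routing is handled. As a reading aid, they can be tabulated in a few lines of Python. This is a minimal sketch: the `Variant` class, its field names, and the routing labels are hypothetical names chosen for illustration, and the entry for CM assumes that routing stays fixed and shared, which the abstract implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record describing one approach from the abstract."""
    name: str
    evolves_modules: bool   # are module architectures evolved?
    routing: str            # "fixed shared", "evolved shared", or "evolved per task"


# Order follows the numbering (1) to (5) in the abstract.
VARIANTS = [
    Variant("soft ordering", False, "fixed shared"),
    Variant("CM", True, "fixed shared"),        # assumption: routing unchanged from (1)
    Variant("CMSR", True, "evolved shared"),
    Variant("CTR", False, "evolved per task"),
    Variant("CMTR", True, "evolved per task"),
]

BY_NAME = {v.name: v for v in VARIANTS}

for v in VARIANTS:
    modules = "evolved" if v.evolves_modules else "fixed"
    print(f"{v.name:14s} modules={modules:8s} routing={v.routing}")
```

Read this way, CMTR takes its module handling from CM and its routing handling from CTR, which matches the abstract's statement that approaches (2) and (4) are combined into (5).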

First claim

Opening claim text (preview).

The invention claimed is:

1. A processor implemented method for evolving task-specific topologies in a multitask architecture comprising: establishing a set of shared modules which are shared among each task-specific topology; initializing the shared modules $\{M_k\}_{k=1}^{K}$ with random weights; initializing a champion individual module routing scheme for each task (t), wherein the ith individual for the tth task is represented by a tuple $(E_{ti}, G_{ti}, D_{ti})$, and further wherein $E_{ti}$ is an encoder, $G_{ti}$ is a DAG, which specifies the individual module routing scheme, and $D_{ti}$ is a decoder, with $E_{ti}$ and $D_{ti}$ initialized with random weights; for each champion individual $(E_{ti}, G_{ti}, D_{ti})$, generating a challenger $(E_{t2}, G_{t2}, D_{t2})$ by mutating the tth champion in accordance with a predetermined mutation subprocess; jointly training each champion and challenger for M iterations on a training set of data; evaluating each champion and challenger on a validation set of data to determine an accuracy fitness for each individual champion and challenger for its predetermined task; if a challenger has higher accuracy fitness than a corresponding champion, then the champion is replaced wherein $(E_{ti}, G_{ti}, D_{ti}) = (E_{t2}, G_{t2}, D_{t2})$; calculating an average accuracy fitness across all champions for tasks in the multitask architecture; and checkpointing the shared modules when the average accuracy is best achieved.

2. The process according to claim 1, wherein the predetermined mutation subprocess of includes: (i) start as a copy of the champion, including learned weights, wherein $(E_{t2}, G_{t2}, D_{t2}) := (E_{ti}, G_{ti}, D_{ti})$; (ii) randomly select a pair of nodes $(u, v)$ from $G_{t2}$ such that $v$ is an ancestor of $u$; (iii) randomly select a module $M_k$ from the shared modules; (iv) add a new node $w$ to $G_{t2}$ with $M_k$ as its function; (v) add new edges $(u, w)$ and $(w, v)$ to $G_{t2}$; (vi) set the scalar weight of $(w, v)$ such that its value after softmax is some $\alpha \in (0, 1)$.

3. The process according to claim 1, wherein the training set of data and the validation set of data are disjointed.

4. The process according to claim 1, wherein $G_{ti}$ is initialized in accordance with a graph initialization policy.

5. The process according to claim 1, wherein a model for an individual is then given by $y_t = (D_{ti} \circ R(G_{ti}, \{M_k\}_{k=1}^{K}) \circ E_{ti})(x_t)$, where $R$ indicates application of the shared modules $M_k$ based on the DAG $G_{ti}$.

6. The process according to claim 5, wherein $E_{ti}$ and $D_{ti}$ are selected from a grouping consisting of neural network functions that are compatible with the set of shared modules.

7. The process according to claim 6, wherein each $E_{ti}$ is an identity transformation layer, and $D_{ti}$, is a fully connected classification layer.

8. The process according to claim 1, wherein $G_{ti}$ is a DAG whose single source node represents the input layer for that task (t), and whose single sink node represents the output layer and further wherein all other nodes either point to a module $M_k$ to be applied at that location, or to a parameterless adapter layer for ensuring adjacent modules are technically compatible.

9. A processor implemented method for evolving task-specific topologies and shared modules in a multitask architecture comprising: initializing a population of modules and randomly selecting modules (m) from each species in the population and grouping selected modules from each species (k) together into sets of modules $M_k$; providing the sets of modules $M_k$ to a task-specific routing evolution subprocess, wherein the subprocess: establishes a set of shared modules which are shared among each task-specific topology; initializes a champion individual module routing scheme for each task (t), wherein the ith individual for the tth task is represented by a tuple $(E_{ti}, G_{ti}, D_{ti})$, and further wherein $E_{ti}$ is an encoder, $G_{ti}$ is a DAG, which specifies the individual module routing scheme, and $D_{ti}$ is a decoder, with $E_{ti}$ and $D_{ti}$ initialized with random weights; for each champion individual $(E_{ti}, G_{ti}, D_{ti})$, generating a challenger $(E_{t2}, G_{t2}, D_{t2})$ by mutating the tth champion in accordance with a predetermined mutation subprocess; jointly training each champion and challenger for M iterations on a training set of data; evaluating each champion and challenger on a validation set of data to determine an accuracy fitness for each individual champion and challenger for its predetermined task; if a challenger has higher accuracy fitness than a corresponding champion, then the champion is replaced wherein $(E_{ti}, G_{ti}, D_{ti}) = (E_{t2}, G_{t2}, D_{t2})$; calculating an average accuracy fitness across all champions for tasks in the multitask architecture; checkpointing the shared modules when the average accuracy fitness is best achieved; attributing the best achieved average accuracy fitness determined from the task-specific routing evolution subprocess to each module (m) as part of a module evolution subprocess which further includes applying evolutionary operators to evolve modules (m).

10. The process according to claim 9, wherein the predetermined mutation subprocess of the task-specific routing evolution subprocess includes: (i) start as a copy of the champion, including learned weights, wherein $(E_{t2}, G_{t2}, D_{t2}) := (E_{ti}, G_{ti}, D_{ti})$; (ii) randomly select a pair of nodes $(u, v)$ from $G_{t2}$ such that $v$ is an ancestor of $u$; (iii) randomly select a module $M_k$ from the shared modules; (iv) add a new node $w$ to $G_{t2}$ with $M_k$ as its function; (v) add new edges $(u, w)$ and $(w, v)$ to $G_{t2}$; (vi) set the scalar weight of $(w, v)$ such that its value after softmax is some $\alpha \in (0, 1)$.
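
The champion/challenger loop of claim 1 and the mutation steps (i) to (vi) of claims 2 and 10 can be sketched in plain Python. This is an illustrative toy under stated assumptions, not the patented implementation: `Routing`, `softmax_share`, `mutate`, `evolve`, and the `fitness` callable are hypothetical names, there are no real encoders, decoders, or module weights, and `fitness` stands in for joint training followed by validation. The sketch reads an edge (a, b) as data flowing from a to b and therefore picks v downstream of u so that the graph stays acyclic; that direction convention is an assumption of this sketch. It also assumes the softmax in step (vi) is taken over the edges entering v, in which case the new weight is s = ln(α / (1 - α)) + ln Σ exp(s_j), summing over the edges already entering v.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Routing:
    """Hypothetical stand-in for the DAG G_ti: node -> module index, edge -> scalar."""
    modules: dict = field(default_factory=lambda: {"in": None, "out": None})
    weights: dict = field(default_factory=lambda: {("in", "out"): 0.0})

    def copy(self):
        return Routing(dict(self.modules), dict(self.weights))

    def downstream(self, u):
        """All nodes reachable from u by following edges (a, b) from a to b."""
        seen, stack = set(), [u]
        while stack:
            a = stack.pop()
            for (x, y) in self.weights:
                if x == a and y not in seen:
                    seen.add(y)
                    stack.append(y)
        return seen


def softmax_share(routing, edge):
    """Softmax value of `edge` among all edges entering the same node."""
    total = sum(math.exp(s) for (_, y), s in routing.weights.items() if y == edge[1])
    return math.exp(routing.weights[edge]) / total


def mutate(champion, num_modules, alpha=0.1, rng=random):
    challenger = champion.copy()                                # (i) copy the champion
    pairs = [(u, v) for u in challenger.modules
             for v in sorted(challenger.downstream(u))]
    u, v = rng.choice(pairs)                                    # (ii) pick a node pair
    k = rng.randrange(num_modules)                              # (iii) pick a shared module
    w = f"n{len(challenger.modules)}"                           # (iv) new node running M_k
    challenger.modules[w] = k
    incoming = sum(math.exp(s) for (_, y), s in challenger.weights.items() if y == v)
    challenger.weights[(u, w)] = 0.0                            # (v) edges (u, w), (w, v)
    challenger.weights[(w, v)] = (                              # (vi) softmax value = alpha
        math.log(alpha / (1 - alpha)) + math.log(incoming))
    return challenger, (w, v)


def evolve(tasks, fitness, num_modules, generations=5, rng=random):
    champions = {t: Routing() for t in tasks}
    best_avg, best_generation = float("-inf"), None
    for g in range(generations):
        challengers = {t: mutate(champions[t], num_modules, rng=rng)[0] for t in tasks}
        # Joint training for M iterations would happen here in a real system.
        for t in tasks:
            if fitness(t, challengers[t]) > fitness(t, champions[t]):
                champions[t] = challengers[t]                   # challenger wins
        avg = sum(fitness(t, champions[t]) for t in tasks) / len(tasks)
        if avg > best_avg:                                      # checkpoint point
            best_avg, best_generation = avg, g
    return champions, best_avg, best_generation


# Toy fitness (an assumption): prefers routings with exactly four nodes.
toy_fitness = lambda task, routing: -abs(len(routing.modules) - 4)
champions, best_avg, best_generation = evolve(
    ["task_a", "task_b"], toy_fitness, num_modules=3, rng=random.Random(0))
```

Because the challenger starts as a copy of the champion and the new path enters v with a small softmax share, a mutation initially changes the routing's behaviour only slightly, which fits the claim's requirement that learned weights are carried over.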

Assignees

Cognizant Tech Solutions U S Corporation

Inventors

Classifications

  • Activation functions

  • Combinations of networks

  • Recurrent networks, e.g. Hopfield networks

  • Architecture, e.g. interconnection topology

  • Modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11030529B2 cover?
It describes the evolution and coevolution of neural networks via multitask learning. Starting from the original soft ordering, which uses fixed module architectures and a fixed routing shared among all tasks, it coevolves the module architectures (CM), the module architectures together with a single shared routing (CMSR), a separate routing for each task with fixed modules (CTR), and both modules and task routing together (CMTR).
Who is the assignee on this patent?
Cognizant Tech Solutions U S Corporation
What technology area does this patent fall under?
Primary CPC classification G06N3/086. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 8, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).