US11775812B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11775812-B2 |
| Application number | US-201916379704-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 9, 2019 |
| Priority date | Nov 30, 2018 |
| Publication date | Oct 3, 2023 |
| Grant date | Oct 3, 2023 |
Official abstract text for this publication.
Methods, devices, and computer-readable media for multi-task based lifelong learning. A method for lifelong learning includes identifying a new task for a machine learning model to perform, where the machine learning model is trained to perform an existing task. The method includes adaptively training a network architecture of the machine learning model to generate an adapted machine learning model based on incorporating inherent correlations between the new task and the existing task. The method further includes using the adapted machine learning model to perform both the existing task and the new task.
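The adaptive training step is spelled out in claim 1 as generating several expanded "child" architectures and picking the best one. The following is a minimal sketch of that expand-and-select idea, not the patented implementation. It assumes a hypothetical encoding of an architecture as a list of layer widths, and the names `deeper`, `wider`, `generate_children`, and `select_best` are illustrative only. The `score` callable is a stand-in: a real system would presumably train each child and evaluate it on both the existing task and the new task.

```python
from typing import Callable, List

# Hypothetical encoding (an assumption for illustration):
# an architecture is a list with one integer width per layer.
Architecture = List[int]


def deeper(arch: Architecture, new_width: int) -> Architecture:
    """Return a copy of arch with one new layer appended (adds depth)."""
    return arch + [new_width]


def wider(arch: Architecture, layer: int, extra_units: int) -> Architecture:
    """Return a copy of arch with one existing layer expanded (adds width)."""
    child = list(arch)
    child[layer] += extra_units
    return child


def generate_children(
    arch: Architecture, new_width: int = 16, extra_units: int = 8
) -> List[Architecture]:
    """Build candidate children, each strictly larger than the parent."""
    children = [deeper(arch, new_width)]
    children += [wider(arch, i, extra_units) for i in range(len(arch))]
    return children


def select_best(
    children: List[Architecture], score: Callable[[Architecture], float]
) -> Architecture:
    """Pick the child with the highest score (score is a placeholder metric)."""
    return max(children, key=score)


if __name__ == "__main__":
    parent = [32, 64]
    candidates = generate_children(parent)
    # Toy score, purely illustrative: prefer a total width close to 110 units.
    best = select_best(candidates, lambda a: -abs(sum(a) - 110))
    print(candidates, best)
```

Every candidate here is produced by exactly one expansion, which keeps the example small; a search procedure could just as well apply several expansions per child.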
Opening claim text (preview).
What is claimed is:
1. A method for lifelong learning, the method comprising: identifying a new task for a machine learning model to perform, the machine learning model trained to perform an existing task; adaptively training a network architecture of the machine learning model to generate an adapted machine learning model based on incorporating inherent correlations between the new task and the existing task, wherein adaptively training the network architecture includes: generating a plurality of child network architectures, wherein each of the plurality of child network architectures is expanded from a size of the network architecture by at least one of: adding one or more new layers to the network architecture or expanding one or more existing layers of the network architecture; and determining an optimal child network architecture from the plurality of child network architectures for the adapted machine learning model; and using the adapted machine learning model to perform both the existing task and the new task.
2. The method of claim 1, wherein, for each of the plurality of child network architectures, the size of the network architecture is expanded using AutoML.
3. The method of claim 2, wherein expanding the size of the network architecture using AutoML comprises at least one of: using a deeper operator to add the one or more new layers to the network architecture; and using a wider operator to expand the one or more existing layers of the network architecture.
4. The method of claim 1, further comprising: identifying the one or more new layers as at least one task-specific layer for the new task.
5. The method of claim 1, further comprising: compressing the optimal child network architecture to reduce a size of the optimal child network architecture; wherein the size of the optimal child network architecture is not compressed smaller than the size of the network architecture.
6. The method of claim 1, wherein the machine learning model is a compressed model.
7. The method of claim 1, wherein adaptively training the network architecture further comprises: training the machine learning model to perform the new task using training data for the new task; and compressing the optimal child network architecture of the trained machine learning model using the training data for the new task.
8. An electronic device for lifelong learning, the electronic device comprising: a memory configured to store a machine learning model trained to perform an existing task; and a processor operably connected to the memory, the processor configured to: identify a new task for the machine learning model to perform; adaptively train a network architecture of the machine learning model to generate an adapted machine learning model based on incorporating inherent correlations between the new task and the existing task, wherein, to adaptively train the network architecture, the processor is configured to: generate a plurality of child network architectures, wherein each of the plurality of child network architectures is expanded from a size of the network architecture by at least one of: adding one or more new layers to the network architecture or expanding one or more existing layers of the network architecture; and determine an optimal child network architecture from the plurality of child network architectures for the adapted machine learning model; and use the adapted machine learning model to perform both the existing task and the new task.
9. The electronic device of claim 8, wherein, for each of the plurality of child network architectures, the processor is configured to expand the size of the network architecture using AutoML.
10. The electronic device of claim 9, wherein, to expand the size of the network architecture using AutoML, the processor is configured to at least one of: use a deeper operator to add the one or more new layers to the network architecture; and use a wider operator to expand the one or more existing layers of the network architecture.
11. The electronic device of claim 8, wherein the processor is further configured to identify the one or more new layers as at least one task-specific layer for the new task.
12. The electronic device of claim 8, wherein: the processor is further configured to compress the optimal child network architecture to reduce a size of the optimal child network architecture; and the size of the optimal child network architecture is not compressed smaller than the size of the network architecture.
13. The electronic device of claim 8, wherein the machine learning model is a compressed model.
14. The electronic device of claim 8, wherein, to adaptively train the network architecture, the processor is further configured to: train the machine learning model to perform the new task using training data for the new task; and compress the optimal child network architecture of the trained machine learning model using the training data for the new task.
15. A non-transitory, computer-readable medium comprising program code for lifelong learning that, when executed by a processor of an electronic device, causes the electronic device to: identify a new task for a machine learning model to perform, the machine learning model trained to perform an existing task; adaptively train a network architecture of the machine learning model to generate an adapted machine learning model based on incorporating inherent correlations between the new task and the existing task; and use the adapted machine learning model to perform both the existing task and the new task; wherein the program code that, when executed by the processor, causes the electronic device to adaptively train the network architecture comprises program code that, when executed by the processor, causes the electronic device to: generate a plurality of child network architectures, wherein each of the plurality of child network architectures is expanded from a size of the network architecture by at least one of: adding one or more new layers to the network architecture or expanding one or more existing layers of the network architecture; and determine an optimal child network architecture from the plurality of child network architectures for the adapted machine learning model.
16. The non-transitory, computer-readable medium of claim 15, wherein the program code that, when executed by the processor, causes the electronic device to generate the plurality of child network architectures comprises program code that, when executed by the processor, causes the electronic device to, for each of the plurality of child network architectures, expand the size of the network architecture using AutoML.
17. The non-transitory, computer-readable medium of claim 16, wherein the program code that, when executed by the processor, causes the electronic device to expand the size of the network architecture using AutoML comprises program code that, when executed by the processor, causes the electronic device to at least one of: use a deeper operator to add the one or more new layers to the network architecture; and use a wider operator to expand the one or more existing layers of the network architecture.
18. The non-transitory, computer-readable medium of claim 15, further comprising program code that, when executed by the processor, causes the electronic device to identify the one or more new layers as at least one task-specific layer for the new task.
19. The non-transitory, computer-readable medium of claim 15, fu
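Claims 5 and 12 add a constraint on compression: the selected child may be shrunk, but not below the size of the original network architecture. The sketch below illustrates only that size floor, under stated assumptions. It reuses a hypothetical list-of-layer-widths encoding, treats "size" as per-layer width, and uses an illustrative `keep_ratio` parameter. The excerpt does not say how compression is actually performed, so the uniform shrink here is a placeholder rather than the claimed technique.

```python
from typing import List


def compress(child: List[int], parent: List[int], keep_ratio: float) -> List[int]:
    """Shrink each layer of child by keep_ratio, never below the parent's size.

    Assumptions (not from the source): architectures are lists of layer
    widths, and the floor is enforced per layer. Layers that exist only in
    the child (newly added ones) keep at least one unit.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if len(child) < len(parent):
        raise ValueError("child must have at least as many layers as parent")

    compressed = []
    for i, width in enumerate(child):
        # Floor: the parent's width for shared layers, 1 unit for new layers.
        floor = parent[i] if i < len(parent) else 1
        compressed.append(max(floor, int(width * keep_ratio)))
    return compressed


if __name__ == "__main__":
    parent = [32, 64]
    child = [32, 72, 16]  # one layer widened, one layer added
    print(compress(child, parent, keep_ratio=0.5))
```

Enforcing the floor layer by layer is one simple way to guarantee the compressed network is never smaller than the original; a floor on total parameter count would be an equally plausible reading.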