Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US2023095268A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023095268-A1 |
| Application number | US-202217829403-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 1, 2022 |
| Priority date | Sep 24, 2021 |
| Publication date | Mar 30, 2023 |
| Grant date | — |
Official abstract text for this publication.
A non-transitory computer-readable storage medium storing a machine learning program that causes at least one computer to execute a process, the process includes acquiring a first training rate of a first layer that is selected to stop training among layers included in a machine learning model during training of the machine learning model; setting a first time period to stop training the first layer based on the training rate; and training the first layer with controlling the training rate up to the first time period.
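The abstract's three steps (acquire the rate of the layer selected to stop, set a time period from that rate, keep training with a controlled rate until the period ends) can be sketched as follows. This is a minimal illustration, not the applicant's implementation. It reads "training rate" as a learning rate, and it assumes a proportional mapping from rate to period and a linear decay to zero; neither is stated in the text above. The names `stop_period`, `controlled_rate`, and `wind_down_schedule` are hypothetical.

```python
import math


def stop_period(lr: float, base_lr: float, max_iters: int) -> int:
    """Assumed mapping: a layer whose current rate is closer to the
    base rate gets a longer wind-down period (in iterations)."""
    if base_lr <= 0 or max_iters < 1:
        raise ValueError("base_lr must be positive and max_iters at least 1")
    ratio = min(max(lr / base_lr, 0.0), 1.0)  # clamp to [0, 1]
    return max(1, math.ceil(max_iters * ratio))


def controlled_rate(lr_at_selection: float, step: int, period: int) -> float:
    """Assumed control rule: decay the rate linearly to zero over `period`
    iterations, counted from the moment the layer was selected to stop."""
    if step >= period:
        return 0.0  # the period has elapsed, so the layer is no longer updated
    return lr_at_selection * (1.0 - step / period)


def wind_down_schedule(lr: float, base_lr: float, max_iters: int) -> list[float]:
    """Rates applied to the selected layer from selection until it stops."""
    period = stop_period(lr, base_lr, max_iters)
    return [controlled_rate(lr, step, period) for step in range(period + 1)]


if __name__ == "__main__":
    schedule = wind_down_schedule(lr=0.5, base_lr=1.0, max_iters=100)
    print(len(schedule), schedule[0], schedule[-1])
```

In a real training loop the value returned by `controlled_rate` would be written into the optimizer setting for the selected layer on each iteration, while the remaining layers keep their normal schedule.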
Opening claim text (preview).
What is claimed is:
1. A non-transitory computer-readable storage medium storing a machine learning program that causes at least one computer to execute a process, the process comprising: acquiring a first training rate of a first layer that is selected to stop training among layers included in a machine learning model during training of the machine learning model; setting a first time period to stop training the first layer based on the training rate; and training the first layer with controlling the training rate up to the first time period.
2. The non-transitory computer-readable storage medium according to claim 1, wherein the setting includes setting the first time period based on the first training rate of a previous iteration of a processing iteration.
3. The non-transitory computer-readable storage medium according to claim 1, wherein the setting includes setting the first time period based on a change of the first training rate during training of the machine learning model.
4. The non-transitory computer-readable storage medium according to claim 1, wherein the acquiring includes acquiring a second training rate of a block that is selected to stop training among blocks each of which is a collection of a plurality of layers of the layers, the setting includes setting a second time period to stop training the plurality of layers based on the second training rate, and the training includes training the plurality of layers with controlling the second training rate up to the second time period.
5. The non-transitory computer-readable storage medium according to claim 4, wherein the acquiring includes acquiring a plurality of third training rates of each of the plurality layers included in the block, and the setting includes setting an average of a plurality of third time periods set based on the plurality of third training rates as the second time period.
6. The non-transitory computer-readable storage medium according to claim 4, wherein the acquiring includes acquiring a plurality of third training rates of each of the plurality layers included in the block, and the setting includes setting the second time period based on an average of the plurality of third training rates.
7. A machine learning method for a computer to execute a process comprising: acquiring a first training rate of a first layer that is selected to stop training among layers included in a machine learning model during training of the machine learning model; setting a first time period to stop training the first layer based on the training rate; and training the first layer with controlling the training rate up to the first time period.
8. The machine learning method according to claim 7, wherein the setting includes setting the first time period based on the first training rate of a previous iteration of a processing iteration.
9. The machine learning method according to claim 7, wherein the setting includes setting the first time period based on a change of the first training rate during training of the machine learning model.
10. The machine learning method according to claim 7, wherein the acquiring includes acquiring a second training rate of a block that is selected to stop training among blocks each of which is a collection of a plurality of layers of the layers, the setting includes setting a second time period to stop training the plurality of layers based on the second training rate, and the training includes training the plurality of layers with controlling the second training rate up to the second time period.
11. The machine learning method according to claim 10, wherein the acquiring includes acquiring a plurality of third training rates of each of the plurality layers included in the block, and the setting includes setting an average of a plurality of third time periods set based on the plurality of third training rates as the second time period.
12. The machine learning method according to claim 10, wherein the acquiring includes acquiring a plurality of third training rates of each of the plurality layers included in the block, and the setting includes setting the second time period based on an average of the plurality of third training rates.
13. An information processing apparatus comprising: one or more memories; and one or more processors coupled to the one or more memories and the one or more processors configured to: acquire a first training rate of a first layer that is selected to stop training among layers included in a machine learning model during training of the machine learning model, set a first time period to stop training the first layer based on the training rate, and train the first layer with controlling the training rate up to the first time period.
14. The information processing apparatus according to claim 13, wherein the one or more processors are further configured to set the first time period based on the first training rate of a previous iteration of a processing iteration.
15. The information processing apparatus according to claim 13, wherein the one or more processors are further configured to set the first time period based on a change of the first training rate during training of the machine learning model.
16. The information processing apparatus according to claim 13, wherein the one or more processors are further configured to: acquire a second training rate of a block that is selected to stop training among blocks each of which is a collection of a plurality of layers of the layers, set a second time period to stop training the plurality of layers based on the second training rate, and train the plurality of layers with controlling the second training rate up to the second time period.
17. The information processing apparatus according to claim 16, wherein the one or more processors are further configured to: acquire a plurality of third training rates of each of the plurality layers included in the block, and set an average of a plurality of third time periods set based on the plurality of third training rates as the second time period.
18. The information processing apparatus according to claim 16, wherein the one or more processors are further configured to: acquire a plurality of third training rates of each of the plurality layers included in the block, and set the second time period based on an average of the plurality of third training rates.
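The block-level dependent claims describe two ways to obtain one time period for a block of layers: average the per-layer periods (claims 5, 11, 17), or derive the period from the average of the per-layer rates (claims 6, 12, 18). The sketch below contrasts the two. The mapping from rate to period, `sqrt_period`, is a hypothetical nonlinear choice made only so that the two strategies give visibly different answers; the claims do not specify any mapping, and all function names here are illustrative.

```python
import math
from statistics import mean
from typing import Callable, Sequence


def sqrt_period(lr: float, base_lr: float = 1.0, max_iters: int = 100) -> float:
    """Hypothetical rate-to-period mapping (not from the claims):
    the period grows with the square root of the normalized rate."""
    ratio = min(max(lr / base_lr, 0.0), 1.0)  # clamp to [0, 1]
    return max_iters * math.sqrt(ratio)


def block_period_mean_of_periods(
    rates: Sequence[float], to_period: Callable[[float], float]
) -> int:
    """Claims 5/11/17 style: map each layer's rate to a period first,
    then average those per-layer periods."""
    return math.ceil(mean([to_period(r) for r in rates]))


def block_period_from_mean_rate(
    rates: Sequence[float], to_period: Callable[[float], float]
) -> int:
    """Claims 6/12/18 style: average the per-layer rates first,
    then map the averaged rate to a single period."""
    return math.ceil(to_period(mean(rates)))


if __name__ == "__main__":
    layer_rates = [0.04, 0.16, 0.64]  # illustrative per-layer rates
    print(block_period_mean_of_periods(layer_rates, sqrt_period))
    print(block_period_from_mean_rate(layer_rates, sqrt_period))
```

With a linear mapping the two strategies agree up to rounding, so the choice between them only matters when the mapping from rate to period is nonlinear, as assumed here.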