Training first and second neural network models

US11657265B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11657265-B2
Application number  US-201816191542-A
Country             US
Kind code           B2
Filing date         Nov 15, 2018
Priority date       Nov 20, 2017
Publication date    May 23, 2023
Grant date          May 23, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described herein are systems and methods for training first and second neural network models. A system comprises a memory comprising instruction data representing a set of instructions and a processor configured to communicate with the memory and to execute the set of instructions. The set of instructions, when executed by the processor, cause the processor to set a weight in the second model based on a corresponding weight in the first model, train the second model on a first dataset, wherein the training comprises updating the weight in the second model, and adjust the corresponding weight in the first model based on the updated weight in the second model.
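
The three steps in the abstract (set, train, adjust) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the toy two-layer linear model, the dict layout, the names `make_model`, `set_weights`, `train` and `adjust`, the learning rate, and the tiny dataset are all hypothetical. The adjustment shown is a straight copy of the hidden weights, which is only one of the variants listed in the claims.

```python
import random

def make_model(n_inputs, seed):
    # Toy two-layer linear model; this dict layout is a hypothetical choice.
    rng = random.Random(seed)
    return {
        "hidden": [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)],
        "output": rng.uniform(0.5, 1.5),
    }

def set_weights(first, second):
    # Step 1: set the second model's hidden weights from the first model.
    second["hidden"] = list(first["hidden"])

def train(model, dataset, lr=0.05, epochs=20):
    # Step 2: stochastic gradient descent on squared error, which updates
    # both the hidden weights and the output weight of the given model.
    for _ in range(epochs):
        for x, y in dataset:
            s = sum(w * xi for w, xi in zip(model["hidden"], x))
            err = model["output"] * s - y
            grad_hidden = [err * model["output"] * xi for xi in x]
            grad_output = err * s
            model["hidden"] = [
                w - lr * g for w, g in zip(model["hidden"], grad_hidden)
            ]
            model["output"] -= lr * grad_output

def adjust(first, second):
    # Step 3: adjust the first model from the updated second model.
    # Assumption: copy the hidden weights and leave the first model's
    # output weight at its existing value.
    first["hidden"] = list(second["hidden"])

first = make_model(2, seed=0)
second = make_model(2, seed=1)
first_dataset = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]

first_output_before = first["output"]
set_weights(first, second)
hidden_before_training = list(second["hidden"])
train(second, first_dataset)
adjust(first, second)
```

After the last call, the first model carries the hidden weights learned by the second model while keeping its own output weight, so knowledge gained on the first dataset flows back into the first model.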

First claim

Opening claim text (preview).

The invention claimed is:

1. A system configured for training first and second neural network models, the system comprising: a memory comprising instruction data representing a set of instructions; a processor configured to communicate with the memory and to execute the set of instructions, wherein the set of instructions, when executed by the processor, cause the processor to: set a weight in the second model based on a corresponding weight in the first model; train the second model on a first dataset, wherein the training comprises updating the weight in the second model; and adjust the corresponding weight in the first model based on the updated weight in the second model by applying an increment to a value of the corresponding weight in the first model based on a difference between the corresponding weight in the first model and the weight in the second model.

2. The system as in claim 1, wherein the weight comprises a weight in one of: an input layer of the second model; and a hidden layer of the second model.

3. The system as in claim 1, wherein causing the processor to adjust the corresponding weight in the first model comprises causing the processor to: copy a value of the weight from the second model to the corresponding weight in the first model.

4. The system as in claim 1, wherein causing the processor to adjust the corresponding weight in the first model further comprises causing the processor to: set a weight in an output layer of the first model to an arbitrary value.

5. The system as in claim 1, wherein causing the processor to adjust the corresponding weight in the first model further comprises causing the processor to: maintain a value of at least one weight in an output layer of the first model at the same value.

6. The system as in claim 1, wherein causing the processor to set a weight in the second model comprises causing the processor to: copy a value of a weight from one of: an input layer of the first model; and a hidden layer of the first model, to a corresponding weight in the second model.

7. The system as in claim 1, wherein causing the processor to set a weight in the second model further comprises causing the processor to: set at least one weight in an output layer of the second model to an arbitrary value.

8. The system as in claim 1, wherein the first model comprises one of: an object detection model; and an object localization model; and wherein the second model comprises the other one of: an object detection model; and an object localization model.

9. The system as in claim 1, wherein the first model comprises one of: a model configured to produce a single output; and a model configured to produce a plurality of outputs; and wherein the second model comprises the other one of: a model configured to produce a single output; and a model configured to produce a plurality of outputs.

10. The system as in claim 1, wherein the set of instructions, when executed by the processor, further cause the processor to: adjust a weight in one of: the first model; and the second model; in response to further training of the other one of: the first model; and the second model.

11. The system as in claim 10, wherein the set of instructions, when executed by the processor, cause the processor to repeat the step of adjusting a weight, until one or more of the following criteria are met: the first model and/or the second model reach a threshold accuracy level; the magnitude of an adjustment falls below a threshold magnitude; said weight in the first model and its corresponding weight in the second model converge towards one another within a predefined threshold; and a loss associated with the first model and/or a loss associated with the second model changes by less than a threshold amount between subsequent adjustments.

12. The system as in claim 1, wherein the first model is trained on a second dataset, the first dataset comprising less data than the second dataset, wherein the size of the second dataset alone is insufficient to train the second model to a predefined accuracy with arbitrarily initiated weights.

13. A computer implemented method of training first and second neural network models, the method comprising: setting a weight in the second model based on a corresponding weight in the first model; training the second model on a first dataset, wherein the training comprises updating the weight in the second model; and adjusting the corresponding weight in the first model based on the updated weight in the second model, wherein the first model is trained on a second dataset, the first dataset comprising less data than the second dataset, wherein a size of the second dataset alone is insufficient to train the second model to a predefined accuracy with arbitrarily initiated weights.

14. A non-transitory computer readable medium comprising computer readable code embodied therein, the computer readable code being configured such that, on execution by a computer or processor, the computer or processor: sets a weight in a second model based on a corresponding weight in a first model; trains the second model on a first dataset, wherein the training comprises updating the weight in the second model; and adjusts the corresponding weight in the first model based on the updated weight in the second model, wherein the first model is trained on a second dataset, the first dataset comprising less data than the second dataset, wherein a size of the second dataset alone is insufficient to train the second model to a predefined accuracy with arbitrarily initiated weights.

15. The system as in claim 1, wherein the processor trains one of the first or second neural network models to detect a presence of a particular object in an image, and the processor trains another of the first or second neural network models to measure a length of a particular type of object in an image.

16. The system as in claim 1, wherein at least one of the first model and second model is a partially trained model.

17. The system as in claim 1, wherein the first model is a partially trained model.

18. The system as in claim 1, wherein both the first and the second models are partially trained models, and a model of the first and the second models is trained more than a second of the first and the second models.

19. The system as in claim 1, wherein the corresponding weight in the first model may be adjusted a percentage of the difference between the corresponding weight in the first model and the weight in the second model.

20. The system as in claim 1, wherein the second dataset comprises medical images annotated with x,y coordinates of a center of a bounding box drawn around tissue of interest.
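
The increment-based adjustment in claims 1 and 19, and the repeat-until-criteria loop in claims 10 and 11, might look roughly like the Python sketch below. Everything named here is an assumption made for illustration: `nudge`, `alternate`, the `fraction` of 0.5, the tolerance, and the stand-in `toy_train` function (which simply pulls weights halfway toward a fixed target instead of running real training) do not come from the patent. Only one stopping criterion is shown, the one where the magnitude of an adjustment falls below a threshold.

```python
def nudge(first_w, second_w, fraction=0.5):
    # Move each first-model weight toward the matching second-model weight
    # by a fraction of their difference. Returns the new first-model
    # weights and the largest increment that was applied.
    steps = [fraction * (s - f) for f, s in zip(first_w, second_w)]
    new_w = [f + d for f, d in zip(first_w, steps)]
    return new_w, max(abs(d) for d in steps)

def alternate(first_w, train_second, max_rounds=100, tol=1e-3):
    # Repeat set -> train -> adjust until the largest adjustment to the
    # first model falls below `tol` (one of the listed criteria).
    rounds = 0
    second_w = list(first_w)
    for rounds in range(1, max_rounds + 1):
        second_w = list(first_w)            # set second from first
        second_w = train_second(second_w)   # further training of second
        first_w, step = nudge(first_w, second_w)
        if step < tol:
            break
    return first_w, second_w, rounds

# Hypothetical stand-in for training: pull weights halfway to a target.
target = [1.0, -2.0]

def toy_train(weights):
    return [w + 0.5 * (t - w) for w, t in zip(weights, target)]

final_first, final_second, rounds_used = alternate([0.0, 0.0], toy_train)
print(rounds_used, final_first)
```

Using a fraction below 1 means the first model absorbs only part of what the second model learned in each round, which is one plausible reading of adjusting by "a percentage of the difference"; setting `fraction=1.0` reduces the increment to a straight copy.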

Assignees

  • Koninklijke Philips Nv

Inventors

Classifications

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Supervised learning · CPC title

  • Transfer learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11657265B2 cover?
Described herein are systems and methods for training first and second neural network models. A system comprises a memory comprising instruction data representing a set of instructions and a processor configured to communicate with the memory and to execute the set of instructions. The set of instructions, when executed by the processor, cause the processor to set a weight in the second model b…
Who is the assignee on this patent?
Koninklijke Philips Nv
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 23, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).