Transfer learning with augmented neural networks

US12585933B2 · US · B2

Patent metadata
Publication number: US-12585933-B2
Application number: US-201916561896-A
Country: US
Kind code: B2
Filing date: Sep 5, 2019
Priority date: Sep 5, 2019
Publication date: Mar 24, 2026
Grant date: Mar 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A pretrained model is selected to operate in an augmented model configuration with a submodel. The submodel is trained using training data corresponding to a second domain, whereas the pretrained model is trained to operate on data of a first domain. The pretrained model is augmented, to form the augmented model configuration, with the submodel, by combining a first feature map being output from a layer in the pretrained model with a second feature map being output from a layer in the submodel. The combining forms a combined feature map. The combined feature map is input into a different layer in the submodel.
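The data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy `pointwise_layer`, the function `augmented_forward`, the random weights, and all shapes are hypothetical, and concatenation along the channel axis is assumed as the combining step (claim 2 names concatenation as one option). Both feature maps are assumed to share the same spatial size.

```python
import numpy as np

rng = np.random.default_rng(0)

def pointwise_layer(x, weights):
    """Toy layer (hypothetical): mixes channels at every pixel, then applies ReLU.
    x has shape (C_in, H, W); weights has shape (C_out, C_in)."""
    return np.maximum(np.einsum("oc,chw->ohw", weights, x), 0.0)

def augmented_forward(x, pretrained_w, sub_w1, sub_w2):
    # First feature map: output of a layer in the pretrained (first-domain) model.
    first_map = pointwise_layer(x, pretrained_w)
    # Second feature map: output of a layer in the submodel (second domain).
    second_map = pointwise_layer(x, sub_w1)
    # Combine the two maps; channel-wise concatenation is assumed here.
    combined = np.concatenate([first_map, second_map], axis=0)
    # Feed the combined map into a different layer of the submodel.
    return pointwise_layer(combined, sub_w2), combined

x = rng.normal(size=(3, 8, 8))            # toy input: 3 channels, 8x8 pixels
pretrained_w = rng.normal(size=(16, 3))   # stands in for frozen pretrained weights
sub_w1 = rng.normal(size=(4, 3))          # first submodel layer
sub_w2 = rng.normal(size=(5, 16 + 4))     # next submodel layer sees 20 channels

out, combined = augmented_forward(x, pretrained_w, sub_w1, sub_w2)
print(combined.shape, out.shape)
```

In this sketch only `sub_w1` and `sub_w2` would be updated during training on second-domain data; `pretrained_w` would stay fixed, which is why the submodel can be much smaller than the pretrained model.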

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: selecting a pretrained neural network model to operate in an augmented neural network model configuration with a neural network submodel, wherein the pretrained neural network model is pretrained to classify a first type of object in a first domain; training, using a processor and a memory, the neural network submodel using training data corresponding to a second domain, wherein the training comprises training the neural network submodel to classify a second type of object in the second domain; and augmenting, to form the augmented neural network model configuration, the pretrained neural network model with the neural network submodel, the augmenting comprising: rearranging a subset of channels from an output of a layer in the pretrained neural network model, the subset including those channels whose channel selection parameters cause those channels to have a greater than a threshold weight, the rearranging further applying a first weight vector to the subset of channels according to a relevance criterion, the subset including the first channel as a highest weighted channel; combining, to form a combined feature map, a first feature map being output from a layer in the pretrained neural network model with a second feature map being output from a layer in the neural network submodel; and inputting the combined feature map into a different layer in the neural network submodel.

2. The method of claim 1, further comprising: concatenating, as a part of the combining, the first feature map and the second feature map.

3. The method of claim 1, further comprising: adjusting a dimensionality of an original feature map, the original feature map being an original output from the layer in the pretrained neural network model, the adjusting resulting in the first feature map used in the combining.

4. The method of claim 3, wherein the adjusting comprises reducing the dimensionality of the original feature map.

5. The method of claim 4, wherein the reducing comprises applying a 1-by-1 convolution to the original feature map.

6. The method of claim 1, wherein the neural network submodel is smaller than the pretrained neural network model according to at least one factor selected from a set of factors comprising (i) a total number of nodes in the neural network submodel and (ii) a total number of layers in the neural network submodel.

7. The method of claim 1, wherein the neural network submodel is smaller than the pretrained neural network model according to a total number of model parameters.

8. A method comprising: selecting a pretrained neural network model to operate in an augmented model configuration with a neural network submodel, wherein the pretrained neural network model is pretrained to classify a first type of object in a first domain; training, using a processor and a memory, the neural network submodel using training data corresponding to a second domain, wherein the training comprises training the neural network submodel to classify a second type of object in the second domain; and augmenting, to form the augmented model configuration, the pretrained neural network model with the neural network submodel, the augmenting comprising: rearranging a subset of channels from an output of a layer in the pretrained neural network model, the subset including those channels whose channel selection parameters cause those channels to have a greater than a threshold weight, the rearranging further applying a first weight vector to the subset of channels according to a relevance criterion, the subset including the first channel as a highest weighted channel; adjusting an attention value of a channel in a first feature map being output from a layer in the pretrained neural network model, wherein the adjusting causes a first feature matrix of the channel in the first feature map to have a greater weight relative to a second feature matrix of a different channel in the first feature map; combining, to form a combined feature map, a first feature matrix of the channel in the first feature map with a second feature map being output from a layer in the neural network submodel; and inputting the combined feature map into a different layer in the neural network submodel.

9. The method of claim 8, further comprising: adjusting a second attention value of a second channel in a second feature map being output from a layer in the neural network submodel, wherein the adjusting the second attention value causes a first feature matrix of the second channel in the second feature map to have a greater weight relative to a second feature matrix of a second different channel in the second feature map, and wherein the combining combines the first feature matrix of the second channel in the second feature map with the first feature matrix of the channel in the first feature map.

10. The method of claim 8, further comprising: applying a scaling factor to a plurality of weighted feature matrices from at least one of the first feature map and the second feature map.

11. The method of claim 8, further comprising: applying a channel-wise multiplexing to the combined feature map prior to inputting the combined feature map.

12. The method of claim 8, wherein the neural network submodel is smaller than the pretrained neural network model according to at least one factor selected from a set of factors comprising (i) a total number of nodes in the neural network submodel and (ii) a total number of layers in the neural network submodel.

13. The method of claim 8, wherein the neural network submodel is smaller than the pretrained neural network model according to a total number of model parameters.

14. A method comprising: selecting a pretrained model to operate in an augmented model configuration with a submodel, wherein the pretrained model is pretrained to classify a first type of object in a first domain; training, using a processor and a memory, the submodel using training data corresponding to a second domain, wherein the training comprises training the submodel to classify a second type of object in the second domain; and augmenting, to form the augmented model configuration, the pretrained model with the submodel, the augmenting comprising: applying a channel selection parameter to a first channel in a first feature map being output from a layer in the pretrained model, wherein the applying causes a first feature matrix of the first channel in the first feature map to have a greater weight relative to a second feature matrix of a different channel in the first feature map; rearranging a subset of channels from the output of the layer in the pretrained model, the subset including those channels whose channel selection parameters cause those channels to have a greater than a threshold weight, the rearranging further applying a first weight vector to the subset of channels according to a relevance criterion, the subset including the first channel as a highest weighted channel; combining, to form a combined feature map, a first feature matrix of the first channel in the first feature map with a second feature map being output from a layer in the submodel; and inputting the combined feature map into a different layer in the submodel.

15. The method of claim 14, further comprising: applying a second channel selection parameter to a second channel in the second feature map, wherein the applying the second channel selection parameter causes a second feature matrix of the second channel in the second feature map to have a greater weight relat…
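Two operations recited in the claims lend themselves to a short illustration: selecting and rearranging channels whose weight exceeds a threshold, and reducing dimensionality with a 1-by-1 convolution (claim 5). The sketch below is illustrative only. The claims do not say how channel selection parameters map to weights, so a sigmoid is assumed; the names `select_and_rearrange` and `reduce_channels`, the threshold of 0.4, and the averaging kernel are all hypothetical.

```python
import numpy as np

def select_and_rearrange(feature_map, selection_params, threshold):
    """Keep channels whose weight exceeds the threshold, ordered highest weight first,
    and scale each kept channel by its weight. feature_map has shape (C, H, W)."""
    # Assumption: a sigmoid turns raw selection parameters into weights in (0, 1).
    weights = 1.0 / (1.0 + np.exp(-selection_params))
    keep = np.flatnonzero(weights > threshold)
    # Rearrange so the highest weighted channel comes first.
    order = keep[np.argsort(-weights[keep], kind="stable")]
    weighted = feature_map[order] * weights[order][:, None, None]
    return weighted, order

def reduce_channels(feature_map, kernel):
    """1-by-1 convolution: a per-pixel linear map across channels.
    kernel has shape (C_out, C_in); feature_map has shape (C_in, H, W)."""
    return np.einsum("oc,chw->ohw", kernel, feature_map)

# Toy pretrained-layer output: 4 channels of 2x2 pixels.
feature_map = np.arange(4 * 2 * 2, dtype=float).reshape(4, 2, 2)
selection_params = np.array([0.0, 2.0, -3.0, 1.0])

selected, order = select_and_rearrange(feature_map, selection_params, threshold=0.4)
print(order.tolist())  # [1, 3, 0]: channel 2 falls below the threshold

# Reduce the 3 selected channels to 2 with a hypothetical averaging kernel.
kernel = np.full((2, 3), 1.0 / 3.0)
reduced = reduce_channels(selected, kernel)
print(reduced.shape)  # (2, 2, 2)
```

A reduced map like `reduced` would play the role of the "first feature map" that is then combined with a submodel feature map, keeping the channel count of the combined map small.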

Assignees

Inventors

Classifications

  • Interfaces, programming languages or software development kits, e.g. for simulating neural networks

  • Architecture, e.g. interconnection topology

  • Transfer learning

  • Supervised learning

  • Convolutional networks [CNN, ConvNet]

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12585933B2 cover?
A pretrained model is selected to operate in an augmented model configuration with a submodel. The submodel is trained using training data corresponding to a second domain, whereas the pretrained model is trained to operate on data of a first domain. The pretrained model is augmented, to form the augmented model configuration, with the submodel, by combining a first feature map being output fro…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 24, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).