Domain adaptation of deep neural networks

US11580405B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11580405-B2
Application number  US-201916727429-A
Country             US
Kind code           B2
Filing date         Dec 26, 2019
Priority date       Dec 26, 2019
Publication date    Feb 14, 2023
Grant date          Feb 14, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed herein are system, method, and computer program product embodiments for adapting machine learning models for use in additional applications. For example, feature extraction models are readily available for use in applications such as image detection. These feature extraction models can be used to label inputs (such as images) in conjunction with other deep neural network models. However, in adapting the feature extraction models to these uses, it becomes problematic to improve the quality of their results on target data sets, as these feature extraction models are large and resistant to retraining. Approaches disclosed herein include a transfer layer for providing fast retraining of machine learning models.
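
The arrangement the abstract describes, a large feature extraction model that resists retraining followed by a small transfer layer that can be retrained quickly, can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: the layer sizes, the random weights, the ReLU activation, and the names `EXTRACTOR_WEIGHTS`, `transfer_weights`, `extract_features`, and `transfer` do not come from the patent.

```python
import random

random.seed(0)

def make_matrix(rows, cols):
    """Random weight matrix as a list of rows (hypothetical initialisation)."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def dense_relu(weights, x):
    """One fully connected layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]

def count_params(weights):
    """Number of individual weights in a matrix."""
    return sum(len(row) for row in weights)

# Stand-in for a large pre-trained feature extractor: its weights stay fixed.
EXTRACTOR_WEIGHTS = make_matrix(64, 16)   # 64 neurons, 1024 frozen weights
# Much smaller transfer layer: the only part that would be retrained.
transfer_weights = make_matrix(4, 64)     # 4 neurons, 256 trainable weights

def extract_features(x):
    """Frozen feature extraction: never updated during adaptation."""
    return dense_relu(EXTRACTOR_WEIGHTS, x)

def transfer(features):
    """Trainable transfer layer producing the transfer output."""
    return dense_relu(transfer_weights, features)

x = [random.uniform(0.0, 1.0) for _ in range(16)]   # toy input
transfer_output = transfer(extract_features(x))

frozen = count_params(EXTRACTOR_WEIGHTS)
trainable = count_params(transfer_weights)
print(len(transfer_output), frozen, trainable)
```

In this toy setup only 256 of the 1280 weights would be touched during retraining, which is the kind of saving a small transfer layer is meant to provide. A real feature extractor would be orders of magnitude larger.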

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method, comprising: extracting, by one or more computing devices, features from an input by a feature extractor; processing, by the one or more computing devices, the features by transfer layers configured as a deep neural network to produce a transfer output; classifying, by the one or more computing devices, the input with a label by a label predictor implemented as a deep neural network, based on the transfer output; classifying, by the one or more computing devices, the input as either a labeled source input or an unlabeled target input by a domain classifier; and back propagating, by the one or more computing devices, a loss at the domain classifier to the domain classifier and the transfer layers.

2. The computer implemented method of claim 1, wherein back propagating the loss at the domain classifier to the transfer layers is configured to minimize reliance on labeled source inputs by the transfer layers over successive inputs.

3. The computer implemented method of claim 1, wherein back propagating the loss at the domain classifier further comprises: back propagating, by the one or more computing devices, the loss at the domain classifier to the domain classifier by adjusting a weight of the domain classifier to reduce the loss; and back propagating, by the one or more computing devices, the loss at the domain classifier to the transfer layers by adjusting a weight of the transfer layers to increase the loss.

4. The computer implemented method of claim 1, further comprising: back propagating, by the one or more computing devices, a loss at the label predictor to the transfer layers.

5. The computer implemented method of claim 1, wherein the feature extractor is configured as a deep neural network.

6. The computer implemented method of claim 5, wherein neurons of the feature extractor are implemented in a fixed configuration.

7. The computer implemented method of claim 5, wherein the deep neural network of the transfer layers comprises fewer neurons than the deep neural network of the feature extractor.

8. A system, comprising: a memory configured to store operations; and one or more processors configured to perform the operations, the operations comprising: extracting features from an input by a feature extractor, processing the features by transfer layers configured as a deep neural network to produce a transfer output, classifying the input with a label by a label predictor implemented as a deep neural network, based on the transfer output, classifying the input as either a labeled source input or an unlabeled target input by a domain classifier, and back propagating a loss at the domain classifier to the domain classifier and the transfer layers.

9. The system of claim 8, wherein back propagating the loss at the domain classifier to the transfer layers is configured to minimize reliance on labeled source inputs by the transfer layers over successive inputs.

10. The system of claim 8, wherein back propagating the loss at the domain classifier further comprises: back propagating the loss at the domain classifier to the domain classifier by adjusting a weight of the domain classifier to reduce the loss; and back propagating the loss at the domain classifier to the transfer layers by adjusting a weight of the transfer layers to increase the loss.

11. The system of claim 8, the operations further comprising: back propagating a loss at the label predictor to the transfer layers.

12. The system of claim 8, wherein the feature extractor is configured as a deep neural network.

13. The system of claim 12, wherein neurons of the feature extractor are implemented in a fixed configuration.

14. The system of claim 12, wherein the deep neural network of the transfer layers comprises fewer neurons than the deep neural network of the feature extractor.

15. A computer readable storage device having instructions stored thereon, execution of which, by one or more processing devices, causes the one or more processing devices to perform operations comprising: extracting features from an input by a feature extractor; processing the features by transfer layers configured as a deep neural network to produce a transfer output; classifying the input with a label by a label predictor implemented as a deep neural network, based on the transfer output; classifying the input as either a labeled source input or an unlabeled target input by a domain classifier; and back propagating a loss at the domain classifier to the domain classifier and the transfer layers.

16. The computer readable storage device of claim 15, wherein back propagating the loss at the domain classifier to the transfer layers is configured to minimize reliance on labeled source inputs by the transfer layers over successive inputs.

17. The computer readable storage device of claim 15, wherein back propagating the loss at the domain classifier further comprises: back propagating the loss at the domain classifier to the domain classifier by adjusting a weight of the domain classifier to reduce the loss; and back propagating the loss at the domain classifier to the transfer layers by adjusting a weight of the transfer layers to increase the loss.

18. The computer readable storage device of claim 15, further comprising: back propagating a loss at the label predictor to the transfer layers.

19. The computer readable storage device of claim 15, wherein the feature extractor is configured as a deep neural network.

20. The computer readable storage device of claim 19, wherein neurons of the feature extractor are implemented in a fixed configuration.
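
The opposing weight updates recited in claims 3, 10, and 17 (reduce the domain loss at the domain classifier, increase it at the transfer layers) can be sketched with hand-written gradients. This is a minimal illustration under stated assumptions, not the patented implementation: the single linear transfer layer, the logistic domain classifier, the binary cross-entropy loss, the learning rate `lr`, the scaling factor `lam`, and every function name are hypothetical. The sign flip resembles what is often called gradient reversal in domain-adversarial training; the claim text shown here does not use that term.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, target):
    """Binary cross-entropy for one example."""
    eps = 1e-12
    return -(target * math.log(p + eps) + (1 - target) * math.log(1 - p + eps))

def transfer_forward(W, features):
    """Linear transfer layer (one layer here; the claims recite a deep network)."""
    return [sum(w * f for w, f in zip(row, features)) for row in W]

def domain_forward(v, b, t):
    """Logistic domain classifier: probability that the input is a source input."""
    return sigmoid(sum(vi * ti for vi, ti in zip(v, t)) + b)

def adversarial_step(W, v, b, features, is_source, lr=0.1, lam=1.0):
    """One update from the domain loss, applied with opposite signs to the two parts."""
    t = transfer_forward(W, features)
    p = domain_forward(v, b, t)
    err = p - is_source                      # dL/dz for sigmoid + cross-entropy

    grad_v = [err * ti for ti in t]
    grad_b = err
    grad_W = [[err * vi * f for f in features] for vi in v]

    # Domain classifier: gradient descent, so its own loss goes down.
    new_v = [vi - lr * g for vi, g in zip(v, grad_v)]
    new_b = b - lr * grad_b
    # Transfer layer: gradient ascent on the same loss (reversed sign).
    new_W = [[w + lr * lam * g for w, g in zip(row, grow)]
             for row, grow in zip(W, grad_W)]
    return new_W, new_v, new_b

W = [[0.2, -0.1, 0.4], [0.3, 0.5, -0.2]]   # toy transfer-layer weights
v, b = [0.6, -0.4], 0.0                     # toy domain-classifier weights
features = [1.0, 0.5, -1.0]                 # stand-in for frozen extractor output
is_source = 1                               # 1 = labeled source, 0 = unlabeled target

loss_before = bce(domain_forward(v, b, transfer_forward(W, features)), is_source)
new_W, new_v, new_b = adversarial_step(W, v, b, features, is_source)

# Effect of each half of the update taken on its own:
loss_classifier_only = bce(
    domain_forward(new_v, new_b, transfer_forward(W, features)), is_source)
loss_transfer_only = bce(
    domain_forward(v, b, transfer_forward(new_W, features)), is_source)
print(loss_classifier_only < loss_before, loss_transfer_only > loss_before)
```

The classifier update alone lowers the domain loss, while the transfer-layer update alone raises it. Repeated over many source and target inputs, this pushes the transfer output toward features that the domain classifier cannot separate. The label-predictor loss of claims 4, 11, and 18 is omitted here; it would contribute an ordinary descent term to the transfer-layer update.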

Assignees

SAP SE

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title

  • Supervised learning · CPC title

  • Adversarial learning · CPC title

  • Transfer learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11580405B2 cover?
It covers system, method, and computer program product embodiments for adapting machine learning models for use in additional applications. Large feature extraction models, such as those used for image detection, are resistant to retraining, so the disclosed approaches add a transfer layer that provides fast retraining of machine learning models.
Who is the assignee on this patent?
SAP SE
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 14, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).