Out-of-domain encoder training

US11645514B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11645514-B2
Application number  US-201916530457-A
Country             US
Kind code           B2
Filing date         Aug 2, 2019
Priority date       Aug 2, 2019
Publication date    May 9, 2023
Grant date          May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method includes using an embedding network to generate prototypical vectors. Each prototypical vector is based on a corresponding label associated with a first domain. The computer-implemented method also includes using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain and using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain. The computer-implemented method also includes comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values and comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values. The computer-implemented method also includes modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network.
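The steps in the abstract can be illustrated with a minimal Python sketch; it is not the patented implementation. Several pieces are assumptions made only for illustration: `embed` is a toy linear layer standing in for the embedding network, cosine similarity is used as the comparison function (the abstract does not name one), average pooling follows claim 11, and the labels and 2-D samples are hypothetical.

```python
import math

def embed(sample, weights):
    """Toy stand-in for the embedding network: one linear layer (assumption)."""
    return [sum(w * x for w, x in zip(row, sample)) for row in weights]

def mean_pool(vectors):
    """Average-pool equal-length vectors into a single prototypical vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity, assumed here as the comparison function."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def comparison_values(test_vector, prototypes):
    """Compare one test vector against every label's prototypical vector."""
    return {label: cosine(test_vector, proto) for label, proto in prototypes.items()}

# Hypothetical first-domain data: two labels, two samples each.
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, for readability only
support = {
    "billing": [[1.0, 0.0], [0.8, 0.2]],
    "shipping": [[0.0, 1.0], [0.2, 0.8]],
}

# One prototypical vector per label, built from that label's embedded samples.
prototypes = {
    label: mean_pool([embed(s, weights) for s in samples])
    for label, samples in support.items()
}

# In-domain test vector (a "billing"-like sample) and an out-of-domain one.
in_domain_values = comparison_values(embed([0.9, 0.1], weights), prototypes)
out_of_domain_values = comparison_values(embed([-1.0, -1.0], weights), prototypes)

print(in_domain_values)
print(out_of_domain_values)
```

In a training loop, both sets of comparison values would feed a loss whose gradient updates `weights`; that update step is omitted here.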

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of training an encoder, the computer-implemented method comprising: using an embedding network to generate prototypical vectors, each prototypical vector based on a corresponding label associated with a first domain; using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain; using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain; comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values; comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values; and modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network to generate one or more modified parameters for the embedding network.

2. The computer-implemented method of claim 1, wherein modifying the one or more parameters of the embedding network comprises modifying one or more weights of the embedding network.

3. The computer-implemented method of claim 1, further comprising determining a maximum likelihood of a true label based on the in-domain test vector, the in-domain test vector generated based on a training data sample selected from a label associated with the first domain.

4. The computer-implemented method of claim 3, wherein the maximum likelihood of the true label is determined based on $L_{in} = \frac{\exp \alpha(x_i^{in}, s_{l_i}^{in})}{\sum_{l} \exp \alpha(x_i^{in}, s_{l_i}^{in})}$.

5. The computer-implemented method of claim 3, wherein modifying the one or more parameters of the embedding network comprises: selecting particular parameters that minimize a distance between the in-domain test vector and a particular prototypical vector associated with the maximum likelihood of the true label, wherein the particular parameters correspond to the one or more modified parameters.

6. The computer-implemented method of claim 5, wherein minimizing the distance between the out-of-domain test vector and the particular prototypical vector is based on $L_{gt} = \max[0, M_2 - \max(F(x_i^{in}, S_i^{in}))]$.

7. The computer-implemented method of claim 1, wherein modifying the one or more parameters of the embedding network comprises: selecting particular parameters that maximize distances between the out-of-domain test vector and the prototypical vectors, wherein the particular parameters correspond to the one or more modified parameters.

8. The computer-implemented method of claim 7, wherein maximizing the distances between the out-of-domain test vector and the prototypical vectors is based on $L_{ood} = \max[0, \max(F(x_j^{out}, S_l^{in}) - M_1)]$.

9. The computer-implemented method of claim 1, further comprising: randomly selecting a first group of one or more data samples from a first label associated with the first domain; and randomly selecting a second group of one or more data samples from a second label associated with the first domain.

10. The computer-implemented method of claim 9, further comprising: encoding, using the embedding network, each data sample in the first group of one or more data samples to generate corresponding first sample vectors; and encoding, using the embedding network, each data sample in the second group of one or more data samples to generate corresponding second sample vectors.

11. The computer-implemented method of claim 10, further comprising: performing an average-pooling operation on the first sample vectors to generate a first prototypical vector; and performing the average-pooling operation on the second sample vectors to generate a second prototypical vector, wherein the prototypical vectors include at least the first prototypical vector and the second prototypical vector.

12. An apparatus comprising: a processor; and a memory coupled to the processor and storing instructions that, when executed by the processor, cause the processor to perform operations comprising: using an embedding network to generate prototypical vectors, each prototypical vector based on a corresponding label associated with a first domain; using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain; using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain; comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values; comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values; and modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network to generate one or more modified parameters for the embedding network.

13. The apparatus of claim 12, wherein modifying the one or more parameters of the embedding network comprises modifying one or more weights of the embedding network.

14. The apparatus of claim 12, wherein the operations further comprise, for in-domain test data, performing an average pooling operation on each embedding per label to generate a respective prototypical vector.

15. The apparatus of claim 12, wherein the operations further comprise determining a particular label that has a highest degree of similarity $\alpha(x_i^{in}, S_{l_i}^{in})$ to the in-domain test vector.

16. The apparatus of claim 12, wherein the operations further comprise classifying a particular test vector as “out-of-domain” if a similarity $\alpha(x_i^{in}, S_{l_i}^{in})$ between the particular test vector and each prototypical vector is lower than a threshold.

17. A computer program product for training an encoder, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to c
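
The expressions in claims 4, 6, and 8 and the threshold test in claim 16 can be sketched in plain Python. This is one possible reading, not the patent's implementation. It assumes the comparison values are similarity scores stored per label, reads the denominator in claim 4 as a sum over all labels' prototypes, and reads the inner term of claim 6 as the score against the true label's prototype. The function names, scores, margins `M1` and `M2`, and the threshold are hypothetical.

```python
import math

def in_domain_likelihood(scores, true_label):
    """Softmax share of the true label's score (one reading of claim 4)."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    return exps[true_label] / sum(exps.values())

def ground_truth_hinge(scores, true_label, m2):
    """max[0, M2 - score against the true label's prototype] (reading of claim 6)."""
    return max(0.0, m2 - scores[true_label])

def out_of_domain_hinge(scores, m1):
    """max[0, max over labels of (score - M1)] (claim 8)."""
    return max(0.0, max(s - m1 for s in scores.values()))

def is_out_of_domain(scores, threshold):
    """Claim 16: out-of-domain if every similarity is below the threshold."""
    return all(s < threshold for s in scores.values())

# Hypothetical comparison values for one in-domain and one out-of-domain sample.
in_scores = {"billing": 0.9, "shipping": 0.2}
out_scores = {"billing": 0.5, "shipping": -0.1}
M1, M2 = 0.4, 0.8  # hypothetical margins

print(in_domain_likelihood(in_scores, "billing"))
print(ground_truth_hinge(in_scores, "billing", M2))
print(out_of_domain_hinge(out_scores, M1))
print(is_out_of_domain(out_scores, threshold=0.6))
```

With these numbers the ground-truth hinge is zero because the true-label score already exceeds `M2`, while the out-of-domain hinge is positive because one prototype still scores above `M1`, so only the second term would push the parameters to change.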

Assignees

IBM

Inventors

Classifications

  • G06F17/16 (Primary)

    Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

  • Transfer learning · CPC title

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

  • Feedforward networks · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11645514B2 cover?
A computer-implemented method includes using an embedding network to generate prototypical vectors. Each prototypical vector is based on a corresponding label associated with a first domain. The computer-implemented method also includes using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain and …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 9, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).