Method and system for text classification based on learning of transferable feature representations from a source domain
US-2018174071-A1 · Jun 21, 2018 · US
US11645514B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11645514-B2 |
| Application number | US-201916530457-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 2, 2019 |
| Priority date | Aug 2, 2019 |
| Publication date | May 9, 2023 |
| Grant date | May 9, 2023 |
Official abstract text for this publication.
A computer-implemented method includes using an embedding network to generate prototypical vectors. Each prototypical vector is based on a corresponding label associated with a first domain. The computer-implemented method also includes using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain and using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain. The computer-implemented method also includes comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values and comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values. The computer-implemented method also includes modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network.
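The steps in the abstract (build one prototypical vector per label, embed an in-domain and an out-of-domain test sample, compare each to the prototypes) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `embed` is a hypothetical single-layer stand-in for the embedding network, cosine similarity is used as the comparison because the abstract does not name one, and the labels and sample values are made up. Prototypes are formed by average pooling, as claim 11 describes.

```python
import math

def embed(sample, weights):
    """Hypothetical stand-in for the embedding network: one linear layer."""
    return [sum(w * x for w, x in zip(row, sample)) for row in weights]

def average_pool(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity (an assumed comparison function)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def prototypes(support_by_label, weights):
    """One prototypical vector per label: pooled embeddings of its samples."""
    return {
        label: average_pool([embed(s, weights) for s in samples])
        for label, samples in support_by_label.items()
    }

def compare(test_sample, protos, weights):
    """Comparison values between one test vector and every prototype."""
    vec = embed(test_sample, weights)
    return {label: cosine(vec, p) for label, p in protos.items()}

# Illustrative data only: identity weights and two made-up labels.
weights = [[1.0, 0.0], [0.0, 1.0]]
support = {
    "billing": [[1.0, 0.0], [0.9, 0.1]],
    "shipping": [[0.0, 1.0], [0.1, 0.9]],
}
protos = prototypes(support, weights)
in_scores = compare([1.0, 0.1], protos, weights)     # in-domain test sample
out_scores = compare([-1.0, -1.0], protos, weights)  # out-of-domain test sample
```

In a real training loop the two sets of comparison values would feed a loss whose gradient updates `weights`; the sketch stops at the comparison step.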
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method of training an encoder, the computer-implemented method comprising: using an embedding network to generate prototypical vectors, each prototypical vector based on a corresponding label associated with a first domain; using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain; using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain; comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values; comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values; and modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network to generate one or more modified parameters for the embedding network.

2. The computer-implemented method of claim 1, wherein modifying the one or more parameters of the embedding network comprises modifying one or more weights of the embedding network.

3. The computer-implemented method of claim 1, further comprising determining a maximum likelihood of a true label based on the in-domain test vector, the in-domain test vector generated based on a training data sample selected from a label associated with the first domain.

4. The computer-implemented method of claim 3, wherein the maximum likelihood of the true label is determined based on $L_{in} = \frac{\exp \alpha(x_i^{in}, s_{l_i}^{in})}{\sum_{l} \exp \alpha(x_i^{in}, s_{l_i}^{in})}$.

5. The computer-implemented method of claim 3, wherein modifying the one or more parameters of the embedding network comprises: selecting particular parameters that minimize a distance between the in-domain test vector and a particular prototypical vector associated with the maximum likelihood of the true label, wherein the particular parameters correspond to the one or more modified parameters.

6. The computer-implemented method of claim 5, wherein minimizing the distance between the out-of-domain test vector and the particular prototypical vector is based on $L_{gt} = \max[0, M_2 - \max(F(x_i^{in}, S_i^{in}))]$.

7. The computer-implemented method of claim 1, wherein modifying the one or more parameters of the embedding network comprises: selecting particular parameters that maximize distances between the out-of-domain test vector and the prototypical vectors, wherein the particular parameters correspond to the one or more modified parameters.

8. The computer-implemented method of claim 7, wherein maximizing the distances between the out-of-domain test vector and the prototypical vectors is based on $L_{ood} = \max[0, \max(F(x_j^{out}, S_l^{in}) - M_1)]$.

9. The computer-implemented method of claim 1, further comprising: randomly selecting a first group of one or more data samples from a first label associated with the first domain; and randomly selecting a second group of one or more data samples from a second label associated with the first domain.

10. The computer-implemented method of claim 9, further comprising: encoding, using the embedding network, each data sample in the first group of one or more data samples to generate corresponding first sample vectors; and encoding, using the embedding network, each data sample in the second group of one or more data samples to generate corresponding second sample vectors.

11. The computer-implemented method of claim 10, further comprising: performing an average-pooling operation on the first sample vectors to generate a first prototypical vector; and performing the average-pooling operation on the second sample vectors to generate a second prototypical vector, wherein the prototypical vectors include at least the first prototypical vector and the second prototypical vector.

12. An apparatus comprising: a processor; and a memory coupled to the processor and storing instructions that, when executed by the processor, cause the processor to perform operations comprising: using an embedding network to generate prototypical vectors, each prototypical vector based on a corresponding label associated with a first domain; using the embedding network to generate an in-domain test vector based on at least one data sample from a particular label associated with the first domain; using the embedding network to generate an out-of-domain test vector based on at least one other data sample associated with a different domain; comparing the prototypical vectors to the in-domain test vector to generate in-domain comparison values; comparing the prototypical vectors to the out-of-domain test vector to generate out-of-domain comparison values; and modifying, based on the in-domain comparison values and the out-of-domain comparison values, one or more parameters of the embedding network to generate one or more modified parameters for the embedding network.

13. The apparatus of claim 12, wherein modifying the one or more parameters of the embedding network comprises modifying one or more weights of the embedding network.

14. The apparatus of claim 12, wherein the operations further comprise, for in-domain test data, performing an average pooling operation on each embedding per label to generate a respective prototypical vector.

15. The apparatus of claim 12, wherein the operations further comprise determining a particular label that has a highest degree of similarity $\alpha(x_i^{in}, S_{l_i}^{in})$ to the in-domain test vector.

16. The apparatus of claim 12, wherein the operations further comprise classifying a particular test vector as “out-of-domain” if a similarity $\alpha(x_i^{in}, S_{l_i}^{in})$ between the particular test vector and each prototypical vector is lower than a threshold.

17. A computer program product for training an encoder, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to c
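The loss terms recited in claims 4, 6, and 8, and the threshold test of claim 16, can be worked through numerically. The sketch below is one reading under stated assumptions, not the patented implementation: similarity scores are supplied directly as floats per label and used for both α and F; the softmax denominator is assumed to sum over every label's prototype; the in-domain term is turned into a loss by taking the negative log of the claim 4 ratio; the three terms are combined as an unweighted sum; and the margin values `M1`, `M2`, the threshold, and the scores are illustrative.

```python
import math

def in_domain_likelihood(scores, true_label):
    """Claim 4 ratio, read as a softmax over per-label similarity scores."""
    denom = sum(math.exp(s) for s in scores.values())
    return math.exp(scores[true_label]) / denom

def ground_truth_margin_loss(scores, m2):
    """Claim 6 form: max[0, M2 - max(F(x_in, S_in))]."""
    return max(0.0, m2 - max(scores.values()))

def out_of_domain_loss(scores, m1):
    """Claim 8 form: max[0, max(F(x_out, S_in) - M1)]."""
    return max(0.0, max(s - m1 for s in scores.values()))

def is_out_of_domain(scores, threshold):
    """Claim 16 test: every prototype similarity is below the threshold."""
    return all(s < threshold for s in scores.values())

# Illustrative similarity scores and margins (assumed values).
in_scores = {"billing": 0.9, "shipping": 0.1}   # in-domain test vector
out_scores = {"billing": 0.3, "shipping": 0.5}  # out-of-domain test vector
M1, M2 = 0.4, 0.8

l_in = -math.log(in_domain_likelihood(in_scores, "billing"))  # assumed negative log
l_gt = ground_truth_margin_loss(in_scores, M2)
l_ood = out_of_domain_loss(out_scores, M1)
total_loss = l_in + l_gt + l_ood  # assumed unweighted combination
```

With these numbers the ground-truth margin is already satisfied (the best in-domain score exceeds `M2`), so only the in-domain and out-of-domain terms contribute; the out-of-domain term is positive because one prototype scores above `M1`.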
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Transfer learning · CPC title
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
Feedforward networks · CPC title
Supervised learning · CPC title