Cross-domain multi-task learning for text classification

US10937416B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10937416-B2
Application number	US-201916265740-A
Country	US
Kind code	B2
Filing date	Feb 1, 2019
Priority date	Feb 1, 2019
Publication date	Mar 2, 2021
Grant date	Mar 2, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method includes providing input text to a plurality of multi-task learning (MTL) models corresponding to a plurality of domains. Each MTL model is trained to generate an embedding vector based on the input text. The method further includes providing the input text to a domain identifier that is trained to generate a weight vector based on the input text. The weight vector indicates a classification weight for each domain of the plurality of domains. The method further includes scaling each embedding vector based on a corresponding classification weight of the weight vector to generate a plurality of scaled embedding vectors, generating a feature vector based on the plurality of scaled embedding vectors, and providing the feature vector to an intent classifier that is trained to generate, based on the feature vector, an intent classification result associated with the input text.
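
The data flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation: the trained MTL models, domain identifier, and intent classifier are replaced by hypothetical toy functions (`travel_model`, `banking_model`, `toy_domain_identifier`, `toy_intent_classifier`), and the feature vector is built by concatenation, which is the option described in dependent claim 2.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def scale_embeddings(embeddings: Sequence[Vector],
                     weights: Sequence[float]) -> List[Vector]:
    """Scale each domain's embedding by that domain's classification weight."""
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per domain embedding")
    return [[w * x for x in emb] for emb, w in zip(embeddings, weights)]


def concatenate(vectors: Sequence[Vector]) -> Vector:
    """Join the scaled embeddings end to end into one feature vector."""
    return [x for vec in vectors for x in vec]


def classify_intent(text: str,
                    mtl_models: Sequence[Callable[[str], Vector]],
                    domain_identifier: Callable[[str], Vector],
                    intent_classifier: Callable[[Vector], str]) -> str:
    embeddings = [model(text) for model in mtl_models]  # one embedding per domain
    weights = domain_identifier(text)                   # one weight per domain
    scaled = scale_embeddings(embeddings, weights)
    return intent_classifier(concatenate(scaled))


# Hypothetical stand-ins for trained components (assumptions for illustration).
def travel_model(text: str) -> Vector:
    return [1.0, 2.0]


def banking_model(text: str) -> Vector:
    return [3.0, 4.0]


def toy_domain_identifier(text: str) -> Vector:
    return [0.75, 0.25]  # the text is judged mostly "travel"


def toy_intent_classifier(features: Vector) -> str:
    # Toy rule: compare the travel half of the vector with the banking half.
    return "book_flight" if sum(features[:2]) > sum(features[2:]) else "check_balance"


result = classify_intent("I need a flight to Boston",
                         [travel_model, banking_model],
                         toy_domain_identifier,
                         toy_intent_classifier)
print(result)
```

With these toy values the scaled embeddings are `[0.75, 1.5]` and `[0.75, 1.0]`, so the domain the identifier favors contributes more to the feature vector that the intent classifier sees.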

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of cross-domain multi-task learning (MTL) for text classification, the method comprising: obtaining input text; providing the input text to a plurality of MTL models corresponding to a plurality of domains to generate a plurality of embedding vectors corresponding to each domain of the plurality of domains based on the input text, wherein each MTL model is trained based on text samples corresponding to a respective domain of the plurality of domains; providing the input text to a domain identifier, the domain identifier trained, based on the text samples associated with the plurality of domains, to generate a weight vector based on the input text, the weight vector indicating a classification weight for each domain of the plurality of domains, the classification weight for a particular domain associated with a probability that the input text is associated with the particular domain; scaling each embedding vector based on a corresponding classification weight of the weight vector to generate a plurality of scaled embedding vectors; generating a feature vector based on the plurality of scaled embedding vectors; and providing the feature vector to an intent classifier, the intent classifier trained to generate, based on the feature vector, an intent classification result associated with the input text.

2. The computer-implemented method of claim 1, wherein generating the feature vector includes concatenating the plurality of scaled embedding vectors.

3. The computer-implemented method of claim 1, further comprising generating one or more natural language processing (NLP) features based on the input text.

4. The computer-implemented method of claim 3, wherein generating the feature vector includes concatenating the scaled embedding vectors and the one or more NLP features.

5. The computer-implemented method of claim 1, further comprising receiving an audio speech input, wherein the input text is determined based on the audio speech input.

6. The computer-implemented method of claim 1, wherein a first MTL model of the plurality of MTL models is trained using first labeled training data associated with a first domain of the plurality of domains, wherein a second MTL model of the plurality of MTL models is trained using second labeled training data associated with a second domain of the plurality of domains, the second labeled training data distinct from the first labeled training data, and wherein the domain identifier is trained using both the first labeled training data and the second labeled training data.

7. The computer-implemented method of claim 1, wherein a first MTL model of the plurality of MTL models is trained using first labeled training data associated with a first domain of the plurality of domains, wherein a second MTL model of the plurality of MTL models is trained using second labeled training data associated with a second domain of the plurality of domains, the second labeled training data distinct from the first labeled training data, and wherein the domain identifier is trained using cross-domain training data that is independent of the first labeled training data and the second labeled training data.

8. The computer-implemented method of claim 1, wherein the domain identifier is further trained to determine, for each particular domain of the plurality of domains, the probability that the input text is associated with the particular domain and to generate the weight vector based on whether the probability satisfies a threshold.

9. The computer-implemented method of claim 1, wherein each MTL model includes a convolutional neural network with one or more max pooling layers.

10. An apparatus comprising: a memory; and a processor coupled to the memory and configured to: obtain input text; provide the input text to a plurality of MTL models corresponding to a plurality of domains to generate a plurality of embedding vectors corresponding to each domain of the plurality of domains based on the input text, wherein each MTL model is trained based on text samples corresponding to a respective domain of the plurality of domains; provide the input text to a domain identifier, the domain identifier trained, based on the text samples associated with the plurality of domains, to generate a weight vector based on the input text, the weight vector indicating a classification weight for each domain of the plurality of domains, the classification weight for a particular domain associated with a probability that the input text is associated with the particular domain; scale each embedding vector based on a corresponding classification weight of the weight vector to generate a plurality of scaled embedding vectors; generate a feature vector based on the plurality of scaled embedding vectors; and provide the feature vector to an intent classifier, the intent classifier configured to generate, based on the feature vector, an intent classification result associated with the input text.

11. The apparatus of claim 10, wherein the processor includes a feature vector generator configured to generate the feature vector by concatenating the plurality of scaled embedding vectors.

12. The apparatus of claim 11, wherein the processor is further configured to generate one or more natural language processing (NLP) features based on the input text.

13. The apparatus of claim 12, wherein the feature vector generator is further configured to generate the feature vector by concatenating the scaled embedding vectors and the one or more NLP features.

14. The apparatus of claim 10, wherein each MTL model includes a convolutional neural network with one or more max pooling layers.

15. A computer-readable storage medium storing instructions executable by a processor to perform, initiate, or control operations, the operations comprising: obtaining input text; providing the input text to a plurality of MTL models corresponding to a plurality of domains to generate a plurality of embedding vectors corresponding to each domain of the plurality of domains based on the input text, wherein each MTL model is trained based on text samples corresponding to a respective domain of the plurality of domains; providing the input text to a domain identifier, the domain identifier trained, based on the text samples associated with the plurality of domains, to generate a weight vector based on the input text, the weight vector indicating a classification weight for each domain of the plurality of domains, the classification weight for a particular domain associated with a probability that the input text is associated with the particular domain; scaling each embedding vector based on a corresponding classification weight of the weight vector to generate a plurality of scaled embedding vectors; generating a feature vector based on the plurality of scaled embedding vectors; and providing the feature vector to an intent classifier, the intent classifier trained to generate, based on the feature vector, an intent classification result associated with the input text.

16. The computer-readable storage medium of claim 15, wherein the processor is further configured to determine the input text based on audio speech input.

17. The computer-readable storage medium of claim 15, wherein a first MTL model of the plurality of MTL models is trained using first labeled training data associated with a first domain of the plurality of domains, wherein a second MTL model of the plurality of MTL models is trained using second labeled training data associated with
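
Two of the dependent claims lend themselves to a short sketch: claim 8 (a weight vector that depends on whether each domain probability satisfies a threshold) and claim 4 (a feature vector that concatenates the scaled embeddings with NLP features). The claim text does not say how the threshold changes the weights, so the rule below, which zeroes the weight of any domain under the threshold, is an assumption made for illustration. The function names, the threshold of 0.2, and the token-count NLP feature are likewise hypothetical.

```python
from typing import List, Sequence


def threshold_weights(probabilities: Sequence[float],
                      threshold: float) -> List[float]:
    """Keep a domain's probability as its weight only if it meets the threshold.

    Assumption: domains below the threshold get weight 0.0. The claim only
    states that the weight vector depends on whether the threshold is met.
    """
    return [p if p >= threshold else 0.0 for p in probabilities]


def build_feature_vector(scaled_embeddings: Sequence[Sequence[float]],
                         nlp_features: Sequence[float]) -> List[float]:
    """Concatenate the scaled embeddings, then append the NLP features."""
    features = [x for emb in scaled_embeddings for x in emb]
    features.extend(nlp_features)
    return features


# Three domains; the third falls under the threshold and is zeroed out.
weights = threshold_weights([0.7, 0.25, 0.05], threshold=0.2)

# One hypothetical NLP feature: the number of tokens in the input text.
token_count = float(len("book me a flight".split()))
features = build_feature_vector([[0.5, 1.0], [0.0, 0.0]], [token_count])
print(weights, features)
```

Zeroing a low-probability domain removes that domain's embedding from the feature vector entirely, which is one plausible reading of why a threshold would be applied before scaling.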

Assignees

IBM

Inventors

Classifications

G06N3/044
Recurrent networks, e.g. Hopfield networks
G06N3/045
Combinations of networks
G06N3/096
Transfer learning
G06N3/0464
Convolutional networks [CNN, ConvNet]
G06N3/09
Supervised learning

Patent family

Related publications grouped by family.

View patent family 71837829

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10937416B2 cover?: A method includes providing input text to a plurality of multi-task learning (MTL) models corresponding to a plurality of domains. Each MTL model is trained to generate an embedding vector based on the input text. The method further includes providing the input text to a domain identifier that is trained to generate a weight vector based on the input text. The weight vector indicates a classifi…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G10L15/1815. Mapped technology areas include Physics.
When was this patent published?: Publication date Mar 2, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).