Hybrid model for short text classification with imbalanced data

US11328221B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11328221-B2
Application number: US-201916379192-A
Country: US
Kind code: B2
Filing date: Apr 9, 2019
Priority date: Apr 9, 2019
Publication date: May 10, 2022
Grant date: May 10, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of text classification includes generating a text embedding vector representing a text sample and applying weights of a regression layer to the text embedding vector to generate a first data model output vector. The method also includes generating a plurality of prototype embedding vectors associated with a respective classification labels and comparing the plurality of prototype embedding vectors to the text embedding vector to generate a second data model output vector. The method further includes assigning a particular classification label to the text sample based on the first data model output vector, the second data model output vector, and one or more weighting values.
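The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the use of softmax, the squared Euclidean distance for comparing prototypes, and the single mixing weight `alpha` are all hypothetical choices made for the example. The abstract itself only requires a regression-layer output, a prototype-comparison output, and one or more weighting values.

```python
import math


def softmax(scores):
    """Normalize raw scores into a vector that sums to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def regression_output(embedding, weights, biases):
    """First output vector: regression-layer weights applied to the embedding."""
    logits = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


def prototype_output(embedding, prototypes):
    """Second output vector: compare the embedding with one prototype per label.

    Assumption: closeness is scored as negative squared Euclidean distance.
    """
    neg_dists = [
        -sum((x - p) ** 2 for x, p in zip(embedding, proto))
        for proto in prototypes
    ]
    return softmax(neg_dists)


def assign_label(embedding, weights, biases, prototypes, labels, alpha=0.5):
    """Blend both output vectors and return the highest-scoring label.

    Assumption: one scalar `alpha` weights the first vector and
    (1 - alpha) weights the second.
    """
    first = regression_output(embedding, weights, biases)
    second = prototype_output(embedding, prototypes)
    combined = [alpha * a + (1 - alpha) * b for a, b in zip(first, second)]
    total = sum(combined)
    probs = [c / total for c in combined]  # renormalize the blended scores
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Toy usage with made-up two-dimensional embeddings and two labels.
label, probs = assign_label(
    embedding=[1.0, 0.0],
    weights=[[1.0, 0.0], [0.0, 1.0]],
    biases=[0.0, 0.0],
    prototypes=[[1.0, 0.0], [0.0, 1.0]],
    labels=["billing", "outage"],
)
print(label, probs)
```

Setting `alpha` to 1.0 reduces the sketch to the regression layer alone, and 0.0 reduces it to the prototype comparison alone, which makes the effect of the weighting values easy to inspect.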

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of assigning classification labels to text, the computer-implemented method comprising: generating, using embedding layers of a trained classification network, a text embedding vector representing a text sample; applying weights of a regression layer of the trained classification network to the text embedding vector to generate a first data model output vector, each value of the first data model output vector representative of a first estimate of a probability that the text sample is associated with a class indicated by a respective classification label of a plurality of classification labels; generating, based on training text samples associated with the plurality of classification labels, a plurality of prototype embedding vectors, each prototype embedding vector of the plurality of prototype embedding vectors associated with a respective classification label; comparing the plurality of prototype embedding vectors to the text embedding vector to generate a second data model output vector, each value of the second data model output vector representative of a second estimate of the probability that the text sample is associated with the class; and assigning a particular classification label of the plurality of classification labels to the text sample based on the first data model output vector, the second data model output vector, and one or more weighting values.

2. The computer-implemented method of claim 1, further comprising: obtaining a plurality of training text samples including one or more representative training text samples for each class of a plurality of classes; and generating, using the embedding layers, a plurality of training text embedding vectors including a training text embedding vector representing each representative training text sample of the plurality of training text samples, wherein the plurality of prototype embedding vectors are generated using the plurality of training text embedding vectors.

3. The computer-implemented method of claim 2, further comprising determining, based on the text sample and the plurality of training text samples, a plurality of weighting values including the one or more weighting values.

4. The computer-implemented method of claim 2, wherein generating a prototype embedding vector corresponding to a particular class of the plurality of classes includes aggregating a set of training text embedding vectors corresponding to the particular class.

5. The computer-implemented method of claim 2, wherein the plurality of training text samples include or correspond to labeled training data used to train the trained classification network.

6. The computer-implemented method of claim 5, wherein training text samples representing each class of the plurality of classes are randomly or pseudo-randomly selected from a subset of the labeled training data representing the corresponding class.

7. The computer-implemented method of claim 1, further comprising: combining, element-by-element, weighted values of the first data model output vector and the second data model output vector to generate a combined data model output vector; and normalizing elements of the combined data model output vector to generate a combined probability estimate vector including a plurality of probability estimate elements, each probability estimate elements associated with a respective classification label of the plurality of classification labels, wherein the particular classification label is assigned to the text sample based on a probability estimate element associated with the particular classification label having a highest combined probability estimate value among the plurality of probability estimate elements.

8. The computer-implemented method of claim 1, further comprising: before generating the text embedding vector, receiving the text sample via a query input; and determining the one or more weighting values based at least partially on the text sample.

9. A system for assigning classification labels to text, the system comprising: one or more processors; and one or more memory devices coupled to the one or more processors, the one or more memory devices storing instructions that are executable by the one or more processors to perform operations including: generating, using embedding layers of a trained classification network, a text embedding vector representing a text sample; applying weights of a regression layer of the trained classification network to the text embedding vector to generate a first data model output vector, each value of the first data model output vector representative of a first estimate of a probability that the text sample is associated with a class indicated by a respective classification label of a plurality of classification labels; generating, based on training text samples associated with the plurality of classification labels, a plurality of prototype embedding vectors, each prototype embedding vector of the plurality of prototype embedding vectors associated with a respective classification label; comparing the plurality of prototype embedding vectors to the text embedding vector to generate a second data model output vector, each value of the second data model output vector representative of a second estimate of the probability that the text sample is associated with the class; and assigning a particular classification label of the plurality of classification labels to the text sample based on the first data model output vector, the second data model output vector, and one or more weighting values.

10. The system of claim 9, wherein the operations further comprise: obtaining a plurality of training text samples including one or more representative training text samples for each class of a plurality of classes; and generating, using the embedding layers, a plurality of training text embedding vectors including a training text embedding vector representing each representative training text sample of the plurality of training text samples, wherein the plurality of prototype embedding vectors are generated using the plurality of training text embedding vectors.

11. The system of claim 10, wherein the operations further comprise determining, based on the text sample and the plurality of training text samples, a plurality of weighting values including the one or more weighting values.

12. The system of claim 10, wherein generating a prototype embedding vector corresponding to a particular class of the plurality of classes includes aggregating a set of training text embedding vectors corresponding to the particular class.

13. The system of claim 9, wherein the operations further comprise: combining, element-by-element, weighted values of the first data model output vector and the second data model output vector to generate a combined data model output vector; and normalizing elements of the combined data model output vector to generate a combined probability estimate vector including a plurality of probability estimate elements, each probability estimate element associated with a respective classification label of the plurality of classification labels, wherein the particular classification label is assigned to the text sample based on a probability estimate element associated with the particular classification label having a highest combined probability estimate value among the plurality of probability estimate elements.

14. The system of claim 9, wherein the operations further comprise: before generating the text embedding vector, receiving the text sample via a query input; and determining th
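Claims 2, 4, and 6 describe building each prototype from representative training samples that are randomly or pseudo-randomly selected per class and then aggregated. The sketch below illustrates one way this could look, under assumptions the claims do not fix: the helper name `build_prototypes`, the `per_class` and `seed` parameters, and the use of an element-wise mean as the aggregation are all hypothetical.

```python
import random


def build_prototypes(labeled_embeddings, per_class=5, seed=0):
    """Return one prototype vector per label.

    labeled_embeddings maps each label to a list of training embedding vectors.
    Assumption: up to `per_class` vectors are sampled pseudo-randomly for each
    label, and the prototype is their element-wise mean.
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible
    prototypes = {}
    for label, vectors in labeled_embeddings.items():
        if not vectors:
            raise ValueError(f"no training embeddings for label {label!r}")
        k = min(per_class, len(vectors))
        chosen = rng.sample(vectors, k)
        dim = len(chosen[0])
        prototypes[label] = [sum(v[i] for v in chosen) / k for i in range(dim)]
    return prototypes


# Toy usage: a frequent class and a rare class with made-up embeddings.
training = {
    "billing": [[0.0, 0.0], [2.0, 4.0]],
    "outage": [[1.0, 1.0]],
}
print(build_prototypes(training, per_class=2))
```

Because every label contributes at most `per_class` samples, a rare class gets a prototype built the same way as a frequent one, which is one plausible reading of how prototypes help with imbalanced data.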

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • G06F16/353 · Primary

    into predefined classes · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11328221B2 cover?
A method of text classification includes generating a text embedding vector representing a text sample and applying weights of a regression layer to the text embedding vector to generate a first data model output vector. The method also includes generating a plurality of prototype embedding vectors associated with a respective classification labels and comparing the plurality of prototype embed…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/353. Mapped technology areas include Physics.
When was this patent published?
Publication date May 10, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).