Real-Time On the Fly Generation of Feature-Based Label Embeddings Via Machine Learning
US-2021004693-A1 · Jan 7, 2021 · US
US11615311B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11615311-B2 |
| Application number | US-201916691554-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 21, 2019 |
| Priority date | Dec 10, 2018 |
| Publication date | Mar 28, 2023 |
| Grant date | Mar 28, 2023 |
Official abstract text for this publication.
Described herein are embodiments of a unified neural network framework that integrates Topic modeling, Word embedding, and Entity Embedding (TWEE) for representation learning of inputs. In one or more embodiments, a novel topic sparse autoencoder is introduced to incorporate discriminative topics into the representation learning of the input. Topic distributions of inputs are generated from a global viewpoint and are used to enable the autoencoder to learn topical representations. A sparsity constraint may be added to ensure that the most discriminative representations are related to topics. In addition, both word and entity-related information may be embedded into the network to help learn a more comprehensive input representation. Extensive empirical experiments show that embodiments of the TWEE framework outperform state-of-the-art methods on different datasets.
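The abstract's topic sparse autoencoder can be illustrated with a small cost-function sketch. The claims on this page name two ingredients of the cost: an average reconstruction loss and a topic guidance term built from a sum of KL divergences between a topic sparsity parameter and the average hidden activation for each topic. The NumPy code below is only one plausible reading of those two terms, not the patent's implementation. The names `tsae_cost`, `kl_bernoulli`, `rho`, and `gamma`, the sigmoid activations, the squared-error reconstruction loss, and the way topic weights are averaged over the hidden layer are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def kl_bernoulli(rho, rho_hat, eps=1e-8):
    """Elementwise KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    rho_hat = np.clip(rho_hat, eps, 1.0 - eps)
    return (rho * np.log(rho / rho_hat)
            + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def tsae_cost(X, theta, W_enc, b_enc, W_dec, b_dec, rho=0.05, gamma=0.1):
    """Reconstruction loss plus a topic guidance penalty (illustrative only).

    X:     (m, n) inputs
    theta: (m, K) topic distribution of each input, e.g. from a topic model
    rho:   assumed topic sparsity parameter
    gamma: assumed weight of the topic guidance term
    """
    H = sigmoid(X @ W_enc + b_enc)        # (m, d) hidden representation
    X_hat = sigmoid(H @ W_dec + b_dec)    # (m, n) decoded representation

    # Average reconstruction loss over the m inputs.
    recon = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))

    # Assumed construction of a "topic distribution over hidden state":
    # weight each input's hidden activations by its share of each topic.
    topic_mass = np.maximum(theta.sum(axis=0, keepdims=True), 1e-8)
    H_topic = (theta / topic_mass).T @ H  # (K, d)
    rho_hat = H_topic.mean(axis=1)        # (K,) average activation per topic

    # Sum of KL divergences, one per topic.
    topic_guidance = kl_bernoulli(rho, rho_hat).sum()
    return recon + gamma * topic_guidance
```

In this sketch the penalty is zero only when every topic's average activation equals `rho`, so minimizing it pushes most hidden units toward low activation for each topic. A full training loop would differentiate this cost with respect to the encoder and decoder parameters, which is omitted here.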
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for classifying an input comprising a set of words, the method comprising: feeding the input into a topic modeling, word embedding, and entity embedding (TWEE) model; obtaining a topic embedding of the input that reflects a distribution of topics in the input; obtaining a word embedding of the input that considers local context information of the input; obtaining an entity embedding corresponding to one or more entities in the input; concatenating at least the topic embedding and the word embedding to obtain a concatenated representation; and processing the concatenated representation through one or more neural network layers to obtain a classification of the input by performing steps comprising: feeding the concatenated representation into a convolutional layer comprising multiple filters to detect features at different positions; applying a max-pooling on representations obtained from the detected features to select a set of features; employing a neural network (NN) layer for sequential processing of the set of features; feeding a hidden state output at the last time step of the (NN) layer into a fully connected layer; and applying a prediction layer to an output obtained from the fully connected layer to obtain the classification of the input.

2. The computer-implemented method of claim 1 wherein obtaining a topic embedding of the input is performed using a topic sparse autoencoder (TSAE) that performs steps comprising: generating a topic distribution over the input among one or more topics by topic modeling; obtaining a topic distribution for words based on the topic distribution over the input; encoding, via an encoder, the input into a hidden representation comprising one or more word embeddings; using the topic distribution for words to form a topic distribution over hidden state; and including the topic distribution over hidden state in an overall cost function of the TSAE to update encoder parameters and obtain the topic embedding of the input.

3. The computer-implemented method of claim 2 wherein the topic modeling comprises a pre-trained probabilistic topic model.

4. The computer-implemented method of claim 2 wherein the overall cost function of TSAE comprises a topic guidance term based on a sum of Kullback-Leibler (KL) divergences between a topic sparsity parameter for the hidden representations and an average activation of a hidden layer for each topic.

5. The computer-implemented method of claim 1 wherein the concatenated representation is obtained by concatenating the topic embedding, the word embedding, and the entity embedding into the concatenated representation.

6. The computer-implemented method of claim 1 wherein the input is a question and the classification identifies a question type for the input.

7. The computer-implemented method of claim 1 wherein the neural network layer comprises a long short-term memory (LSTM) layer.

8. The computer-implemented method of claim 2 wherein the TSAE is an unsupervised feedforward neural network trained by applying backpropagation by fitting the input using a decoded representation for the input, the overall cost function of the TSAE comprises a term for an average of reconstruction loss.

9. The computer-implemented method of claim 1 wherein a cross entropy loss is calculated for input classification and backpropagations are made to train the TWEE model.

10. A computer-implemented method for classifying an input comprising a set of words, the method comprising: obtaining, at a topic sparse autoencoder (TSAE), a topic embedding of the input using steps comprising: generating a topic distribution over the input among one or more topics by topic modeling; obtaining a topic distribution for words based on the topic distribution over the input; encoding, via an encoder, the input into a hidden representation comprising one or more word embeddings; using the topic distribution for words to form a topic distribution over hidden state; and including the topic distribution over hidden state in an overall cost function of the TSAE to update encoder parameters and obtain the topic embedding of the input; obtaining a word embedding of the input that considers local context information of the input; and obtaining a classification of the input based on at least the topic embedding and the word embedding by performing steps comprising: concatenating at least the topic embedding and the word embedding to form a concatenated representation; feeding the concatenated representation into a convolutional layer to detect features; selecting a set of features from the detected features; employing a neural network layer for sequential processing of the set of features; feeding a hidden state output at the last time step of the neural network layer into a fully connected layer; and applying a prediction layer to an output obtained from the fully connected layer to obtain the classification of the input.

11. The computer-implemented method of claim 10 wherein the TSAE is an unsupervised feedforward neural network trained by applying backpropagation by fitting the input using a decoded representation for the input, the overall cost function of the TSAE comprises a term for an average of reconstruction loss.

12. The computer-implemented method of claim 10 wherein the overall cost function of TSAE further comprises a topic guidance term based on a sum of Kullback-Leibler (KL) divergences between a topic sparsity parameter for the hidden representations and an average activation of a hidden layer for each topic.

13. The computer-implemented method of claim 10 wherein the input is a question and the classification identifies a question type for the input.

14. The computer-implemented method of claim 10 wherein the neural network layer comprises a long short-term memory (LSTM) layer.

15. The computer-implemented method of claim 10 wherein the word embedding of the input is obtained using a skip-gram model using stochastic gradient descent with negative sampling.

16. A non-transitory computer-readable medium or media comprising one or more sequences of instructions which, when executed by one or more processors, causes the steps for classifying an input comprising a set of words to be performed comprising: obtaining, using a topic sparse autoencoder, a topic embedding of the input that reflects a distribution of topics in the input; obtaining a word embedding of the input that considers local context information of the input; obtaining an entity embedding corresponding to one or more entities in the input; concatenating the topic embedding, the word embedding, and the entity embedding into a mixture embedding; and processing, through a classifier comprising one or more neural network layers, the mixture embedding to obtain a classification of the input by performing steps comprising: feeding the concatenated representation into a convolutional layer comprising multiple filters to detect features at different positions; applying a max-pooling on representations obtained from the detected features to select a set of features; employing a neural network layer for processing of the set of features; feeding a hidden state output at the last time step of the neural network layer into a fully connected layer; and applying a prediction layer to an output obtained from the fully connected layer to obtain the classification of the input.

17. The non-transitory computer-readable medium or media of claim 16 wherein obtaining a to
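The classification steps recited in the independent claims (concatenate embeddings, convolve with multiple filters, max-pool, run a sequential layer, take the last hidden state, apply a fully connected layer and a prediction layer) can be traced in a forward-pass sketch. The NumPy code below is a minimal illustration with random weights, not the patented model. It assumes per-token topic and word embeddings, ReLU activations, non-overlapping pooling, an LSTM as the sequential layer (as in claims 7 and 14), and a softmax prediction layer. All function names, the `params` dictionary, and every dimension are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv1d_relu(x, filters, bias):
    """Slide F filters of width w over x (T, D); returns (T - w + 1, F)."""
    num_filters, width, _ = filters.shape
    steps = x.shape[0] - width + 1
    out = np.empty((steps, num_filters))
    for t in range(steps):
        # Dot each filter with the current window of token vectors.
        out[t] = np.tensordot(filters, x[t:t + width], axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)

def max_pool(x, size):
    """Non-overlapping max-pooling along the time axis of x (L, F)."""
    steps = x.shape[0] // size
    return x[:steps * size].reshape(steps, size, x.shape[1]).max(axis=1)

def lstm_last_hidden(seq, W, U, b):
    """Run a single-layer LSTM over seq (L, F) and return the final hidden state."""
    hidden = U.shape[1]
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x_t in seq:
        i, f, o, g = np.split(W @ x_t + U @ h + b, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def classify(topic_emb, word_emb, p):
    """topic_emb: (T, Dt), word_emb: (T, Dw); returns class probabilities."""
    x = np.concatenate([topic_emb, word_emb], axis=1)      # concatenated representation
    feats = conv1d_relu(x, p["filters"], p["conv_b"])      # convolutional layer
    pooled = max_pool(feats, p["pool"])                    # max-pooling
    h_last = lstm_last_hidden(pooled, p["W"], p["U"], p["b"])  # last time step
    fc = np.maximum(p["W_fc"] @ h_last + p["b_fc"], 0.0)   # fully connected layer
    return softmax(p["W_out"] @ fc + p["b_out"])           # prediction layer

# Hypothetical sizes: 12 tokens, 4-dim topic and 6-dim word embeddings, 6 classes.
rng = np.random.default_rng(0)
T, Dt, Dw, F, width, hidden, fc_dim, classes = 12, 4, 6, 8, 3, 5, 7, 6
params = {
    "filters": rng.normal(scale=0.1, size=(F, width, Dt + Dw)),
    "conv_b": np.zeros(F),
    "pool": 2,
    "W": rng.normal(scale=0.1, size=(4 * hidden, F)),
    "U": rng.normal(scale=0.1, size=(4 * hidden, hidden)),
    "b": np.zeros(4 * hidden),
    "W_fc": rng.normal(scale=0.1, size=(fc_dim, hidden)),
    "b_fc": np.zeros(fc_dim),
    "W_out": rng.normal(scale=0.1, size=(classes, fc_dim)),
    "b_out": np.zeros(classes),
}
probs = classify(rng.normal(size=(T, Dt)), rng.normal(size=(T, Dw)), params)
label = 2                       # hypothetical gold class index
loss = -np.log(probs[label])    # cross-entropy loss for one example
print(probs.shape, float(loss))
```

To include the entity embedding as in claims 5 and 16, a third array of shape `(T, De)` would be added to the `np.concatenate` call and the filter depth widened to match. Training would backpropagate the cross-entropy loss through all layers, which this forward-only sketch does not do.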
Supervised learning · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title