Representation learning for input classification via topic sparse autoencoder and entity embedding

US11615311B2 · US · B2

Patent metadata
Publication number: US-11615311-B2
Application number: US-201916691554-A
Country: US
Kind code: B2
Filing date: Nov 21, 2019
Priority date: Dec 10, 2018
Publication date: Mar 28, 2023
Grant date: Mar 28, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Described herein are embodiments of a unified neural network framework to integrate Topic modeling, Word embedding and Entity Embedding (TWEE) for representation learning of inputs. In one or more embodiments, a novel topic sparse autoencoder is introduced to incorporate discriminative topics into the representation learning of the input. Topic distributions of inputs are generated from a global viewpoint and are utilized to enable autoencoder to learn topical representations. A sparsity constraint may be added to ensure that the most discriminative representations are related to topics. In addition, both words and entity related information may be embedded into the network to help learn a more comprehensive input representation. Extensive empirical experiments show that embodiments of the TWEE framework outperform the state-of-the-art methods on different datasets.
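
The sparsity constraint mentioned in the abstract is described in claim 4 as a sum of Kullback-Leibler (KL) divergences between a topic sparsity parameter and the average hidden-layer activation for each topic. The Python sketch below illustrates one way such a penalty could be computed, assuming the Bernoulli form of the KL divergence commonly used in sparse autoencoders. The function names, the `topic_activations` layout (one row per input, one column per topic, values in (0, 1)), and the value of `rho` are assumptions for illustration and are not taken from the patent.

```python
import math


def bernoulli_kl(rho, rho_hat, eps=1e-12):
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat).

    Assumes 0 < rho < 1; rho_hat is clipped away from 0 and 1 for stability.
    """
    rho_hat = min(max(rho_hat, eps), 1.0 - eps)
    return (rho * math.log(rho / rho_hat)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))


def topic_sparsity_penalty(topic_activations, rho):
    """Sum of KL divergences between rho and each topic's average activation.

    topic_activations[n][k] is a hypothetical activation of input n for
    topic k; how it is derived from the hidden layer is not modeled here.
    """
    n = len(topic_activations)
    num_topics = len(topic_activations[0])
    averages = [sum(row[k] for row in topic_activations) / n
                for k in range(num_topics)]
    return sum(bernoulli_kl(rho, avg) for avg in averages)


# Two inputs, two topics: topic 0 is far more active than the target rho.
activations = [[0.9, 0.1],
               [0.7, 0.1]]
print(topic_sparsity_penalty(activations, rho=0.1))
```

The penalty is zero when every topic's average activation equals `rho` and grows as the averages move away from it, which is how a term of this kind discourages most hidden units from being active at once.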

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for classifying an input comprising a set of words, the method comprising: feeding the input into a topic modeling, word embedding, and entity embedding (TWEE) model; obtaining a topic embedding of the input that reflects a distribution of topics in the input; obtaining a word embedding of the input that considers local context information of the input; obtaining an entity embedding corresponding to one or more entities in the input; concatenating at least the topic embedding and the word embedding to obtain a concatenated representation; and processing the concatenated representation through one or more neural network layers to obtain a classification of the input by performing steps comprising: feeding the concatenated representation into a convolutional layer comprising multiple filters to detect features at different positions; applying a max-pooling on representations obtained from the detected features to select a set of features; employing a neural network (NN) layer for sequential processing of the set of features; feeding a hidden state output at the last time step of the (NN) layer into a fully connected layer; and applying a prediction layer to an output obtained from the fully connected layer to obtain the classification of the input.

2. The computer-implemented method of claim 1 wherein obtaining a topic embedding of the input is performed using a topic sparse autoencoder (TSAE) that performs steps comprising: generating a topic distribution over the input among one or more topics by topic modeling; obtaining a topic distribution for words based on the topic distribution over the input; encoding, via an encoder, the input into a hidden representation comprising one or more word embeddings; using the topic distribution for words to form a topic distribution over hidden state; and including the topic distribution over hidden state in an overall cost function of the TSAE to update encoder parameters and obtain the topic embedding of the input.

3. The computer-implemented method of claim 2 wherein the topic modeling comprises a pre-trained probabilistic topic model.

4. The computer-implemented method of claim 2 wherein the overall cost function of TSAE comprises a topic guidance term based on a sum of Kullback-Leibler (KL) divergences between a topic sparsity parameter for the hidden representations and an average activation of a hidden layer for each topic.

5. The computer-implemented method of claim 1 wherein the concatenated representation is obtained by concatenating the topic embedding, the word embedding, and the entity embedding into the concatenated representation.

6. The computer-implemented method of claim 1 wherein the input is a question and the classification identifies a question type for the input.

7. The computer-implemented method of claim 1 wherein the neural network layer comprises a long short-term memory (LSTM) layer.

8. The computer-implemented method of claim 2 wherein the TSAE is an unsupervised feedforward neural network trained by applying backpropagation by fitting the input using a decoded representation for the input, the overall cost function of the TSAE comprises a term for an average of reconstruction loss.

9. The computer-implemented method of claim 1 wherein a cross entropy loss is calculated for input classification and backpropagations are made to train the TWEE model.

10. A computer-implemented method for classifying an input comprising a set of words, the method comprising: obtaining, at a topic sparse autoencoder (TSAE), a topic embedding of the input using steps comprising: generating a topic distribution over the input among one or more topics by topic modeling; obtaining a topic distribution for words based on the topic distribution over the input; encoding, via an encoder, the input into a hidden representation comprising one or more word embeddings; using the topic distribution for words to form a topic distribution over hidden state; and including the topic distribution over hidden state in an overall cost function of the TSAE to update encoder parameters and obtain the topic embedding of the input; obtaining a word embedding of the input that considers local context information of the input; and obtaining a classification of the input based on at least the topic embedding and the word embedding by performing steps comprising: concatenating at least the topic embedding and the word embedding to form a concatenated representation; feeding the concatenated representation into a convolutional layer to detect features; selecting a set of features from the detected features; employing a neural network layer for sequential processing of the set of features; feeding a hidden state output at the last time step of the neural network layer into a fully connected layer; and applying a prediction layer to an output obtained from the fully connected layer to obtain the classification of the input.

11. The computer-implemented method of claim 10 wherein the TSAE is an unsupervised feedforward neural network trained by applying backpropagation by fitting the input using a decoded representation for the input, the overall cost function of the TSAE comprises a term for an average of reconstruction loss.

12. The computer-implemented method of claim 10 wherein the overall cost function of TSAE further comprises a topic guidance term based on a sum of Kullback-Leibler (KL) divergences between a topic sparsity parameter for the hidden representations and an average activation of a hidden layer for each topic.

13. The computer-implemented method of claim 10 wherein the input is a question and the classification identifies a question type for the input.

14. The computer-implemented method of claim 10 wherein the neural network layer comprises a long short-term memory (LSTM) layer.

15. The computer-implemented method of claim 10 wherein the word embedding of the input is obtained using a skip-gram model using stochastic gradient descent with negative sampling.

16. A non-transitory computer-readable medium or media comprising one or more sequences of instructions which, when executed by one or more processors, causes the steps for classifying an input comprising a set of words to be performed comprising: obtaining, using a topic sparse autoencoder, a topic embedding of the input that reflects a distribution of topics in the input; obtaining a word embedding of the input that considers local context information of the input; obtaining an entity embedding corresponding to one or more entities in the input; concatenating the topic embedding, the word embedding, and the entity embedding into a mixture embedding; and processing, through a classifier comprising one or more neural network layers, the mixture embedding to obtain a classification of the input by performing steps comprising: feeding the concatenated representation into a convolutional layer comprising multiple filters to detect features at different positions; applying a max-pooling on representations obtained from the detected features to select a set of features; employing a neural network layer for processing of the set of features; feeding a hidden state output at the last time step of the neural network layer into a fully connected layer; and applying a prediction layer to an output obtained from the fully connected layer to obtain the classification of the input.

17. The non-transitory computer-readable medium or media of claim 16 wherein obtaining a to
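
The classification steps recited in claims 1, 10, and 16 (concatenation, convolution with multiple filters, max-pooling, a sequential layer, a fully connected layer, and a prediction layer) can be read as a single forward pass. The pure-Python sketch below is illustrative only. It assumes each embedding is a per-token sequence of vectors, uses an LSTM as the sequential layer (as in claims 7 and 14), ReLU and tanh activations, softmax as the prediction layer, and random untrained weights. All function names, dimensions, and parameter layouts are hypothetical and are not taken from the patent.

```python
import math
import random


def concat(topic, word, entity):
    """Concatenate the three embeddings token by token."""
    return [t + w + e for t, w, e in zip(topic, word, entity)]


def conv1d(seq, filters, width):
    """Slide every filter over windows of `width` tokens (ReLU activation)."""
    out = []
    for i in range(len(seq) - width + 1):
        window = [x for vec in seq[i:i + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(f, window)))
                    for f in filters])
    return out


def max_pool(seq, size):
    """Non-overlapping max-pooling over time; the result is still a sequence."""
    return [[max(col) for col in zip(*seq[i:i + size])]
            for i in range(0, len(seq) - size + 1, size)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_last_hidden(seq, weights, hidden):
    """Run a single-layer LSTM and return the hidden state at the last step."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        z = x + h + [1.0]  # input, previous hidden state, bias term
        gate = {k: [sum(w * v for w, v in zip(row, z)) for row in weights[k]]
                for k in "ifog"}
        c = [sigmoid(f) * c_prev + sigmoid(i) * math.tanh(g)
             for f, c_prev, i, g in zip(gate["f"], c, gate["i"], gate["g"])]
        h = [sigmoid(o) * math.tanh(c_new) for o, c_new in zip(gate["o"], c)]
    return h


def dense(x, weights):
    """Affine layer; the last column of `weights` is the bias."""
    return [sum(w * v for w, v in zip(row, x + [1.0])) for row in weights]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def classify(topic, word, entity, p):
    """Forward pass: concat -> conv -> max-pool -> LSTM -> dense -> softmax."""
    x = concat(topic, word, entity)
    feats = conv1d(x, p["filters"], p["width"])
    pooled = max_pool(feats, p["pool"])
    h_last = lstm_last_hidden(pooled, p["lstm"], p["hidden"])
    fc = [math.tanh(v) for v in dense(h_last, p["fc"])]
    return softmax(dense(fc, p["out"]))


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
tokens, d_topic, d_word, d_entity = 6, 2, 3, 2
dim = d_topic + d_word + d_entity
params = {
    "width": 2, "pool": 2, "hidden": 3,
    "filters": rand_matrix(rng, 4, 2 * dim),   # 4 filters over 2-token windows
    "lstm": {k: rand_matrix(rng, 3, 4 + 3 + 1) for k in "ifog"},
    "fc": rand_matrix(rng, 5, 3 + 1),
    "out": rand_matrix(rng, 3, 5 + 1),         # 3 hypothetical classes
}
topic = rand_matrix(rng, tokens, d_topic)      # stand-ins for real embeddings
word = rand_matrix(rng, tokens, d_word)
entity = rand_matrix(rng, tokens, d_entity)
probs = classify(topic, word, entity, params)
print(probs)
```

Pooling over short time windows, rather than over the whole sequence, is a choice made here so that the LSTM still receives a sequence to process; the claims do not specify the pooling window.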

Assignees

Baidu USA LLC

Inventors

Classifications

  • Supervised learning

  • Quantised networks; Sparse networks; Compressed networks

  • Convolutional networks [CNN, ConvNet]

  • Auto-encoder networks; Encoder-decoder networks

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

Frequently asked questions

Who is the assignee on this patent?
Baidu USA LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 28, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).