Semisupervised autoencoder for sentiment analysis

US11205103B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11205103-B2
Application number: US-201715838000-A
Country: US
Kind code: B2
Filing date: Dec 11, 2017
Priority date: Dec 9, 2016
Publication date: Dec 21, 2021
Grant date: Dec 21, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of modelling data, comprising: training an objective function of a linear classifier, based on a set of labeled data, to derive a set of classifier weights; defining a posterior probability distribution on the set of classifier weights of the linear classifier; approximating a marginalized loss function for an autoencoder as a Bregman divergence, based on the posterior probability distribution on the set of classifier weights learned from the linear classifier; and classifying unlabeled data using the autoencoder according to the marginalized loss function.
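
The marginalized loss named in the abstract has a closed form when the posterior over the classifier weights is Gaussian. As a minimal sketch, assume θ ~ N(μ, diag(σ²)); the diagonal covariance mirrors the Laplace approximation recited in the claims, but the abstract itself does not fix a distribution. Under that assumption, E[(θᵀ(x̃ − x))²] = (μᵀd)² + Σᵢ σᵢ² dᵢ² with d = x̃ − x. The function names and toy numbers below are hypothetical and only illustrate this identity; they are not taken from the patent.

```python
import numpy as np


def marginalized_loss(x_tilde, x, mu, var):
    """E_theta[(theta^T (x_tilde - x))^2] for theta ~ N(mu, diag(var)).

    Closed form: (mu^T d)^2 + sum_i var_i * d_i^2, with d = x_tilde - x.
    """
    d = np.asarray(x_tilde, dtype=float) - np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    return float(np.dot(mu, d) ** 2 + np.dot(var, d ** 2))


def monte_carlo_loss(x_tilde, x, mu, var, n_samples=200_000, seed=0):
    """Sampling estimate of the same expectation, used only as a sanity check."""
    rng = np.random.default_rng(seed)
    d = np.asarray(x_tilde, dtype=float) - np.asarray(x, dtype=float)
    # Draw weight vectors from the assumed diagonal Gaussian posterior.
    theta = rng.normal(loc=mu, scale=np.sqrt(var), size=(n_samples, len(d)))
    return float(np.mean((theta @ d) ** 2))


# Toy 4-word vocabulary: x is a bag-of-words vector, x_tilde a reconstruction.
x = np.array([1.0, 0.0, 2.0, 0.0])
x_tilde = np.array([0.5, 0.0, 2.0, 1.0])
mu = np.array([2.0, -1.0, 0.1, 0.0])   # hypothetical posterior mean of the weights
var = np.array([0.5, 0.5, 0.1, 0.01])  # hypothetical diagonal posterior variances

closed_form = marginalized_loss(x_tilde, x, mu, var)
sampled = monte_carlo_loss(x_tilde, x, mu, var)
print(closed_form, sampled)
```

On this toy input the closed form evaluates to about 1.135, and the sampling estimate agrees to within Monte Carlo noise. Dimensions whose weights have a large mean or a large variance dominate the loss, so a reconstruction error on such a word costs more than the same error on a word the classifier ignores.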

First claim

Opening claim text (preview).

What is claimed is:

1. A method of modelling data, comprising: training an objective function of a linear classifier, based on a set of labeled data, to derive a set of classifier weights; defining a posterior probability distribution on the set of classifier weights of the linear classifier; approximating a marginalized loss function for an autoencoder as a Bregman divergence, based on the posterior probability distribution on the set of classifier weights learned from the linear classifier; and automatically classifying unlabeled data using a compact classifier according to the marginalized loss function.

2. The method according to claim 1, wherein the marginalized loss function is: $D(\tilde{x}, x) = E_{\theta \sim p(\theta)}\left(\theta^T(\tilde{x} - x)\right)^2 = \int \left(\theta^T(\tilde{x} - x)\right)^2 p(\theta)\,d\theta$, wherein $E_{\theta \sim p(\theta)}$ is an expectation, $\theta$ are the classifier weights, and $x$ are the data points.

3. The method according to claim 1, wherein the autoencoder comprises a neural network, wherein said training comprises training the neural network.

4. The method according to claim 1, wherein the autoencoder comprises a denoising autoencoder.

5. The method according to claim 4, wherein the denoising autoencoder is denoised stochastically, and comprises a neural network employing stochastic gradient descent training using randomly selected data samples, wherein a gradient is calculated using back propagation of errors.

6. The method according to claim 1, wherein said training comprises training the objective function of the linear classifier with a bag of words, wherein the linear classifier comprises a support vector machine classifier with squared hinge loss and $l_2$ regularization.

7. The method according to claim 1, wherein said training comprises training the objective function of the linear classifier with a bag of words, wherein the linear classifier comprises a Logistic Regression classifier.

8. The method according to claim 1, wherein the Bregman divergence is determined assuming that all data samples induce a loss.

9. The method according to claim 1, wherein the posterior probability distribution on the set of classifier weights is estimated using with a Laplace approximation, wherein the Laplace approximation stochastically estimates the set of classifier weights using a covariance matrix constrained to be diagonal.

10. The method according to claim 1, wherein the posterior probability distribution on the set of classifier weights is estimated using with a Markov chain Monte Carlo method.

11. A system for modelling data, comprising: an input port, configured to receive a set of labelled data; a linear classifier; an autoencoder; a compact classifier; and an output port, configured to communicate a classification of at least one unlabeled datum, wherein: an objective function of a linear classifier is automatically trained, based on the set of labeled data, to derive a set of classifier weights; a marginalized loss function for the autoencoder is approximated as a Bregman divergence, based on a posterior probability distribution on the set of classifier weights learned from the linear classifier; and the at least one unlabeled datum is classified using the compact classifier according to the marginalized loss function.

12. The system according to claim 11, wherein the marginalized loss function is: $D(\tilde{x}, x) = E_{\theta \sim p(\theta)}\left(\theta^T(\tilde{x} - x)\right)^2 = \int \left(\theta^T(\tilde{x} - x)\right)^2 p(\theta)\,d\theta$, wherein $E_{\theta \sim p(\theta)}$ is an expectation, $\theta$ are the classifier weights, and $x$ are the data points.

13. The system according to claim 11, wherein the autoencoder comprises a neural network.

14. The system according to claim 11, wherein the autoencoder comprises a denoising autoencoder.

15. The system according to claim 14, wherein the denoising autoencoder is denoised stochastically, and comprises a neural network trained according to stochastic gradient descent training using randomly selected data samples, wherein a gradient is calculated using back propagation of errors.

16. The system according to claim 11, wherein the objective function of the linear classifier is trained with a bag of words, wherein the linear classifier comprises a support vector machine classifier with squared hinge loss and $l_2$ regularization.

17. The system according to claim 11, wherein the objective function of the linear classifier is trained with a bag of words, wherein the linear classifier comprises a Logistic Regression classifier.

18. The system according to claim 11, wherein the Bregman divergence is determined assuming that all data samples induce a loss.

19. The system according to claim 11, wherein the posterior probability distribution on the set of classifier weights is automatically estimated using a technique selected from the group consisting of a Laplace approximation, wherein the Laplace approximation stochastically estimates the set of classifier weights using a covariance matrix constrained to be diagonal, and a Markov chain Monte Carlo method.

20. A non-transitory computer readable medium containing instructions for controlling at least one programmable automated processor to model data, comprising: instructions for training an objective function of a linear classifier, based on a set of labeled data, to derive a set of classifier weights; instructions for defining a posterior probability distribution on the set of classifier weights of the linear classifier; instructions for approximating a marginalized loss function for an autoencoder as a Bregman divergence $D_f(\tilde{x}, x) = f(\tilde{x}) - \left(f(x) + \nabla f(x)^T(\tilde{x} - x)\right)$, wherein $\tilde{x}, x \in \mathbb{R}^d$ are two datapoints, $f(x)$ is a convex function defined on $\mathbb{R}^d$, based on the posterior probability distribution on the set of classifier weights learned from the linear classifier, wherein $\theta \in \mathbb{R}^d$ are the weights of the linear classifier, and $D(\tilde{x}, x) = E_{\theta \sim p(\theta)}\left(\theta^T(\tilde{x} - x)\right)^2 = \int \left(\theta^T(\tilde{x} - x)\right)^2 p(\theta)\,d\theta$ is the marginalized loss function given $p(\theta)$ as an expectation over $\theta$, which is approximated using: $D(\tilde{x}, x) = E_{\theta \sim \tilde{p}(\theta)}$ …
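
Claims 7 and 9 combine a logistic regression classifier trained on a bag of words with a Laplace approximation whose covariance is constrained to be diagonal. The sketch below is one hedged way to picture those two steps in plain NumPy. It makes several assumptions that are not in the claim text: a unit-strength l2 penalty acting as a Gaussian prior, plain gradient descent as the optimizer, a four-word toy vocabulary, and the common simplification of taking the reciprocal of the Hessian diagonal as the posterior variance. All function names are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, l2=1.0, lr=0.1, n_steps=2000):
    """Gradient descent on the l2-regularized logistic loss; labels y in {0, 1}."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_steps):
        p = sigmoid(X @ theta)
        # Gradient of sum_i logloss_i + (l2 / 2) * ||theta||^2.
        grad = X.T @ (p - y) + l2 * theta
        theta -= lr * grad / len(y)
    return theta


def diagonal_laplace(X, theta, l2=1.0):
    """Diagonal Gaussian N(theta, diag(var)) centred on the fitted weights.

    Assumption: var is the reciprocal of the Hessian diagonal of the negative
    log posterior, which is a simplification of inverting the full Hessian.
    """
    p = sigmoid(X @ theta)
    hess_diag = (X ** 2).T @ (p * (1.0 - p)) + l2
    return theta, 1.0 / hess_diag


# Toy bag-of-words counts over the vocabulary ["good", "great", "bad", "awful"].
X = np.array([
    [2, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 2, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 2, 0],
    [0, 0, 0, 1],
], dtype=float)
y = np.array([1, 1, 1, 0, 0, 0], dtype=float)  # 1 = positive, 0 = negative

theta_map = fit_logistic_regression(X, y)
mu, var = diagonal_laplace(X, theta_map)
accuracy = float(np.mean((sigmoid(X @ theta_map) > 0.5) == (y == 1)))
print(mu, var, accuracy)
```

In this toy run the weights for "good" and "great" come out positive and those for "bad" and "awful" negative, and every variance is at most 1 because the assumed prior alone contributes a curvature of 1. The resulting mean and variance vectors are the kind of quantities a marginalized loss over p(θ) would consume.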

Assignees

Inventors

Classifications

  • Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • using classification, e.g. of video objects · CPC title

  • based on the proximity to a decision surface, e.g. support vector machines · CPC title

  • G06N20/10 · Primary

    using kernel methods, e.g. support vector machines [SVM] · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11205103B2 cover?
A method of modelling data, comprising: training an objective function of a linear classifier, based on a set of labeled data, to derive a set of classifier weights; defining a posterior probability distribution on the set of classifier weights of the linear classifier; approximating a marginalized loss function for an autoencoder as a Bregman divergence, based on the posterior probability dist…
Who is the assignee on this patent?
Univ New York State Res Found, The Research Foundation For The State Univ
What technology area does this patent fall under?
Primary CPC classification G06N20/10. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 21, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).