Stochastic categorical autoencoder network

US10679129B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10679129-B2
Application number: US-201816124977-A
Country: US
Kind code: B2
Filing date: Sep 7, 2018
Priority date: Sep 28, 2017
Publication date: Jun 9, 2020
Grant date: Jun 9, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Computer systems and methods generate a stochastic categorical autoencoder learning network (SCAN). The SCAN is trained to have an encoder network that outputs, subject to one or more constraints, parameters for parametric probability distributions of sample random variables from input data. The parameters comprise measures of central tendency and measures of dispersion. The one or more constraints comprise a first constraint that constrains a measure of a magnitude of a vector of the measures of central tendency as compared to a measure of a magnitude of a vector of the measures of dispersion. Thereafter, the sample random variables are generated from the parameters and a decoder is trained to output the input data from the sample random variables.
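The sampling step in the abstract ("the sample random variables are generated from the parameters") can be illustrated with a minimal sketch. It assumes independent Gaussian distributions with means and standard deviations as the parameters, which is one of the options the claims list. The function name `sample_latents` and the example values are hypothetical and do not come from the patent; encoder and decoder training are not shown.

```python
import random


def sample_latents(means, stds, rng):
    """Draw one sample per latent variable from independent Gaussians.

    `means` and `stds` stand in for the encoder's outputs (measures of
    central tendency and dispersion). This is an illustrative sketch,
    not the patented implementation.
    """
    if len(means) != len(stds):
        raise ValueError("means and stds must have the same length")
    if any(s < 0 for s in stds):
        raise ValueError("standard deviations must be non-negative")
    # z_i = mean_i + std_i * eps_i, with eps_i drawn from N(0, 1)
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(means, stds)]


rng = random.Random(0)          # seeded for repeatability
means = [0.5, -1.0, 2.0]        # hypothetical encoder outputs
stds = [1.0, 1.0, 1.0]          # hypothetical encoder outputs
z = sample_latents(means, stds, rng)
print(len(z))                   # one sample per latent variable
```

In a full system the list `z` would be passed to the decoder, which is trained to reproduce the original input from it.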

First claim

Opening claim text (preview).

What is claimed is:

1. A computer system for generating a stochastic categorical autoencoder network, the computer system comprising: a set of one or more processor cores; and computer memory in communication with the set of processor cores, wherein the computer memory stores software that when executed by the set of processor cores, causes the set of processor cores to train the stochastic categorical autoencoder network by performing steps that comprise: training an encoder network to output, subject to one or more constraints, parameters for parametric probability distributions of sample random variables from input data, wherein: the parameters comprise measures of central tendency and measures of dispersion; latent variables for the parametric probability distributions are unregularized such that the measures of central tendency tend to grow larger in magnitude relative to the measures of dispersion, subject to the one or more constraints; and the one or more constraints comprise a first constraint that constrains a measure of a magnitude of a vector of the measures of central tendency such that the measure of the magnitude of the vector of the measures of central tendency cannot grow arbitrarily large relative to a measure of a magnitude of a vector of the measures of dispersion; generating the sample random variables from the parameters; and training a decoder to output the input data from the sample random variables.

2. The computer system of claim 1, wherein: the encoder comprises a neural network; and the decoder comprises a neural network.

3. The computer system of claim 2, wherein the first constraint is that the measure of the magnitude of the vector of the measures of central tendency must be less than or equal to a first threshold value and the measure of the magnitude of the vector the measures of dispersion must be greater than or equal to a second threshold value.

4. The computer system of claim 3, wherein: the measures of central tendency comprise means; and the measures of dispersion comprise standard deviations.

5. The computer system of claim 3, wherein the first threshold value is equal to the second threshold value.

6. The computer system of claim 2, wherein: the measure of the magnitude of the vector the measure of dispersion is a pre-specified value; and the encoder is trained to generate the measures of central tendency based on the pre-specified value for the magnitude of the vector the measure of dispersion.

7. The computer system of claim 2, wherein: the measure of the magnitude of the vector of the measures of central tendency comprises a norm measure; and the measure of the magnitude of the vector of the measures of dispersion comprises a norm measure.

8. The computer system of claim 7, wherein: the measures of central tendency comprise means; and the measures of dispersion comprise standard deviations.

9. The computer system of claim 7, wherein the first constraint is that the measure of the magnitude of the vector of the measures of central tendency must be less than or equal to a first threshold value and the measure of the magnitude of the vector the measures of dispersion must be greater than or equal to a second threshold value.

10. The computer system of claim 7, wherein the norm measure for the measure of the magnitude of the vector of the measures of central tendency is different from the norm measure for the measure of the magnitude of the vector of the measures of dispersion.

11. The computer system of claim 7, wherein each of the norm measures for the measure of the magnitude of the vector of the measure of central tendency and the measure of the magnitude of the vector of the measures of dispersion comprises a norm measure selected from the group consisting of a sup norm, a L1 norm and a L2 norm.

12. The computer system of claim 2, wherein the measures of central tendency comprise means.

13. The computer system of claim 12, wherein the measures of dispersion comprise standard deviations.

14. The computer system of claim 1, wherein the probability distributions comprise independent Gaussian probability distributions.

15. The computer system of claim 1, wherein the probability distributions comprise Bernoulli distributions.

16. The computer system of claim 1, wherein the probability distributions comprise Poisson distributions.

17. The computer of claim 1, wherein the probability distributions comprise uniform distributions.

18. The computer system of claim 2, wherein the computer memory stores software that when executed by the set of processor cores, further causes the set of processor cores to augment a selected set of data by training, at least once, the stochastic categorical autoencoder network with the selected set of data to produce the augmented data, wherein training the stochastic categorical autoencoder network comprises training the stochastic categorical autoencoder network with a number of hyperparameters, including: a first hyperparameter that controls soft-tying nodes in the stochastic categorical autoencoder network; and a second hyperparameter that controls influence weights for data examples in the selected set of data.

19. The computer system of claim 18, wherein the computer memory stores software that when executed by the set of processor cores, causes the set of processor cores to augment the selected set of data by repetitively training the stochastic categorical autoencoder network with selected set of data, with each repetitive training after a first training using at least one different hyperparameter than an immediately prior training.

20. The computer system of claim 2, wherein the computer memory stores software that when executed by the set of processor cores, causes the set of processor cores to: implement a degradation regression system that is trained to estimate an amount of degradation in a pattern that is due to noise; and implement a denoising system that is trained to remove noise in the output of the decoder, wherein training the stochastic categorical autoencoder network comprises back-propagation through the degradation regression system and through the denoising system.

21. A method for generating a stochastic categorical autoencoder network, the method comprising training, with a computer system comprising one or more processor cores, the stochastic categorical autoencoder network, wherein training the stochastic categorical autoencoder network comprises: training an encoder network to output, subject to one or more constraints, parameters for parametric probability distributions of sample random variables from input data, wherein: the parameters comprise measures of central tendency and measures of dispersion; latent variables for the parametric probability distributions are unregularized such that the measures of central tendency tend to grow larger in magnitude relative to the measures of dispersion, subject to the one or more constraints; and the one or more constraints comprise a first constraint that constrains a measure of a magnitude of a vector of the measures of central tendency such that the measure of the magnitude of the vector of the measures of central tendency cannot grow arbitrarily large relative to a measure of a magnitude of a vector of the measures of dispersion; generating the sample random variables from the parameters; and training a decoder to output the input data from the sample random variables.

22. The
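Claims 3 and 11 describe the first constraint in concrete terms: a norm of the central-tendency vector is kept at or below a first threshold, and a norm of the dispersion vector is kept at or above a second threshold, with each norm chosen from the sup, L1, and L2 norms. The sketch below shows one way such a constraint could be checked and enforced after the encoder produces its outputs. Enforcing it by rescaling is an assumption made for illustration; the claim text shown here does not say how the constraint is imposed during training. The names `vector_norm` and `enforce_first_constraint` are hypothetical.

```python
import math


def vector_norm(values, kind="l2"):
    """Return the sup, L1, or L2 norm of a list of floats."""
    if kind == "sup":
        return max(abs(v) for v in values)
    if kind == "l1":
        return sum(abs(v) for v in values)
    if kind == "l2":
        return math.sqrt(sum(v * v for v in values))
    raise ValueError(f"unknown norm kind: {kind!r}")


def enforce_first_constraint(means, stds, max_mean_norm, min_std_norm,
                             mean_kind="l2", std_kind="l2"):
    """Rescale so norm(means) <= max_mean_norm and norm(stds) >= min_std_norm.

    Rescaling is an illustrative assumption, not the patent's stated method.
    It works for all three norms because each is positively homogeneous.
    """
    mean_norm = vector_norm(means, mean_kind)
    if mean_norm > max_mean_norm:
        scale = max_mean_norm / mean_norm      # shrink means onto the bound
        means = [m * scale for m in means]
    std_norm = vector_norm(stds, std_kind)
    if std_norm < min_std_norm:
        if std_norm == 0.0:
            raise ValueError("cannot rescale an all-zero dispersion vector")
        scale = min_std_norm / std_norm        # grow stds up to the bound
        stds = [s * scale for s in stds]
    return list(means), list(stds)


# Hypothetical values; equal thresholds correspond to claim 5.
m, s = enforce_first_constraint([3.0, 4.0], [0.1, 0.1], 1.0, 1.0)
print(vector_norm(m), vector_norm(s))
```

Passing different values for `mean_kind` and `std_kind` corresponds to claim 10, where the two vectors are measured with different norms.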

Assignees

D5Ai Llc

Inventors

Classifications

  • Physics · mapped topic

  • Backpropagation, e.g. using gradient descent · CPC title

  • G06N3/088 · Primary

    Non-supervised learning, e.g. competitive learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10679129B2 cover?
Computer systems and methods generate a stochastic categorical autoencoder learning network (SCAN). The SCAN is trained to have an encoder network that outputs, subject to one or more constraints, parameters for parametric probability distributions of sample random variables from input data. The parameters comprise measures of central tendency and measures of dispersion. The one or more constra…
Who is the assignee on this patent?
D5Ai Llc
What technology area does this patent fall under?
Primary CPC classification G06N3/088. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 9, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).