Tomography and generative data modeling via quantum boltzmann training
US-2018165601-A1 · Jun 14, 2018 · US
US11042811B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11042811-B2 |
| Application number | US-201715725600-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 5, 2017 |
| Priority date | Oct 5, 2016 |
| Publication date | Jun 22, 2021 |
| Grant date | Jun 22, 2021 |
Official abstract text for this publication.
A computational system can include digital circuitry and analog circuitry, for instance a digital processor and a quantum processor. The quantum processor can operate as a sample generator providing samples. Samples can be employed by the digital processing in implementing various machine learning techniques. For example, the computational system can perform unsupervised learning over an input space, for example via a discrete variational auto-encoder, and attempting to maximize the log-likelihood of an observed dataset. Maximizing the log-likelihood of the observed dataset can include generating a hierarchical approximating posterior. Unsupervised learning can include generating samples of a prior distribution using the quantum processor. Generating samples using the quantum processor can include forming chains of qubits and representing discrete variables by chains.
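The chain-based representation mentioned in the abstract can be illustrated with a small classical calculation. The sketch below assumes, purely for illustration, that samples follow a classical Boltzmann distribution over an open ferromagnetic Ising chain with no local fields. The function names, the `coupling` and `beta` parameters, and the threshold value are hypothetical and do not come from the patent. It enumerates every spin configuration exactly and reports how often the first two qubits of a chain take the same value.

```python
import itertools
import math


def chain_energy(spins, coupling):
    """Ising energy of an open chain: -coupling * sum of neighbouring spin products."""
    return -coupling * sum(a * b for a, b in zip(spins, spins[1:]))


def adjacent_agreement_probability(chain_length, coupling, beta=1.0):
    """Exact Boltzmann probability that the first two spins in the chain agree.

    Assumption: a classical Boltzmann distribution stands in for the
    samples a quantum processor would return.
    """
    if chain_length < 2:
        raise ValueError("a chain needs at least two qubits")
    total = 0.0
    agree = 0.0
    for spins in itertools.product((-1, 1), repeat=chain_length):
        weight = math.exp(-beta * chain_energy(spins, coupling))
        total += weight
        if spins[0] == spins[1]:
            agree += weight
    return agree / total


def agreement_exceeds_threshold(chain_length, coupling, threshold, beta=1.0):
    """Check whether neighbouring qubits agree more often than a chosen threshold."""
    return adjacent_agreement_probability(chain_length, coupling, beta) > threshold


if __name__ == "__main__":
    # Hypothetical values: a 4-qubit chain and a 0.95 agreement threshold.
    for coupling in (0.5, 1.0, 2.0):
        p = adjacent_agreement_probability(4, coupling)
        print(coupling, round(p, 4), agreement_exceeds_threshold(4, coupling, 0.95))
```

Under these assumptions the agreement probability for a field-free open chain reduces to 1 / (1 + exp(-2 * beta * coupling)), so a stronger coupling makes the qubits of a chain more likely to act as a single discrete variable. Real hardware samples would deviate from this idealized model.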
Opening claim text (preview).
The invention claimed is:

1. A method for unsupervised learning over an input space comprising discrete or continuous variables, and at least a subset of a training dataset of samples of the respective variables, to attempt to identify a value of at least one parameter that increases a log-likelihood of at least the subset of the training dataset with respect to a model, the model expressible as a function of the at least one parameter, the method executed by circuitry including at least one processor, the method comprising;
forming a first latent space comprising a plurality of random variables, the plurality of random variables comprising one or more discrete random variables;
forming a second latent space comprising the first latent space and a set of supplementary continuous random variables;
forming a first transforming distribution comprising a conditional distribution over the set of supplementary continuous random variables, conditioned on the one or more discrete random variables of the first latent space;
forming an encoding distribution comprising an approximating posterior distribution over the first latent space, conditioned on the input space;
forming a prior distribution over the first latent space;
forming a decoding distribution comprising a conditional distribution over the input space conditioned on the set of supplementary continuous random variables;
determining an ordered set of conditional cumulative distribution functions of the supplementary continuous random variables, each cumulative distribution function comprising functions of a full distribution of at least one of the one or more discrete random variables of the first latent space;
determining an inversion of the ordered set of conditional cumulative distribution functions of the supplementary continuous random variables;
constructing a first stochastic approximation to a lower bound on the log-likelihood of the at least a subset of a training dataset;
constructing a second stochastic approximation to a gradient of the lower bound on the log-likelihood of at least the subset of the training dataset; and
increasing the lower bound on the log-likelihood of at least the subset of the training dataset based at least in part on the gradient of the lower bound on the log-likelihood of at least the subset of the training dataset,
wherein constructing a second stochastic approximation to a gradient of the lower bound includes approximating a gradient of at least a first part of the first stochastic approximation with respect to one or more parameters of the prior distribution over the first latent space using samples from the prior distribution,
wherein approximating the gradient of at least a first part of the first stochastic approximation with respect to one or more parameters of the prior distribution over the first latent space using samples from the prior distribution includes at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor comprising a plurality of qubits and a plurality of coupling devices providing communicative coupling between respective pairs of qubits,
wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor includes:
forming one or more chains, each chain comprising a respective subset of the plurality of qubits; and
representing at least one of the one or more discrete random variables of the first latent space by a respective chain.

2. The method of claim 1 wherein forming one or more chains includes initiating a coupling strength of at least one coupling device, the coupling device selected to induce a correlation between a respective pair of qubits.

3. The method of claim 1 further comprising: determining an approximating posterior distribution over each of the first latent space and the second latent space, wherein the approximating posterior distribution is a hierarchical approximating posterior distribution comprising a plurality of levels of the hierarchy, and wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor further includes assigning a first qubit of each chain to a respective first level of the hierarchy.

4. The method of claim 3 wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor further includes assigning a second qubit of each chain to a respective second level of the hierarchy, wherein the second qubit of each chain is successively adjacent in the respective chain to the first qubit of the respective chain, and the likelihood of the second qubit of each chain for a given sample having the same value as the first qubit of the respective chain exceeds a predetermined threshold.

5. The method of claim 4 wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor further includes assigning a third qubit of each chain to a respective third level of the hierarchy, wherein the third qubit of each chain is successively adjacent in the respective chain to the second qubit of the respective chain, and the likelihood of the third qubit of each chain for a given sample having the same value as the second qubit of the respective chain exceeds the predetermined threshold.

6. The method of claim 4 wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor further includes assigning a third qubit of each chain to a respective third level of the hierarchy, wherein the third qubit of each chain is successively adjacent in the respective chain to the second qubit of the respective chain, and the likelihood of the third qubit of each chain for a given sample having the same value as the first qubit of the respective chain exceeds the predetermined threshold.

7. The method of claim 3 wherein the first qubit of each chain is at one end of the chain.

8. The method of claim 3 wherein the first qubit of each chain is in the interior of the chain, and wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor further includes assigning a second and a third qubit of each chain to a respective second level of the hierarchy, wherein the second and third qubits are both successively adjacent in the respective chain to the respective first qubit.

9. The method of claim 1 further comprising: determining an approximating posterior distribution over each of the first latent space and the second latent space, wherein the approximating posterior distribution is a hierarchical approximating posterior distribution comprising a plurality of levels of the hierarchy, wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor further includes assigning a single qubit of each chain to a respective level of the hierarchy.

10. The method of claim 1 further comprising: determining an approximating posterior distribution over each of the first latent space and the second latent space, wherein the approximating posterior distribution is a hierarchical approximating posterior distribution comprising a plurality of levels of the hierarchy, wherein at least one of generating a plurality of samples or causing a plurality of samples to be generated by a quantum processor further includes assigning a single qubit of each chain to a single level of the hierarchy.

11. The method of claim 1 further comprising: determining an approximating poster
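
The inverse-CDF step recited in claim 1 can be sketched for a single discrete variable. The claim text shown here does not specify a transforming distribution, so the example assumes a "spike-and-exponential" form chosen only for illustration: a point mass at 0 when the discrete variable is 0, and a density proportional to exp(beta * zeta) on [0, 1] when it is 1. The function names, `q_one`, and `beta` are hypothetical.

```python
import math


def marginal_cdf(zeta, q_one, beta=3.0):
    """CDF of the continuous variable zeta after averaging over the discrete variable.

    q_one is the approximating-posterior probability that the discrete
    variable equals 1. Assumed (not from the patent) conditional densities:
      z = 0: point mass at zeta = 0
      z = 1: beta * exp(beta * zeta) / (exp(beta) - 1) on [0, 1]
    """
    return (1.0 - q_one) + q_one * math.expm1(beta * zeta) / math.expm1(beta)


def inverse_marginal_cdf(rho, q_one, beta=3.0):
    """Map uniform noise rho in [0, 1] to zeta in [0, 1] by inverting marginal_cdf."""
    if rho <= 1.0 - q_one:
        # The noise falls inside the point mass contributed by z = 0.
        return 0.0
    scaled = (rho - (1.0 - q_one)) / q_one
    return math.log(scaled * math.expm1(beta) + 1.0) / beta


if __name__ == "__main__":
    # Hypothetical values: the same noise draw under two posterior probabilities.
    for q_one in (0.2, 0.8):
        print(q_one, inverse_marginal_cdf(0.9, q_one))
```

Because the inverse depends on the full probability `q_one` rather than on a sampled binary value, the output varies smoothly with `q_one` whenever `rho` exceeds `1 - q_one`. Under the stated assumptions, that smoothness is what allows a gradient to pass from the continuous variable back to the parameters of the approximating posterior.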
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Probabilistic or stochastic networks · CPC title
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title