Encrypting data for storage in a dispersed storage network
US-9380032-B2 · Jun 28, 2016 · US
US10318882B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10318882-B2 |
| Application number | US-201414484201-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 11, 2014 |
| Priority date | Sep 11, 2014 |
| Publication date | Jun 11, 2019 |
| Grant date | Jun 11, 2019 |
Official abstract text for this publication.
An indication of a data source to be used to train a linear prediction model is obtained. The model is to generate predictions using respective parameters assigned to a plurality of features derived from observation records of the data source. The parameter values are stored in a parameter vector. During a particular learning iteration of the training phase of the model, one or more features for which parameters are to be added to the parameter vector are identified. In response to a triggering condition, parameters for one or more features are removed from the parameter vector based on an analysis of relative contributions of the features represented in the parameter vector to the model's predictions. After the parameters are removed, at least one parameter is added to the parameter vector.
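The training flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. It assumes a sparse parameter vector held in a dict, squared-error stochastic gradient descent, a triggering condition based on the number of stored parameters, and absolute weight as a stand-in for a feature's relative contribution. The names `sgd_step`, `prune`, `train`, `max_population`, and `target_population` are hypothetical.

```python
import heapq


def sgd_step(weights, features, label, lr=0.1):
    """One SGD update for a squared-error linear model on a sparse record."""
    pred = sum(weights.get(f, 0.0) * v for f, v in features.items())
    err = pred - label
    for f, v in features.items():
        # A feature seen for the first time gets a new entry here,
        # which grows the parameter vector.
        weights[f] = weights.get(f, 0.0) - lr * err * v


def prune(weights, target_population):
    """Remove the smallest-|weight| entries until target_population remain."""
    k = max(len(weights) - target_population, 0)
    # Assumption: |weight| approximates a feature's contribution.
    victims = heapq.nsmallest(k, weights, key=lambda f: abs(weights[f]))
    for f in victims:
        del weights[f]
    return victims


def train(records, max_population=4, target_population=2):
    """Run one pass over the records, pruning whenever the vector grows too large."""
    weights = {}
    for features, label in records:
        sgd_step(weights, features, label)
        if len(weights) > max_population:  # hypothetical triggering condition
            prune(weights, target_population)
    return weights


records = [
    ({"a": 1.0, "b": 1.0}, 1.0),
    ({"c": 1.0, "d": 1.0, "e": 1.0}, 0.0),
    ({"a": 1.0, "f": 1.0}, 1.0),
]
model = train(records)
print(sorted(model))
```

Because pruning only deletes dictionary entries, a pruned feature that appears in a later record simply gets a fresh parameter, and new parameters continue to be added after a pruning pass.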
Opening claim text (preview).
What is claimed is:
1. A system, comprising: one or more computing devices configured to: receive, at a machine learning service of a provider network, an indication of a data source to be used for generating a linear prediction model, wherein, to generate a prediction, the linear prediction model is to utilize respective weights assigned to individual ones of a plurality of features derived from observation records of the data source, wherein the respective weights are stored in a parameter vector of the linear prediction model and updated in-memory during a machine training phase of the linear prediction model; determine, based at least in part on examination of a particular set of observation records of the data source, respective weights for one or more features to be added to the parameter vector during a particular learning iteration of a plurality of learning iterations of the training phase of the linear prediction model, wherein the addition increases memory consumption during the machine training phase; check, during one or more of the plurality of learning iterations, for a triggering condition to prune the parameter vector; in response to a determination that the triggering condition has been met during the training phase, identify one or more pruning victims from a set of features whose weights are included in the parameter vector, based at least in part on a quantile analysis of the weights, wherein the quantile analysis is performed without a sort operation; and remove at least a particular weight corresponding to a particular pruning victim of the one or more pruning victims from the parameter vector, wherein the removal reduces memory consumption during the training phase; and generate, during a post-training-phase prediction run of the linear prediction model, a prediction using at least one feature for which a weight is determined after the particular weight of the particular pruning victim is removed from the parameter vector.
2. The system as recited in claim 1, wherein the triggering condition is based at least in part on a population of the parameter vector.
3. The system as recited in claim 1, wherein the triggering condition is based at least in part on a goal indicated by a client.
4. The system as recited in claim 1, wherein the one or more computing devices are further configured to: during a subsequent learning iteration of the plurality of learning iterations, performed after the particular learning iteration, determine that a weight for the particular pruning victim is to be re-added to the parameter vector; and add the weight corresponding to the particular pruning victim to the parameter vector.
5. The system as recited in claim 1, wherein a first feature of the one or more features whose weights are to be added to the parameter vector during the particular learning iteration is derived from one or more variables of the observation records of the data source via a transformation that comprises a use of one or more of: (a) a quantile bin function, (b) a Cartesian product function, (c) a bi-gram function, (d) an n-gram function, (e) an orthogonal sparse bigram function, (f) a calendar function, (g) an image processing function, (h) an audio processing function, (i) a bio-informatics processing function, (j) a natural language processing function or (k) a video processing function.
6. A method, comprising: performing, by one or more computing devices: receiving an indication of a data source to be used for training a machine learning model, wherein, to generate a prediction, the machine learning model is to utilize respective parameters assigned to individual ones of a plurality of features derived from observation records of the data source, wherein the respective parameters are stored in a parameter vector of the machine learning model and updated in-memory during a training phase of the machine learning model; identifying one or more features for which respective parameters are to be added to the parameter vector during a particular learning iteration of a plurality of learning iterations of the training phase of the machine learning model, wherein the addition increases memory consumption during the training phase; checking, during one or more of the plurality of learning iterations, for a triggering condition to prune the parameter vector; in response to determining that the triggering condition has been met in the training phase, removing respective parameters of one or more pruning victim features from the parameter vector, wherein the removal reduces memory consumption during the training phase, and wherein the one or more pruning victim features are selected based at least in part on an analysis of relative contributions of features whose parameters are included in the parameter vector to predictions made using the machine learning model; and generating, during a post-training-phase prediction run of the machine learning model, a particular prediction using at least one feature for which a parameter is determined after the one or more pruning victim features are selected.
7. The method as recited in claim 6, wherein the analysis of relative contributions comprises a quantile analysis of weights included in the parameter vector.
8. The method as recited in claim 6, wherein the analysis of relative contributions (a) does not comprise a sort operation and (b) does not comprise copying values of the parameters included in the parameter vector.
9. The method as recited in claim 6, wherein said determining that the triggering condition has been met comprises determining that a population of the parameter vector exceeds a threshold.
10. The method as recited in claim 6, wherein the triggering condition is based at least in part on a resource capacity constraint of a server of a machine learning service.
11. The method as recited in claim 6, wherein the triggering condition is based at least in part on a goal indicated by a client.
12. The method as recited in claim 6, further comprising performing, by the one or more computing devices: during a subsequent learning iteration of the plurality of learning iterations, performed after the particular learning iteration, determining that a parameter for a particular feature which was previously selected as a pruning victim feature is to be re-added to the parameter vector; and adding the parameter for the particular feature to the parameter vector.
13. The method as recited in claim 6, wherein a first feature of the one or more features for which respective parameters are to be added to the parameter vector during the particular learning iteration is determined from one or more variables of observation records of the data source via a transformation that comprises a use of one or more of: (a) a quantile bin function, (b) a Cartesian product function, (c) a bi-gram function, (d) an n-gram function, (e) an orthogonal sparse bigram function, (f) a calendar function, (g) an image processing function, (h) an audio processing function, (i) a bio-informatics processing function, (j) a natural language processing function, or (k) a video processing function.
14. The method as recited in claim 6, further comprising performing, by the one or more computing devices: implementing a stochastic gradient descent technique to update, during the particular learning iteration, one or more previously-generated parameters included in the parameter vector.
15. The method as recited in claim 6, wherein the machine learning model comprises a generalized linear model.
16. The method as recited in claim 6, furth
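The claims call for a quantile analysis of the weights that avoids a sort operation and avoids copying parameter values, but this excerpt does not say how that is done. One possible approach, shown here purely as an assumption, is a fixed-size histogram over absolute weights: two passes over the parameter vector fill a small array of counts, and the quantile boundary is read off the cumulative counts. The function names, the `bins` parameter, and the choice of absolute weight as the ranking key are all hypothetical.

```python
def approx_quantile_boundary(weights, q, bins=64):
    """Estimate the |weight| value below which roughly a fraction q of entries fall.

    No sort and no copy of the parameter values: only a fixed-size
    list of bucket counts is allocated.
    """
    if not weights:
        return 0.0
    top = max(abs(w) for w in weights.values())  # pass 1: find the range
    if top == 0.0:
        return 0.0
    counts = [0] * bins
    for w in weights.values():                   # pass 2: fill the histogram
        idx = min(int(abs(w) / top * bins), bins - 1)
        counts[idx] += 1
    target = q * len(weights)
    seen = 0
    for idx, count in enumerate(counts):
        seen += count
        if seen >= target:
            return (idx + 1) * top / bins        # upper edge of this bucket
    return top


def prune_below(weights, boundary):
    """Delete every entry whose |weight| is strictly below the boundary."""
    victims = [f for f, w in weights.items() if abs(w) < boundary]
    for f in victims:
        del weights[f]
    return victims


params = {"a": 0.01, "b": -0.02, "c": 0.5, "d": -1.0, "e": 0.03}
boundary = approx_quantile_boundary(params, q=0.5, bins=10)
removed = prune_below(params, boundary)
print(boundary, sorted(removed), sorted(params))
```

The boundary is approximate: its resolution is one bucket width, so more entries than the requested fraction can be removed when many weights share a bucket. Raising `bins` tightens the estimate at the cost of a larger count array.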
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
based on the proximity to a decision surface, e.g. support vector machines · CPC title
Physics · mapped topic
in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46) · CPC title