Iterative training of a machine learning model
US-2021326749-A1 · Oct 21, 2021 · US
US11816183B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11816183-B2 |
| Application number | US-202017119989-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 11, 2020 |
| Priority date | Dec 11, 2020 |
| Publication date | Nov 14, 2023 |
| Grant date | Nov 14, 2023 |
Official abstract text for this publication.
Methods and systems for mining minority-class data samples are described. A minority-class mining service receives activations generated by an inner-layer of a client neural network that has been trained to perform a prediction task that involves classification. The minority-class mining service generates a recalibrated activation using a recalibration neural network, and generates an anomaly detector output using an anomaly detector. From the anomaly detector output, a minority-class score is computed for the data sample represented by a received activation. The computed minority-class score is compared against a minority-class threshold to identify a candidate minority-class data sample. The candidate minority-class data sample can then be labeled and added to the training dataset for the client neural network.
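The scoring path summarized in the abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the `recalibrate` and `reconstruct` callables are hypothetical stand-ins for the trained recalibration neural network and a trained autoencoder used as the anomaly detector, the score is the mean square error with an optional softmax (as in claims 7 and 8), and the rule that a score above the threshold marks a candidate is an assumption, since the text only says the score is compared against the threshold.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mean_square_error(a, b):
    """Mean of squared element-wise differences between two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("activations must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def minority_class_score(activation, recalibrate, reconstruct, use_softmax=True):
    """Score one received activation; a larger value means a poorer reconstruction.

    `recalibrate` and `reconstruct` are hypothetical callables standing in for
    the trained recalibration network and the trained autoencoder.
    """
    recalibrated = recalibrate(activation)
    reconstructed = reconstruct(recalibrated)
    if use_softmax:
        return mean_square_error(softmax(activation), softmax(reconstructed))
    return mean_square_error(activation, reconstructed)


def is_candidate(score, threshold):
    """Assumed decision rule: flag the sample when the score exceeds the threshold."""
    return score > threshold
```

For example, with an identity function for both stand-ins the reconstruction is perfect and the score is zero, so the sample would not be flagged for any non-negative threshold.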
Opening claim text (preview).
The invention claimed is:

1. A method for identifying a candidate minority-class data sample, the method comprising: receiving an activation comprising values of an inner-layer activation representing a given data sample, the received activation being generated by a client neural network that has been trained to perform a classification; forward propagating the received activation through a trained recalibration neural network, to generate a recalibrated activation, wherein the trained recalibration neural network has been trained to perform the classification in a manner to avoid overtraining; forward propagating the recalibrated activation through a trained anomaly detector, wherein the trained anomaly detector has been trained on activations in which majority-class data samples form a majority; computing a minority-class score for the received activation, based on an anomaly detector output; identifying the given data sample as a candidate minority-class data sample, based on a comparison of the minority-class score against a minority-class threshold; and communicating an identification of the given data sample as the candidate minority-class data sample.

2. The method of claim 1, wherein there is a plurality of received activations representing the given data sample, and for each respective received activation of the plurality of received activations the method comprises: forward propagating the respective received activation through the trained recalibration neural network, to generate a respective recalibrated activation; forward propagating the respective recalibrated activation through the trained anomaly detector, to generate a respective anomaly detector output; and computing a respective minority-class score for the respective received activation, based on the respective anomaly detector output; the method further comprising: filtering and aggregating the respective minority-class scores computed for the plurality of received activations to obtain a single minority-class score to be used in the comparison against the minority-class threshold.

3. The method of claim 1, wherein the trained anomaly detector is a trained autoencoder that has been trained to output a reconstructed activation as the anomaly detector output, and wherein the minority-class score is computed based on a quality of the reconstructed activation.

4. The method of claim 1, wherein the received activation is received from a client computing system, and wherein the minority-class threshold is received from the client computing system.

5. The method of claim 1, wherein the identification of the given data sample as the candidate minority-class data sample is communicated to the client computing system.

6. The method of claim 1, wherein the identification of the given data sample as the candidate minority-class data sample is communicated to a labeling service.

7. The method of claim 1, wherein the trained anomaly detector is a trained autoencoder that has been trained to perform a reconstruction task, wherein the anomaly detector output is a reconstructed activation, and wherein computing the minority-class score comprises: computing a mean square error between the received activation and the reconstructed activation, wherein the computed mean square error is used as the minority-class score.

8. The method of claim 7, wherein a softmax function is applied to the received activation and to the reconstructed activation, prior to computing the mean square error.

9. The method of claim 1, further comprising training the recalibration neural network and the anomaly detector, wherein the anomaly detector is an autoencoder, by: receiving a set of inner-layer activations generated by the client neural network, and a set of corresponding class labels, each class label being associated with a respective inner-layer activation; training the recalibration neural network using a subset of training activations, from the set of inner-layer activations, by: for each training activation, forward propagating the training activation through the recalibration neural network to generate a predicted class label; computing a focal loss using the predicted class label, the corresponding class label associated with the training activation, and a focal loss function; and updating weights of the recalibration neural network by backpropagating the computed focal loss; training the autoencoder using a set of recalibrated training activations generated by the recalibration neural network from the subset of training activations, by: for each recalibrated training activation, forward propagating the recalibrated training activation through the autoencoder to generate a reconstructed training activation; computing a reconstruction loss using the reconstructed training activation, the recalibrated training activation, and a reconstruction loss function; and updating weights of the autoencoder by backpropagating the computed reconstruction loss.

10. The method of claim 9, wherein training of the recalibration neural network is performed for a reduced number of epochs compared to training of the client neural network.

11. The method of claim 9, further comprising computing the minority-class threshold by: forward propagating a subset of validation activations, from the set of inner-layer activations, through the trained recalibration neural network and the trained autoencoder to obtain a set of reconstructed validation activations; computing a set of minority-class scores based on quality of reconstruction of the set of reconstructed validation activations; pairing each minority-class score with a corresponding class label; and identifying, from the pairings, a numerical value for the minority-class threshold representing a boundary between the minority-class score for a minority-class data sample and the minority-class score for a majority-class data sample.

12. The method of claim 11, wherein the computed minority-class threshold is communicated to a client computing system.

13. A computing system for identifying a candidate minority-class data sample, the computing system comprising: a processing device configured to execute instructions to cause the computing system to: receive an activation comprising values of an inner-layer activation representing a given data sample, the received activation being generated by a client neural network that has been trained to perform a classification; forward propagate the received activation through a trained recalibration neural network, to generate a recalibrated activation, wherein the trained recalibration neural network has been trained to perform the classification in a manner to avoid overtraining; forward propagate the recalibrated activation through a trained anomaly detector, wherein the trained anomaly detector has been trained on activations in which majority-class data samples form a majority; compute a minority-class score for the received activation, based on an anomaly detector output; identify the given data sample as a candidate minority-class data sample, based on a comparison of the minority-class score against a minority-class threshold; and communicate an identification of the given data sample as the candidate minority-class data sample.

14. The computing system of claim 13, wherein there is a plurality of received activations representing the given data sample, and the instructions cause the computing system to, for each respective received activation of the plurality of received activations: forward propagate the respective received activation through the trained recali
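Two of the steps recited in the method claims lend themselves to a short illustration: the focal loss used to train the recalibration network (claim 9) and the selection of a threshold from scored, labeled validation activations (claim 11). The sketch below is written under stated assumptions and is not taken from the patent: the claims do not fix a form for the focal loss, so the commonly used form FL = -(1 - p_t)^gamma * log(p_t) is assumed, and the threshold is chosen as the midpoint between adjacent distinct scores that maximizes balanced accuracy, which is only one possible reading of "a boundary" between minority-class and majority-class scores. The function names and the `gamma` default are hypothetical.

```python
import math


def focal_loss(probs, label, gamma=2.0):
    """Assumed focal loss form: -(1 - p_t)^gamma * log(p_t).

    `probs` is a list of predicted class probabilities and `label` is the
    index of the true class. With gamma = 0 this reduces to cross-entropy.
    """
    p_t = probs[label]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def pick_threshold(scores, is_minority):
    """Pick a boundary between minority-class and majority-class scores.

    Tries each midpoint between adjacent distinct scores and returns the one
    with the best balanced accuracy, assuming higher scores indicate
    minority-class samples. This selection rule is an illustrative assumption.
    """
    n_min = sum(1 for m in is_minority if m)
    n_maj = len(is_minority) - n_min
    if n_min == 0 or n_maj == 0:
        raise ValueError("both classes must be present in the validation set")
    ordered = sorted(set(scores))
    if len(ordered) < 2:
        raise ValueError("at least two distinct scores are required")
    best_t, best_acc = None, -1.0
    for lo, hi in zip(ordered, ordered[1:]):
        t = (lo + hi) / 2.0
        # True positives: minority samples scored above the candidate threshold.
        tp = sum(1 for s, m in zip(scores, is_minority) if m and s > t)
        # True negatives: majority samples scored at or below it.
        tn = sum(1 for s, m in zip(scores, is_minority) if not m and s <= t)
        acc = 0.5 * (tp / n_min + tn / n_maj)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t
```

With well-separated validation scores, the returned value falls in the gap between the highest majority-class score and the lowest minority-class score; with overlapping scores it is a compromise, and a different criterion (for example, a fixed false-positive rate) could reasonably be substituted.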
Feedforward networks · CPC title
Supervised learning · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
characterised by the incorporation of unlabelled data, e.g. multiple instance learning [MIL], semi-supervised techniques using expectation-maximisation [EM] or naïve labelling · CPC title