Systems and methods for improved adversarial training of machine-learned models
US-11494667-B2 · Nov 8, 2022 · US
US12524677B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12524677-B2 |
| Application number | US-202117157077-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 25, 2021 |
| Priority date | Jan 25, 2021 |
| Publication date | Jan 13, 2026 |
| Grant date | Jan 13, 2026 |
Official abstract text for this publication.
A trained machine learning model and a training dataset used to train the trained machine learning model can be received. Based on the training dataset, unsupervised adversarial examples can be generated. Robustness of the trained machine learning model can be determined using the generated unsupervised adversarial examples. The training dataset can be augmented with the generated unsupervised adversarial examples. The trained machine learning model can be retrained using the augmented training dataset.
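The five steps in the abstract (receive, generate, measure robustness, augment, retrain) can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the patented method: the "model" is a toy centroid, `perturb` is a placeholder Gaussian-noise generator, and the `robustness` metric (mean absolute loss change) is a hypothetical choice, since the abstract does not define one. All function names are invented for this sketch.

```python
import random


def train_centroid(data):
    """Toy unsupervised 'model': the per-dimension mean of the data (assumption)."""
    n = len(data)
    return [sum(x[i] for x in data) / n for i in range(len(data[0]))]


def loss(model, x):
    """Reconstruction-style loss: squared distance from x to the centroid."""
    return sum((a - b) ** 2 for a, b in zip(x, model))


def perturb(x, rng, scale=0.5):
    """Placeholder generator: a training sample plus Gaussian noise.

    The patent's generator is more specific; this only stands in for it.
    """
    return [a + rng.gauss(0.0, scale) for a in x]


def robustness(model, originals, adversarials):
    """Hypothetical metric: mean absolute loss change between each original
    and its perturbed counterpart (smaller means less sensitive)."""
    gaps = [abs(loss(model, a) - loss(model, o))
            for o, a in zip(originals, adversarials)]
    return sum(gaps) / len(gaps)


def adversarial_retrain(model, data, rng):
    """Generate examples, score robustness, augment the data, retrain."""
    adversarials = [perturb(x, rng) for x in data]
    score = robustness(model, data, adversarials)
    augmented = data + adversarials          # original samples come first
    return train_centroid(augmented), augmented, score


if __name__ == "__main__":
    rng = random.Random(0)
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    model = train_centroid(data)
    retrained, augmented, score = adversarial_retrain(model, data, rng)
    print(len(augmented), score, retrained)
```

Passing the random generator in explicitly keeps the run reproducible; a real system would swap `train_centroid`, `loss`, and `perturb` for the actual model, its loss function, and the adversarial generator.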
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: receiving a trained machine learning model and a training dataset used to train the trained machine learning model; based on the training dataset and a loss function of the trained machine learning model, generating images representing adversarial examples, the adversarial examples being perturbed samples of the training dataset, the generating creating an adversarial example which is least similar to an original sample in the training dataset, and also satisfying an adversarial criterion that a loss associated with the adversarial example is less than that of the original sample in the training dataset; determining robustness of the trained machine learning model using the generated adversarial examples; augmenting the training dataset with the generated images representing adversarial examples; retraining the trained machine learning model using the augmented training dataset; performing by the retrained machine learning model, image classification; and displaying the original sample, a generated image representing the adversarial example, and a reconstructed image reconstructed using the retrained machine learning model.
2. The method of claim 1, wherein the adversarial example is randomly sampled.
3. The method of claim 1, wherein the adversarial example is sampled using an output of a convolutional layer of the trained machine learning model.
4. The method of claim 1, wherein the generating of the adversarial examples includes solving a minmax algorithm which finds the adversarial example that has minimum training loss and least similarity to the original sample.
5. The method of claim 1, wherein the trained machine learning model includes a neural network model.
6. The method of claim 1, wherein the trained machine learning model includes an autoencoder.
7. The method of claim 1, wherein the trained machine learning model includes a representation learning model.
8. The method of claim 1, wherein the trained machine learning model includes a contrastive learning model.
9. The method of claim 1, wherein the trained machine learning model includes an unsupervised machine learning model.
10. A computer program product comprising a non-transitory computer readable storage medium having program instructions embodied therewith, the program instructions readable by a device to cause the device to: receive a trained machine learning model and a training dataset used to train the trained machine learning model; based on the training dataset and a loss function of the trained machine learning model, generate images representing adversarial examples, the adversarial examples being perturbed samples of the training dataset, the generating creating an adversarial example which is least similar to an original sample in the training dataset, and also satisfying an adversarial criterion that a loss associated with the adversarial example is less than that of the original sample in the training dataset; augment the training dataset with the generated images representing adversarial examples; retrain the trained machine learning model using the augmented training dataset; perform by the retrained machine learning model, image classification; and display the original sample, a generated image representing the adversarial example, and a reconstructed image reconstructed using the retrained machine learning model.
11. The computer program product of claim 10, wherein the device is caused to determine robustness of the trained machine learning model using the generated adversarial examples.
12. The computer program product of claim 10, wherein the adversarial sample is randomly sampled.
13. The computer program product of claim 10, wherein the adversarial sample is sampled using an output of a convolutional layer of the trained machine learning model.
14. The computer program product of claim 10, wherein the generating adversarial examples includes solving a minmax algorithm which finds the adversarial example that has minimum training loss and least similarity to the original sample.
15. A system comprising: a hardware processor; and a memory device coupled with the hardware processor; the hardware processor configured to: receive a trained machine learning model and a training dataset used to train the trained machine learning model; based on the training dataset and a loss function of the trained machine learning model, generate images representing adversarial examples, the adversarial examples being perturbed samples of the training dataset, the generating creating an adversarial example which is least similar to an original sample in the training dataset, and also satisfying an adversarial criterion that a loss associated with the adversarial example is less than that of the original sample in the training dataset; determine robustness of the trained machine learning model using the generated adversarial examples; augment the training dataset with the generated images representing adversarial examples; retrain the trained machine learning model using the augmented training dataset; perform by the retrained machine learning model, image classification; and display the original sample, a generated image representing the adversarial example, and a reconstructed image reconstructed using the retrained machine learning model.
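The selection rule recited in the independent claims (an example least similar to the original sample whose loss is less than the original's) can be illustrated with a simple random search. This is a hedged sketch, not the minmax solver of claims 4 and 14: the centroid-based `loss`, the Euclidean `distance` as the similarity measure, and the function `find_adversarial` with its `trials` and `scale` parameters are all assumptions made for illustration.

```python
import math
import random


def loss(model, x):
    """Toy loss (assumption): squared distance from x to a centroid 'model'."""
    return sum((a - b) ** 2 for a, b in zip(x, model))


def distance(a, b):
    """Euclidean distance, used here as the (dis)similarity measure."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def find_adversarial(model, x, rng, trials=200, scale=1.0):
    """Random search for a perturbed sample that is farthest from x while
    its loss stays strictly below the loss of x.

    Returns None if no sampled candidate satisfies the loss criterion.
    """
    base = loss(model, x)
    best, best_dist = None, 0.0
    for _ in range(trials):
        candidate = [a + rng.gauss(0.0, scale) for a in x]
        if loss(model, candidate) < base:        # adversarial criterion
            d = distance(candidate, x)
            if d > best_dist:                    # keep the least similar one
                best, best_dist = candidate, d
    return best


if __name__ == "__main__":
    rng = random.Random(0)
    model = [0.0, 0.0]
    x = [2.0, 2.0]
    adv = find_adversarial(model, x, rng)
    print(adv, loss(model, adv), loss(model, x))
```

Random search is used only because it is short and easy to check; a gradient-based optimizer over the model's actual loss function would be the more likely choice in practice.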
Combinations of networks · CPC title
Adversarial learning · CPC title
Weakly supervised learning, e.g. semi-supervised or self-supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Auto-encoder networks; Encoder-decoder networks · CPC title