Gradient adversarial training of neural networks
US-2021192357-A1 · Jun 24, 2021 · US
US11836249B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11836249-B2 |
| Application number | US-201916691307-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 21, 2019 |
| Priority date | Nov 21, 2019 |
| Publication date | Dec 5, 2023 |
| Grant date | Dec 5, 2023 |
Official abstract text for this publication.
Aspects of the present disclosure involve systems, methods, devices, and the like for generating an adversarially resistant model. In one embodiment, a novel architecture is presented that enables the identification of an image that has been adversarially attacked. The system and method used in the identification introduce the use of a denoising module used to reconstruct the original image from the modified image received. Then, further to the reconstruction, an adversarially trained model is used to make a prediction using at least a determination of a loss that may exist between the original image and the denoised image.
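The detection step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the median filter stands in for the trained denoising autoencoder, the loss is a plain mean squared error between the received image and its denoised version, and the names `median_denoise`, `reconstruction_loss`, `detect_attack` and the threshold value `0.01` are all hypothetical choices made for illustration.

```python
def median_denoise(image):
    """Toy denoiser: 3-tap median filter over a 1-D list of pixel values.

    Assumption: stands in for the trained autoencoder named in the abstract.
    """
    n = len(image)
    out = []
    for i in range(n):
        # Window of up to three neighbours, clipped at the borders.
        window = sorted(image[max(0, i - 1):min(n, i + 2)])
        out.append(window[len(window) // 2])
    return out


def reconstruction_loss(received, denoised):
    """Mean squared error between the received and the denoised image."""
    return sum((r - d) ** 2 for r, d in zip(received, denoised)) / len(received)


def detect_attack(received, threshold=0.01):
    """Flag the image as attacked when the loss exceeds the threshold.

    The threshold value is an illustrative assumption, not from the patent.
    """
    denoised = median_denoise(received)
    loss = reconstruction_loss(received, denoised)
    return loss > threshold, loss, denoised


clean = [0.5] * 8
perturbed = [0.5, 0.5, 0.9, 0.5, 0.5, 0.1, 0.5, 0.5]

clean_flag, clean_loss, _ = detect_attack(clean)
attacked_flag, attacked_loss, restored = detect_attack(perturbed)
print(clean_flag, attacked_flag)
```

In this toy setting a loss below the threshold is read as "the received image is the original image", which mirrors the abstract's use of a loss determination to drive the prediction.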
Claim text.
What is claimed is:
1. A system comprising: a non-transitory memory storing instructions; and a processor configured to execute the instructions to cause the system to perform operations comprising: receiving, via a wireless network communication, a request for an adversarial attack detection, the request including a modified image; identifying, by a machine learning model from within the modified image, whether there is a perturbation from the modified image, wherein the identifying comprises: generating, by an autoencoder, a denoised image, processing, by the machine learning model, the denoised image to determine a reconstruction loss associated with the request for the adversarial attack detection, and determining, by the machine learning model, whether a measurement of the reconstruction loss meets threshold criteria indicating an adversarial attack from the perturbation in the modified image; determining a prediction of an original image of the modified image, the prediction determined based in part on the reconstruction loss and the measurement of the reconstruction loss determined by the machine learning model, and the prediction including a predicted action based in part on the modified image and the original image; and executing the predicted action for the original image.
2. The system of claim 1, wherein the operations further comprise: feeding back the reconstruction loss determined by the machine learning model to the autoencoder.
3. The system of claim 1, wherein the modified image includes the original image having the perturbation added to the original image that causes the adversarial attack when a machine processes the modified image, and wherein the adversarial attack causes the machine to produce an erroneous action.
4. The system of claim 1, wherein the machine learning model is an adversarial trained deep learning model.
5. The system of claim 4, wherein the adversarial trained deep learning model is trained using adversarially attacked images.
6. The system of claim 1, wherein a determination that the measurement of reconstruction loss does not meet the threshold criteria indicates the modified image is the original image.
7. The system of claim 1, wherein generating the denoised image includes reconstructing the original image using an encoder and a decoder.
8. A method comprising: receiving a request to determine an action on a received image; determining the received image is adversarially attacked, the determining including: determining by a denoiser processing the received image, a perturbation in the received image that is adversarially attacked, wherein the determining the perturbation comprises determining, by a machine learning model trained using adversarially attacked images, loss level information associated with the received image from the perturbation in the received image, wherein the loss level information indicates that the received image is adversarially attacked using the perturbation based on a loss threshold processing, by the machine learning model, the received image and the loss level information associated with the perturbation from the received image; making a prediction of an original image from the received image, the prediction determined based on the processing, and the prediction including a predicted action based in part on the received image and the original image; and executing the predicted action for the original image.
9. The method of claim 8, wherein the determining that the received image received is adversarially attacked indicates the original image is modified by noise associated with the perturbation.
10. The method of claim 8, wherein the prediction includes a result with a greater confidence than a previous confidence identifying whether the received image is adversarially attacked.
11. The method of claim 8, further comprising determining, by the denoiser, a code of the received image.
12. The method of claim 11, wherein the determining the code includes determining a node reduction of a network by an encoder.
13. The method of claim 11, wherein the denoiser includes a decoder for reconstructing the original image from the code determined.
14. The method of claim 8, wherein the prediction is based in part on a loss detected.
15. A non-transitory machine-readable medium having stored thereon machine-readable instructions executable to cause a machine to perform operations comprising: receiving, via a wireless network communication, a request for an adversarial attack detection, the request including a modified image; identifying, by a machine learning model from within the modified image, whether there is a perturbation from the modified image, wherein the identifying comprises: generating, by an autoencoder, a denoised image, processing by the machine learning model, the denoised image to determine a loss associated with the request for the adversarial attack detection, and determining, by the machine learning model, whether a loss level of a reconstruction loss meets a threshold criteria indicating an adversarial attack from the perturbation in the modified image; determining a prediction based in part on the reconstruction loss and the loss level determined by the machine learning model, wherein the determining the prediction comprises: executing a feedback loop that removes the perturbation from the modified image using an iterative processing of the modified image and an original image, predicting, using the machine learning model and the executed feedback loop, the original image of the modified image, and determining, based on the original image, a predicted action for the original image; and executing the predicted action for the original image.
16. The non-transitory machine-readable medium of claim 15, wherein the operations further comprise: feeding back the reconstruction loss determined by the machine learning model to the autoencoder.
17. The non-transitory machine-readable medium of claim 15, wherein the modified image includes the original image with the perturbation in a portion of the original image, and wherein the perturbation comprises a machine-readable code that is added to the portion of the original image so that the perturbation is not visible to a human eye.
18. The non-transitory machine-readable medium of claim 15, wherein the machine learning model is an adversarial trained deep learning model.
19. The non-transitory machine-readable medium of claim 18, wherein the adversarial trained deep learning model is trained using adversarially attacked images.
20. The non-transitory machine-readable medium of claim 15, wherein a determination that the loss level does not meet the threshold criteria indicates the modified image is the original image.
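Claims 7 and 11 to 13 describe a denoiser that encodes the image into a reduced "code" and decodes it back, and claim 15 adds a feedback loop that processes the image iteratively. The sketch below is one loose, simplified reading of those elements, not the claimed method: pairwise averaging stands in for a learned encoder, value repetition stands in for a learned decoder, and the loop simply re-applies the denoiser until the reconstruction loss falls under a threshold. All function names, the threshold `1e-4`, and the iteration cap are hypothetical, and the input is assumed to have an even number of values.

```python
def encode(image):
    """Toy encoder: average adjacent pairs, halving the number of values.

    Assumption: stands in for the "node reduction" that yields the code.
    """
    return [(image[i] + image[i + 1]) / 2 for i in range(0, len(image) - 1, 2)]


def decode(code):
    """Toy decoder: repeat each code value to restore the original length."""
    return [value for value in code for _ in range(2)]


def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def feedback_denoise(image, threshold=1e-4, max_iters=10):
    """Re-apply encode/decode until the reconstruction loss is small.

    Each pass feeds the previous reconstruction back in as the next input;
    the per-pass losses are returned so the caller can inspect convergence.
    """
    current = list(image)
    losses = []
    for _ in range(max_iters):
        reconstructed = decode(encode(current))
        loss = mse(current, reconstructed)
        losses.append(loss)
        if loss <= threshold:
            break
        current = reconstructed
    return current, losses


received = [0.5, 0.7, 0.2, 0.2]
code = encode(received)
estimate, losses = feedback_denoise(received)
print(len(code), len(losses))
```

A trained autoencoder would update its weights from the fed-back loss rather than just re-running a fixed filter; the fixed encoder and decoder here only make the control flow of the loop visible.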
Auto-encoder networks; Encoder-decoder networks · CPC title
Adversarial learning · CPC title
Supervised learning · CPC title
Feedforward networks · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title