US2021224425A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2021224425-A1 |
| Application number | US-202117153258-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 20, 2021 |
| Priority date | Jan 21, 2020 |
| Publication date | Jul 22, 2021 |
| Grant date | — |
Official abstract text for this publication.
This disclosure is directed to a generalizable machine learning model production environment and system with a defense mechanism that facilitates safe execution of machine learning models in production by effectively detecting potential known and new adversarial attacks. The disclosed exemplary systems and architectures gather data from the online execution of the machine learning models and communicate with an on-demand pipeline for further inspection and/or correction of vulnerabilities in the production machine learning model to the detected attacks. These systems and architectures provide an automatable process for continuous monitoring of model performance and correction of the production machine learning model to guard against current and future adversarial attacks.
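The flow described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: `OnlinePipeline`, `detect`, `main_model`, `on_demand`, and `adversarial_store` are hypothetical names, and replacing the main model after retraining is only one possible update policy.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class OnlinePipeline:
    """Hypothetical online pipeline; the publication defines no concrete API."""

    # Detection engine: returns True when the live data item looks adversarial.
    detect: Callable[[Any], bool]
    # Main machine learning model: maps a data item to a prediction.
    main_model: Callable[[Any], Any]
    # On-demand pipeline: given the stored adversarial items, returns an
    # updated (retrained) model. How retraining works is left abstract here.
    on_demand: Callable[[List[Any]], Callable[[Any], Any]]
    # Data store for adversarial data items.
    adversarial_store: List[Any] = field(default_factory=list)

    def handle(self, item: Any) -> Optional[Any]:
        if self.detect(item):
            # Store the flagged item, trigger the on-demand pipeline, and
            # (assumed policy) swap the retrained model in as the main model.
            self.adversarial_store.append(item)
            self.main_model = self.on_demand(self.adversarial_store)
            return None  # no main prediction is produced for a flagged item
        # Not flagged: run the main model as usual.
        return self.main_model(item)
```

In this sketch a flagged item never reaches the main model; it only feeds the on-demand pipeline, whose output replaces the model used for later items.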
Opening claim text (preview).
What is claimed is:

1. A machine learning model production system for a main machine learning model comprising a circuitry, the circuitry configured to: receive a live data item; configure an online pipeline of the machine learning model production system to: determine, using a detection engine of the online pipeline, whether the live data item is adversarial; execute the main machine learning model to generate a main prediction for the live data item when the detection engine determines that the live data item is not adversarial; and store the live data item in a data store for adversarial data items and trigger execution of an on-demand pipeline of the machine learning model production system for inspection of the live data item and for evaluation and correction of the main machine learning model when it is determined that the live data item is adversarial; and configure the on-demand pipeline of the machine learning model production system to, upon being triggered by the online pipeline: retrain an updated machine learning model; and update the online pipeline with the updated machine learning model.

2. The machine learning model production system of claim 1, wherein the online pipeline further comprises a consistency engine, wherein, when the detection engine determines that the live data item does not contain any adversarial attack, the circuitry is further configured to configure the consistency engine to: execute a plurality of proxy machine learning models to generate a plurality of proxy predictions for the live data item; analyze the main prediction and the plurality of proxy predictions to generate a prediction consistency measure among the main machine learning model and the proxy machine learning models; and determine that the live data item is adversarial when the prediction consistency measure is below a predetermined consistency threshold.

3. The machine learning model production system of claim 2, wherein the plurality of proxy machine learning models comprises one or more model architectures.

4. The machine learning model production system of claim 2, wherein to update the online pipeline with the updated machine learning model comprises updating one of the plurality of proxy machine learning models with the updated machine learning model.

5. The machine learning model production system of claim 1, wherein: online pipeline further comprises a data transformer; when the detection engine determines that the live data item does not contain any adversarial attack, the circuitry is further configured to transform the live data item; and the main prediction is generated by the main machine learning model using the transformed live data item.

6. The machine learning model production system of claim 5, wherein the data transformer comprises an ensemble of data processing models each performing an independent type of data transformation.

7. The machine learning model production system of claim 5, wherein the data transformer is configured to remove potential adversarial features in the live data item.

8. The machine learning model production system of claim 1, wherein the detection engine comprises an ensemble of detection models for detecting distinct types of adversarial attacks.

9. The machine learning model production system of claim 1, wherein to update the online pipeline with the updated machine learning model comprises updating the main machine learning model with the updated machine learning model.

10. The machine learning model production system of claim 1, wherein to retrain the updated machine learning model comprises training the updated machine learning model using a training dataset that excludes the adversarial data items in the data store for adversarial data items.

11. The machine learning model production system of claim 1, wherein the circuitry is further configured to cause the on-demand pipeline to inspect the live data item to generate an adversarial label, and retrain the updated machine learning model using a training dataset including the live data item with the adversarial label.

12. A method for protecting a main machine learning model in a machine learning model production system against adversarial attacks, comprising: receiving a live data item; determine, using a detection engine of an online pipeline, whether the live data item is adversarial; executing the main machine learning model to generate a main prediction for the live data item when the detection engine determines that the live data item is not adversarial; storing the live data item in a data store for adversarial data items and triggering execution of an on-demand pipeline of the machine learning model production system for inspection of the live data item and for evaluation and correction of the main machine learning model when it is determined that the live data item is adversarial; retraining an updated machine learning model in the on-demand pipeline; and updating the online pipeline with the updated machine learning model.

13. The method of claim 12, further comprising: executing a plurality of proxy machine learning models to generate a plurality of proxy predictions for the live data item; analyzing the main prediction and the plurality of proxy predictions to generate a prediction consistency measure among the main machine learning model and the proxy machine learning models; and determining that the live data item is adversarial when the prediction consistency measure is below a predetermined consistency threshold.

14. The method of claim 13, wherein the plurality of proxy machine learning models comprises at least two different model architectures.

15. The method of claim 13, wherein updating the online pipeline with the updated machine learning model comprises updating one of the plurality of proxy machine learning models with the updated machine learning model.

16. The method of claim 12, further comprising: when it is determined that the live data item does not contain any adversarial attack, transforming the live data item; and generating the main prediction by processing the transformed live data item using the main machine learning model.

17. The method of claim 16, wherein transforming the live data item comprises transforming the live data item using an ensemble of data processing models each performing an independent type of data transformation.

18. The method of claim 16, wherein transforming the live data item further comprises removing potential adversarial features in the live data item even when the detection engine determines that the live data item does not contain any adversarial attack.

19. The method of claim 12, wherein the detection engine comprises an ensemble of detection models for detecting distinct types of adversarial attacks.

20. The method of claim 12, wherein retraining the updated machine learning model comprises training the updated machine learning model using a training dataset that excludes the adversarial data items in the data store for adversarial data items.
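The consistency check recited in claims 2 and 13 can be illustrated with a short Python sketch. The claims do not say how the prediction consistency measure is computed, so the agreement fraction used here is an assumption, as are the names `consistency_measure` and `is_adversarial` and the default threshold of 0.5.

```python
from typing import Hashable, Sequence


def consistency_measure(main_pred: Hashable, proxy_preds: Sequence[Hashable]) -> float:
    """Fraction of proxy predictions that agree with the main prediction.

    Assumed definition: the claims only require *a* consistency measure.
    """
    if not proxy_preds:
        raise ValueError("at least one proxy prediction is required")
    agreeing = sum(1 for pred in proxy_preds if pred == main_pred)
    return agreeing / len(proxy_preds)


def is_adversarial(
    main_pred: Hashable,
    proxy_preds: Sequence[Hashable],
    threshold: float = 0.5,  # hypothetical "predetermined consistency threshold"
) -> bool:
    """Flag the item when the consistency measure falls below the threshold."""
    return consistency_measure(main_pred, proxy_preds) < threshold
```

For example, a main prediction of `"cat"` with proxy predictions `["dog", "dog", "cat"]` gives an agreement of one in three, which is below 0.5, so the item would be flagged under these assumptions.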