Detection of an adversarial backdoor attack on a trained model at inference time

US11601468B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11601468-B2
Application number: US-201916451110-A
Country: US
Kind code: B2
Filing date: Jun 25, 2019
Priority date: Jun 25, 2019
Publication date: Mar 7, 2023
Grant date: Mar 7, 2023

Abstract

Official abstract text for this publication.

Systems, computer-implemented methods, and computer program products that can facilitate detection of an adversarial backdoor attack on a trained model at inference time are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can comprise a log component that records predictions and corresponding activation values generated by a trained model based on inference requests. The computer executable components can further comprise an analysis component that employs a model at an inference time to detect a backdoor trigger request based on the predictions and the corresponding activation values. In some embodiments, the log component records the predictions and the corresponding activation values from one or more layers of the trained model.
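The abstract describes two cooperating pieces: a log component that records predictions with their activation values, and an analysis component that applies a detection model to those records at inference time. The following is a minimal sketch of that idea, not the patented implementation. The class names `InferenceLog` and `CentroidOutlierDetector`, the distance-to-centroid rule, and the `threshold` of 3 standard deviations are all assumptions chosen for illustration; the patent only names categories of detection model (for example, an outlier detector model) without prescribing one.

```python
import math
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class InferenceLog:
    """Log component (illustrative): one (prediction, activations) pair per request."""
    records: list = field(default_factory=list)

    def record(self, prediction, activations):
        self.records.append((prediction, list(activations)))


class CentroidOutlierDetector:
    """Analysis component (assumed design): per-class distance-to-centroid outlier test."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold  # hypothetical cutoff, in standard deviations
        self.stats = {}             # prediction -> (centroid, mean distance, std of distance)

    def fit(self, log):
        # Group logged activations by the prediction they accompanied.
        by_class = defaultdict(list)
        for prediction, activations in log.records:
            by_class[prediction].append(activations)
        for prediction, rows in by_class.items():
            centroid = [sum(col) / len(rows) for col in zip(*rows)]
            dists = [math.dist(row, centroid) for row in rows]
            mean = sum(dists) / len(dists)
            var = sum((d - mean) ** 2 for d in dists) / len(dists)
            self.stats[prediction] = (centroid, mean, math.sqrt(var))
        return self

    def is_suspicious(self, prediction, activations):
        # A prediction with no logged baseline cannot be vouched for.
        if prediction not in self.stats:
            return True
        centroid, mean, std = self.stats[prediction]
        distance = math.dist(activations, centroid)
        return distance > mean + self.threshold * max(std, 1e-9)


# Synthetic two-dimensional activations standing in for one layer of a trained model.
log = InferenceLog()
offsets = [-0.1, -0.05, 0.0, 0.05, 0.1]
for dx in offsets:
    for dy in offsets:
        log.record("cat", [1.0 + dx, 0.0 + dy])
        log.record("dog", [0.0 + dx, 1.0 + dy])

detector = CentroidOutlierDetector().fit(log)

# A benign request: predicted "cat", activations near the logged "cat" records.
clean = detector.is_suspicious("cat", [1.02, -0.03])
# A possible trigger request: predicted "cat", but activations resemble "dog".
triggered = detector.is_suspicious("cat", [0.1, 0.9])
print(clean, triggered)
```

The second query illustrates the intuition behind checking predictions together with activations: the label alone looks ordinary, while the activations that produced it do not match what that label usually looks like inside the model.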

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a memory that stores computer executable components; and a processor that executes the computer executable components stored in the memory, wherein the computer executable components comprise: a log component that records predictions and corresponding activation values generated by a trained model based on inference requests; and an analysis component that employs a model at an inference time to detect a backdoor trigger request based on the predictions and the corresponding activation values, wherein the model is selected from a group consisting of a clustering model, an activation clustering model, a heuristic model, an outlier detector model, a trained outlier detector model, a local outlier factor model, a trained local outlier factor model, a novelty detector model, and a trained one class support vector machine model.

2. The system of claim 1, wherein the log component records the predictions and the corresponding activation values from one or more layers of the trained model.

3. The system of claim 1, wherein: the trained model is selected from a second group consisting of a trained artificial intelligence model, a trained machine learning model, a trained deep learning model, and a trained neural network model.

4. The system of claim 1, wherein the computer executable components further comprise: a verification component that verifies authenticity of at least one of: one or more of the inference requests; one or more of the predictions; or one or more of the corresponding activation values.

5. The system of claim 1, wherein the computer executable components further comprise: a trainer component that trains the model based on at least one of: one or more of the inference requests; one or more of the predictions and one or more of the corresponding activation values; one or more verified inference requests; or one or more verified predictions and one or more verified corresponding activation values.

6. The system of claim 1, wherein the computer executable components further comprise: an intercept component that intercepts an inference request submitted to the trained model and extracts from the trained model at least one of a prediction or one or more corresponding activation values generated in at least one layer of the trained model based on the inference request.

7. The system of claim 1, wherein the computer executable components further comprise: an action component that deactivates the trained model based on a detected backdoor trigger request.

8. The system of claim 1, wherein the analysis component employs the model at the inference time to detect the backdoor trigger request based on the predictions and the corresponding activation values to facilitate at least one of: improved backdoor trigger request detection accuracy of the model; or reduced computational cost of a processing unit associated with the model.

9. A computer-implemented method, comprising: recording, by a system operatively coupled to a processor, predictions and corresponding activation values generated by a trained model based on inference requests; and employing, by the system, a model at an inference time to detect a backdoor trigger request based on the predictions and the corresponding activation values, wherein the model is selected from a group consisting of a clustering model, an activation clustering model, a heuristic model, an outlier detector model, a trained outlier detector model, a local outlier factor model, a trained local outlier factor model, a novelty detector model, and a trained one class support vector machine model.

10. The computer-implemented method of claim 9, wherein the recording comprises: recording, by the system, the predictions and the corresponding activation values from one or more layers of the trained model.

11. The computer-implemented method of claim 9, wherein: the trained model is selected from a second group consisting of a trained artificial intelligence model, a trained deep learning model, and a trained neural network model.

12. The computer-implemented method of claim 9, further comprising: verifying, by the system, authenticity of at least one of: one or more of the inference requests; one or more of the predictions; or one or more of the corresponding activation values.

13. The computer-implemented method of claim 9, further comprising: training, by the system, the model based on at least one of: one or more of the inference requests; one or more of the predictions and one or more of the corresponding activation values; one or more verified inference requests; or one or more verified predictions and one or more verified corresponding activation values.

14. The computer-implemented method of claim 9, further comprising: intercepting, by the system, an inference request submitted to the trained model; and extracting, by the system, from the trained model at least one of a prediction or one or more corresponding activation values generated in at least one layer of the trained model based on the inference request.

15. The computer-implemented method of claim 9, further comprising: deactivating, by the system, the trained model based on a detected backdoor trigger request.

16. A computer program product facilitating detection of an adversarial backdoor attack on a trained model at inference time, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: record, by the processor, predictions and corresponding activation values generated by a trained model based on inference requests; and employ, by the processor, a model at an inference time to detect a backdoor trigger request based on the predictions and the corresponding activation values, wherein the model is selected from a group consisting of a clustering model, an activation clustering model, a heuristic model, an outlier detector model, a trained outlier detector model, a local outlier factor model, a trained local outlier factor model, a novelty detector model, and a trained one class support vector machine model.

17. The computer program product of claim 16, wherein the program instructions are further executable by the processor to cause the processor to: record, by the processor, the predictions and the corresponding activation values from one or more layers of the trained model.

18. The computer program product of claim 16, wherein: the trained model is selected from a second group consisting of a trained artificial intelligence model, a trained machine learning model, a trained deep learning model, and a trained neural network model.

19. The computer program product of claim 16, wherein the program instructions are further executable by the processor to cause the processor to: train, by the processor, the model based on at least one of: one or more of the inference requests; one or more of the predictions and one or more of the corresponding activation values; one or more verified inference requests; or one or more verified predictions and one or more verified corresponding activation values.

20. The computer program product of claim 16, wherein the program instructions are further executable by the processor to cause the processor to: intercept, by the processor, an inference request submitted to the trained model; and extract, by the processor, from the trained model at least one of a
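Claims 6 and 7 add an intercept component, which captures a request and extracts the prediction and activation values from the trained model, and an action component, which deactivates the trained model once a backdoor trigger request is detected. The sketch below shows one way those pieces could be wired together. Everything in it is hypothetical: `ToyModel`, `GuardedModel`, `ModelDeactivated`, and the magnitude-based heuristic with its bound of 5.0 are illustrative names and values, not taken from the patent.

```python
class ToyModel:
    """Stand-in for a trained model; returns a prediction and one layer's activations."""

    def forward(self, x):
        # Toy "hidden layer": two ReLU units over the difference of the inputs.
        hidden = [max(0.0, x[0] - x[1]), max(0.0, x[1] - x[0])]
        prediction = "a" if hidden[0] >= hidden[1] else "b"
        return prediction, hidden


class ModelDeactivated(RuntimeError):
    """Raised when a request reaches a model that has been switched off."""


class GuardedModel:
    """Wraps a model with intercept, log, analysis, and action steps (assumed design)."""

    def __init__(self, model, detector):
        self.model = model
        self.detector = detector  # callable(prediction, activations) -> bool
        self.log = []
        self.active = True

    def infer(self, request):
        if not self.active:
            raise ModelDeactivated("model disabled after a detected backdoor trigger request")
        # Intercept the request and extract the prediction and activations.
        prediction, activations = self.model.forward(request)
        # Log component: keep the request alongside what the model produced.
        self.log.append((request, prediction, activations))
        # Analysis component: ask the detection model about this record.
        if self.detector(prediction, activations):
            # Action component: deactivate the model and withhold the prediction.
            self.active = False
            return None
        return prediction


def magnitude_heuristic(prediction, activations):
    """Hypothetical heuristic model: flag unusually large activations."""
    return max(activations) > 5.0


guarded = GuardedModel(ToyModel(), magnitude_heuristic)
first = guarded.infer([1.0, 0.2])   # ordinary request, prediction is returned
second = guarded.infer([9.0, 0.0])  # flagged request, model is deactivated
try:
    guarded.infer([1.0, 0.2])
    blocked = False
except ModelDeactivated:
    blocked = True
print(first, second, blocked)
```

Deactivating on a single detection is the most conservative reading of claim 7. A deployment could equally quarantine the request or require several detections first; the claim text shown here does not specify that policy.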

Assignees

IBM

Inventors

Classifications

  • Detecting local intrusion or implementing counter-measures · CPC title

  • H04L63/145 · Primary

    the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms · CPC title

  • Clustering techniques · CPC title

  • Inference or reasoning models · CPC title

  • G06F18/214 · Primary

    Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11601468B2 cover?
Systems, computer-implemented methods, and computer program products that can facilitate detection of an adversarial backdoor attack on a trained model at inference time are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification H04L63/145. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Mar 7, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).