Techniques to detect perturbation attacks with an actor-critic framework

US11501001B2 · US · B2

Patent metadata
Publication number: US-11501001-B2
Application number: US-202016910722-A
Country: US
Kind code: B2
Filing date: Jun 24, 2020
Priority date: Aug 14, 2018
Publication date: Nov 15, 2022
Grant date: Nov 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments discussed herein may be generally directed to systems and techniques to generate a quality score based on an observation and an action caused by an actor agent during a testing phase. Embodiments also include determining a temporal difference between the quality score and a previous quality score based on a previous observation and a previous action, determining whether the temporal difference exceeds a threshold value, and generating an attack indication in response to determining the temporal difference exceeds the threshold value.
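The check described in the abstract can be illustrated with a minimal sketch. It assumes the temporal difference is the absolute difference between consecutive quality scores; the abstract does not fix the exact formula. The class name `TemporalDifferenceDetector`, the threshold of 0.5, and the sample scores are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TemporalDifferenceDetector:
    """Flags a step when consecutive quality scores differ by more than a threshold."""

    threshold: float  # hypothetical value; the source does not specify one
    previous_score: Optional[float] = None

    def check(self, quality_score: float) -> bool:
        """Return True (an attack indication) if the temporal difference exceeds the threshold."""
        if self.previous_score is None:
            # First step: there is no previous quality score to compare against.
            self.previous_score = quality_score
            return False
        # Assumption: temporal difference taken as an absolute difference of scores.
        temporal_difference = abs(quality_score - self.previous_score)
        self.previous_score = quality_score
        return temporal_difference > self.threshold


detector = TemporalDifferenceDetector(threshold=0.5)
# Made-up quality scores; the jump from 1.05 to 3.0 stands in for a perturbed input.
flags = [detector.check(q) for q in [1.0, 1.1, 1.05, 3.0, 3.1]]
print(flags)  # [False, False, False, True, False]
```

In this sketch only the fourth score is flagged, because it is the only one that differs from its predecessor by more than the threshold.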

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus, comprising: memory to store instructions; and processing circuitry coupled with the memory, the processing circuitry to: generate a quality score based on an observation and an action, the action performed in a processing environment based on the observation; determine a temporal difference between the quality score and a previous quality score based on a previous observation and a previous action; determine whether the temporal difference exceeds a threshold value; generate an attack indication when the temporal difference exceeds the threshold value; and permit processing of a next observation and a next action when the temporal difference does not exceed the threshold value.
2. The apparatus of claim 1, wherein the attack indication to indicate an occurrence of an attack via an input in the processing environment, the attack comprising one or more of a Fast Gradient Sign Method (FGSM) attack and a random attack.
3. The apparatus of claim 1, comprising an actor agent to cause a series of actions including the action and the previous action, and a critic agent to determine a sequence of quality scores based on each action of the series of actions and an associated observation for each action.
4. The apparatus of claim 3, the critic agent to determine temporal differences between quality scores of consecutive actions of the series of actions and the associated observations.
5. The apparatus of claim 4, the critic agent to: determine whether each of the temporal differences exceeds the threshold value; permit the actor agent to cause a next action in the processing environment in response to determining a temporal difference of the temporal differences does not exceed the threshold value; and generate an attack indication and prohibit the actor agent from causing a next action, in response to determining a temporal difference of the temporal differences exceeds the threshold value.
6. The apparatus of claim 1, a critic agent to perform a training phase prior to generating the quality score and generating the quality score during a testing phase.
7. The apparatus of claim 6, the critic agent to train an actor agent during the training phase using temporal difference learning.
8. The apparatus of claim 1, comprising one or more sensor devices to generate data for the processing environment, the sensor devices comprising a camera, a laser range finder, a radio detection and ranging (RADAR) device, a global positioning system (GPS) device, an ultrasonic device, a sound detection and ranging (SONAR) device, an altimeter, a gyroscope, a tachymeter, or an accelerometer.
9. The apparatus of claim 1, comprising a storage to store a sequence of quality scores including the quality score, each quality score of the sequence of quality scores utilized to determine a temporal difference.
10. A computer-implemented method, comprising: generating a quality score based on an observation and an action, the action performed in a processing environment based on the observation; determining a temporal difference between the quality score and a previous quality score based on a previous observation and a previous action; determining whether the temporal difference exceeds a threshold value; generating an attack indication when the temporal difference exceeds the threshold value; and permit processing of a next observation and a next action when the temporal difference does not exceed the threshold value.
11. The computer-implemented method of claim 10, wherein the attack indication indicates an occurrence of an attack via an input in the processing environment, the attack comprising one or more of a Fast Gradient Sign Method (FGSM) attack and a random attack.
12. The computer-implemented method of claim 10, comprising: causing a series of actions including the action and the previous action; and determining a sequence of quality scores based on each action of the series of actions and an associated observation for each action.
13. The computer-implemented method of claim 12, comprising determining temporal differences between quality scores of consecutive actions of the series of actions and the associated observations.
14. The computer-implemented method of claim 13, comprising: determining whether each of the temporal differences exceeds the threshold value; permitting a next action in the processing environment in response to determining a temporal difference of the temporal differences does not exceed the threshold value; and generating an attack indication and prohibiting a next action, in response to determining a temporal difference of the temporal differences exceeds the threshold value.
15. The computer-implemented method of claim 10, comprising performing a training phase prior to generating the quality score and generating the quality score during a testing phase.
16. The computer-implemented method of claim 15, comprising performing the training during the training phase using temporal difference learning.
17. The computer-implemented method of claim 10, comprising receiving data from one or more sensor devices in the processing environment, the sensor devices comprising a camera, a laser range finder, a radio detection and ranging (RADAR) device, a global positioning system (GPS) device, an ultrasonic device, a sound detection and ranging (SONAR) device, an altimeter, a gyroscope, a tachymeter, or an accelerometer.
18. A non-transitory machine-readable medium containing instructions, which when executed by a processor, cause the processor to perform operations, the operations to: generate a quality score based on an observation and an action, the action performed in a processing environment based on the observation; determine a temporal difference between the quality score and a previous quality score based on a previous observation and a previous action; determine whether the temporal difference exceeds a threshold value; generate an attack indication when the temporal difference exceeds the threshold value; and permit processing of a next observation and a next action when the temporal difference does not exceed the threshold value.
19. The machine-readable medium of claim 18, wherein the attack indication indicates an occurrence of an attack via an input in the processing environment, the attack comprising one or more of a Fast Gradient Sign Method (FGSM) attack and a random attack.
20. The machine-readable medium of claim 18, wherein the operations further comprise operations to: cause a series of actions including the action and the previous action; and determine a sequence of quality scores based on each action of the series of actions and an associated observation for each action.
21. The machine-readable medium of claim 20, wherein the operations further comprise operations to determine temporal differences between quality scores of consecutive actions of the series of actions and the associated observations.
22. The machine-readable medium of claim 21, wherein the operations further comprise operations to: determine whether each of the temporal differences exceeds the threshold value; permit a next action in the processing environment in response to determining a temporal difference of the temporal differences does not exceed the threshold value; and generate an attack indication and prohibiting a next action, in response to determining a temporal difference of the temporal difference
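The actor and critic interplay recited in claims 3 to 5 and 9 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: `run_guarded_episode`, `toy_actor`, `toy_critic`, the threshold of 0.2, and the observation values are all hypothetical, and the temporal difference is again assumed to be an absolute difference of consecutive quality scores.

```python
from typing import Callable, List, Sequence, Tuple


def run_guarded_episode(
    observations: Sequence[float],
    actor: Callable[[float], float],
    critic: Callable[[float, float], float],
    threshold: float,
) -> Tuple[List[float], bool]:
    """Step through observations and stop at the first temporal difference above threshold.

    Returns the stored sequence of quality scores and whether an attack was indicated.
    """
    scores: List[float] = []  # stored sequence of quality scores (compare claim 9)
    for observation in observations:
        action = actor(observation)                 # actor agent causes an action
        scores.append(critic(observation, action))  # critic agent scores the pair
        if len(scores) >= 2:
            temporal_difference = abs(scores[-1] - scores[-2])
            if temporal_difference > threshold:
                # Attack indication: the next action is prohibited, so stop here.
                return scores, True
        # Otherwise the next observation and action are permitted.
    return scores, False


def toy_actor(observation: float) -> float:
    """Hypothetical policy: steer back toward zero."""
    return -observation


def toy_critic(observation: float, action: float) -> float:
    """Hypothetical quality function: penalize large observations and actions."""
    return -(observation ** 2) - 0.1 * (action ** 2)


clean = [0.10, 0.12, 0.11, 0.13]
perturbed = [0.10, 0.12, 0.90, 0.13]  # third observation stands in for an adversarial input

clean_scores, clean_attack = run_guarded_episode(clean, toy_actor, toy_critic, threshold=0.2)
pert_scores, pert_attack = run_guarded_episode(perturbed, toy_actor, toy_critic, threshold=0.2)
print(clean_attack, len(clean_scores))  # False 4
print(pert_attack, len(pert_scores))    # True 3
```

With the clean sequence every step is permitted. With the perturbed sequence the loop halts after the third quality score, so the fourth action is never taken.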

Assignees

Intel Corp

Inventors

Classifications

  • Combinations of networks · CPC title

  • Reinforcement learning · CPC title

  • involving long-term monitoring or reporting · CPC title

  • Test or assess a computer or a system · CPC title

  • G06F21/577 · Primary

    Assessing vulnerabilities and evaluating computer system security · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11501001B2 cover?
Embodiments discussed herein may be generally directed to systems and techniques to generate a quality score based on an observation and an action caused by an actor agent during a testing phase. Embodiments also include determining a temporal difference between the quality score and a previous quality score based on a previous observation and a previous action, determining whether the temporal…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F21/577. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 15, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).