Policy-based detection of anomalous control and data flow paths in an application program

US10902121B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10902121-B2
Application number: US-201715788473-A
Country: US
Kind code: B2
Filing date: Oct 19, 2017
Priority date: Oct 19, 2017
Publication date: Jan 26, 2021
Grant date: Jan 26, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Anomalous control and data flow paths in a program are determined by machine learning the program's normal control flow paths and data flow paths. A subset of those paths also may be determined to involve sensitive data and/or computation. Learning involves collecting events as the program executes, and associating those events with metadata related to the flows. This information is used to train the system about normal paths versus anomalous paths, and sensitive paths versus non-sensitive paths. Training leads to development of a baseline “provenance” graph, which is evaluated to determine “sensitive” control or data flows in the “normal” operation. This process is enhanced by analyzing log data collected during runtime execution of the program against a policy to assign confidence values to the control and data flows. Using these confidence values, anomalous edges and/or paths with respect to the policy are identified to generate a “program execution” provenance graph associated with the policy.
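As a rough illustration of the baseline idea in the abstract, the sketch below counts edge traversals across several execution traces and turns them into per-edge confidence values, marking edges that touch sensitive nodes. The trace format (a list of node names), the `build_baseline` function, the node names, and the use of traversal frequency as the confidence value are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def build_baseline(traces, sensitive_nodes=frozenset()):
    """Build a toy baseline graph from execution traces.

    Each trace is a list of node names visited in order (an assumed format).
    Returns {edge: {"confidence": float, "sensitive": bool}}, where the
    confidence is the fraction of traces that traversed the edge at least once.
    """
    edge_counts = Counter()
    for trace in traces:
        # Count each edge once per trace so the confidence stays in [0, 1].
        edge_counts.update(set(zip(trace, trace[1:])))
    total = len(traces)
    return {
        edge: {
            "confidence": count / total,
            # Hypothetical rule: an edge is sensitive if either endpoint is.
            "sensitive": edge[0] in sensitive_nodes or edge[1] in sensitive_nodes,
        }
        for edge, count in edge_counts.items()
    }


# Hypothetical traces from four invocations of an application.
traces = [
    ["login", "read_profile", "logout"],
    ["login", "read_profile", "update_profile", "logout"],
    ["login", "read_profile", "logout"],
    ["login", "read_profile", "logout"],
]
baseline = build_baseline(traces, sensitive_nodes={"update_profile"})
```

In this toy data the edge from `login` to `read_profile` appears in every trace, while the edge into `update_profile` appears in only one of four and is flagged as sensitive.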

First claim

Opening claim text (preview).

Having described our invention, what we claim is as follows:

1. A method for detecting anomalous behavior of an application program, comprising: receiving trace data generated from multiple invocations of the application program; based at least in part on the received trace data, building a baseline provenance graph that models a normal control flow or data flow in the application program, and that identifies any path within a control flow or data flow that involves sensitive data or computation, wherein at least one edge in the baseline provenance graph has at least one or more first confidence values associated with a probability of that edge being traversed; during runtime execution of the application program against a policy, wherein the policy is one of: a security policy, a compliance policy and a network policy, receiving log data; using the received log data to assign second confidence values to at least one of the control or data flows with respect to the policy; and identifying that the at least one edge is anomalous by comparing the assigned second confidence values with the at least one or more first confidence values, the edge identified as anomalous representing the anomalous behavior; and responsive to detecting the anomalous behavior, taking a further corrective action.

2. The method as described in claim 1 further including building a program execution provenance graph associated with the policy and that includes the control or data flows and their assigned confidence values.

3. The method as described in claim 2 wherein the program execution provenance graph is built using machine learning.

4. The method as described in claim 3 further including adjusting a confidence value assigned to a given control or data flow based on the machine learning.

5. The method as described in claim 1 wherein the baseline provenance graph comprises a control flow graph, and a data flow graph.

6. The method as described in claim 2 further including reporting an application program behavior anomaly identified from the program execution provenance graph.

7. An apparatus for detecting anomalous behavior of an application program, comprising: a processor; computer memory holding computer program instructions executed by the processor, the computer program configured to: receive trace data generated from multiple invocations of the application program; based at least in part on the received trace data, build a baseline provenance graph that models a normal control flow or data flow in the application program, and that identifies any path within a control flow or data flow that involves sensitive data or computation, wherein at least one edge in the baseline provenance graph has at least one or more first confidence values associated with a probability of that edge being traversed; during runtime execution of the application program against a policy, wherein the policy is one of: a security policy, a compliance policy and a network policy, receive log data; use the received log data to assign second confidence values to at least one of the control or data flows with respect to the policy; and identify that the at least one edge is anomalous by comparing the assigned second confidence values with the at least one or more first confidence values, the edge identified as anomalous representing the anomalous behavior; and responsive to detecting the anomalous behavior, take a further corrective action.

8. The apparatus as described in claim 7 wherein the computer program instructions are further configured to build a program execution provenance graph associated with the policy and that includes the control or data flows and their assigned confidence values.

9. The apparatus as described in claim 8 wherein the program execution provenance graph is built using machine learning.

10. The apparatus as described in claim 9 wherein the computer program instructions are further configured to adjust a confidence value assigned to a given control or data flow based on the machine learning.

11. The apparatus as described in claim 7 wherein the baseline provenance graph comprises a control flow graph, and a data flow graph.

12. The apparatus as described in claim 8 wherein the computer program instructions are further configured to report an application program behavior anomaly identified from the program execution provenance graph.

13. A computer program product in a non-transitory computer readable medium for use in a data processing system for detecting anomalous behavior of an application program the computer program product holding computer program instructions that, when executed by the data processing system, are configured to: receive trace data generated from multiple invocations of the application program; based at least in part on the received trace data, build a baseline provenance graph that models a normal control flow or data flow in the application program, and that identifies any path within a control flow or data flow that involves sensitive data or computation, wherein at least one edge in the baseline provenance graph has at least one or more first confidence values associated with a probability of that edge being traversed; during runtime execution of the application program against a policy, wherein the policy is one of: a security policy, a compliance policy and a network policy, receive log data; use the received log data to assign second confidence values to at least one of the control or data flows with respect to the policy; identify that the at least one edge is anomalous by comparing the assigned second confidence values with the at least one or more first confidence values, the edge identified as anomalous representing the anomalous behavior; and responsive to detecting the anomalous behavior, take a further corrective action.

14. The computer program product as described in claim 13 wherein the computer program instructions are further configured to build a program execution provenance graph associated with the policy and that includes the control or data flows and their assigned confidence values.

15. The computer program product as described in claim 14 wherein the computer program instructions are further configured to adjust a confidence value assigned to a given control or data flow based on the machine learning.

16. The computer program product as described in claim 15 wherein the computer program instructions are further configured to adjust a confidence value assigned to a given control or data flow based on the machine learning.

17. The computer program product as described in claim 13 wherein the baseline provenance graph comprises a control flow graph, and a data flow graph.

18. The computer program product as described in claim 14 wherein the computer program instructions are further configured to report an application program behavior anomaly identified from the program execution provenance graph.
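The comparison step recited in the independent claims (second confidence values from runtime logs checked against first confidence values in the baseline) can be sketched as follows. This is a minimal sketch under stated assumptions: the claims do not say how the values are compared, so the absolute-difference threshold, the treatment of unseen edges as confidence 0.0, the function names `find_anomalous_edges` and `corrective_action`, and the sample edges are all hypothetical.

```python
def find_anomalous_edges(baseline, observed, threshold=0.5):
    """Flag edges whose runtime confidence departs from the baseline.

    baseline: {edge: first confidence value} learned from trace data.
    observed: {edge: second confidence value} assigned from runtime logs.
    Assumption: an edge absent from the baseline has first confidence 0.0,
    and an edge is anomalous when the two values differ by more than
    `threshold`.
    """
    anomalies = []
    for edge, second in observed.items():
        first = baseline.get(edge, 0.0)
        if abs(second - first) > threshold:
            anomalies.append((edge, first, second))
    return anomalies


def corrective_action(anomalies):
    """Placeholder response: format one alert message per anomalous edge."""
    return [
        f"ALERT {src}->{dst}: baseline={first:.2f}, runtime={second:.2f}"
        for (src, dst), first, second in anomalies
    ]


# Hypothetical confidence values for illustration only.
baseline = {("login", "read_profile"): 1.0, ("read_profile", "logout"): 0.75}
observed = {("login", "read_profile"): 0.9, ("read_profile", "export_all"): 0.8}

anomalies = find_anomalous_edges(baseline, observed)
alerts = corrective_action(anomalies)
```

Here the edge into `export_all` never appeared in the baseline, so it is the only one flagged. A real system would likely weight sensitive paths and policy type differently; this sketch treats all edges alike.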

Assignees

Inventors

Classifications

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Test or assess software · CPC title

  • Machine learning · CPC title

  • G06F21/566 (Primary)

    Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10902121B2 cover?
Anomalous control and data flow paths in a program are determined by machine learning the program's normal control flow paths and data flow paths. A subset of those paths also may be determined to involve sensitive data and/or computation. Learning involves collecting events as the program executes, and associating those events with metadata related to the flows. This information is used to trai…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F21/566. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 26, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).