Recurrent neural network based anomaly detection

US11301563B2 · US · B2

Patent metadata
Publication number: US-11301563-B2
Application number: US-201916351718-A
Country: US
Kind code: B2
Filing date: Mar 13, 2019
Priority date: Mar 13, 2019
Publication date: Apr 12, 2022
Grant date: Apr 12, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Mechanisms are provided for detecting abnormal system call sequences in a monitored computing environment. The mechanisms receive, from a computing system resource of the monitored computing environment, a system call of an observed system call sequence for evaluation. A trained recurrent neural network (RNN), trained to predict system call sequences, processes the system call to generate a prediction of a subsequent system call in a predicted system call sequence. Abnormal call sequence logic compares the subsequent system call in the predicted system call sequence to an observed system call in the observed system call sequence and identifies a difference between the predicted system call sequence and the observed system call sequence based on results of the comparing. The abnormal call sequence logic generates an alert notification in response to identifying the difference.
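
The predict, compare, and alert flow described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: a simple bigram frequency table stands in for the trained RNN, and the function names (`train_bigram`, `predict_next`, `find_differences`) and the sample call sequences are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram(sequences):
    """Count which call follows which. A stand-in for the trained RNN."""
    table = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            table[current][following] += 1
    return table


def predict_next(table, call):
    """Return the most frequently seen successor of `call`, or None if unseen."""
    successors = table.get(call)
    if not successors:
        return None
    return successors.most_common(1)[0][0]


def find_differences(table, observed):
    """Compare each predicted subsequent call with the call actually observed."""
    alerts = []
    pairs = zip(observed, observed[1:])
    for position, (current, actual) in enumerate(pairs, start=1):
        predicted = predict_next(table, current)
        if predicted != actual:
            # A mismatch is the "difference" that would trigger an alert.
            alerts.append(
                {"position": position, "predicted": predicted, "observed": actual}
            )
    return alerts


# Hypothetical training data representing normal behavior.
normal = [["open", "read", "close"], ["open", "read", "read", "close"]]
model = train_bigram(normal)

# An observed sequence that departs from the learned pattern.
alerts = find_differences(model, ["open", "read", "execve"])
```

A top-1 predictor like this one flags every deviation from the single most likely successor, so a real system would likely need the thresholding described in the dependent claims to limit false alarms.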

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting abnormal system call sequences in a monitored computing environment, the method comprising: receiving, from a computing system resource of the monitored computing environment, a system call of an observed system call sequence for evaluation; processing, by a trained recurrent neural network (RNN) trained to predict system call sequences, the system call to generate a prediction of a subsequent system call in a predicted system call sequence; comparing, by abnormal call sequence logic, the subsequent system call in the predicted system call sequence to an observed system call in the observed system call sequence; identifying, by the abnormal call sequence logic, a difference between the predicted system call sequence and the observed system call sequence based on results of the comparing; and generating, by the abnormal call sequence logic, an alert notification in response to identifying the difference, wherein processing the system call comprises converting the system call into a vector representation of the system call by performing a first embedding operation on a system call feature of the system call and a separate second embedding operation on one or more argument features of the system call to generate a system call feature embedding comprising machine learned embedding values and one or more argument feature embeddings comprising machine learned embedding values.

2. The method of claim 1, wherein processing the system call further comprises inputting the vector representation of the system call into a long short term memory (LSTM) cell such that the RNN generates, for each system call feature of a plurality of system call features, and each argument feature of a plurality of argument features, probabilities that the corresponding system call feature or the corresponding argument feature is part of a subsequent system call in the predicted system call sequence.

3. The method of claim 2, wherein the prediction of the subsequent system call is generated at least by: generating a plurality of combinations of system call features and argument features from the plurality of system call features and plurality of argument features and, for each combination in the plurality of combinations, combining probabilities of each system call feature and each argument feature of the combination to generate a probability for the combination; and selecting a combination from the plurality of combinations to represent the predicted subsequent system call based on the combined probabilities for the combinations in the plurality of combinations.

4. The method of claim 1, wherein converting the system call into the vector representation of the system call comprises: converting the system call into a tokenized representation of the system call by mapping a system call feature of the system call to a first token and one or more argument features of the system call to one or more second tokens based on a system call feature mapping data structure and an argument feature mapping data structure.

5. The method of claim 4, wherein processing the system call comprises: converting the tokenized representation of the system call to a vector representation of the system call by using the first token to index into a system call feature embedding matrix data structure and retrieving a system call feature embedding corresponding to the first token, and using the at least one or more second tokens to index into an argument feature embedding matrix data structure and retrieving corresponding argument feature embeddings corresponding to the one or more second tokens; and concatenating the system call feature embedding and the one or more argument feature embeddings to generate the vector representation of the system call.

6. The method of claim 1, wherein identifying, by the abnormal call sequence logic, a difference between the predicted system call sequence and the observed system call sequence based on results of the comparing further comprises: identifying the difference as an anomaly; maintaining, over a predetermined period of time, a count of a number of anomalies identified during the predetermined period of time; comparing the count of the number of anomalies to a threshold number of anomalies; and determining that the alert notification is to be generated in response to the number of anomalies being equal to or greater than the threshold number of anomalies.

7. The method of claim 1, wherein identifying, by the abnormal call sequence logic, a difference between the predicted system call sequence and the observed system call sequence based on results of the comparing further comprises: comparing a probability of the predicted system call sequence to a threshold probability value; and in response to the probability of the predicted system call sequence being equal to or greater than the threshold probability value, and the existence of the difference between the predicted system call sequence and the observed system call sequence, determining that the alert notification is to be generated.

8. The method of claim 1, further comprising: automatically performing a responsive action in response to identifying the difference between the predicted system call sequence and the observed system call sequence, wherein the responsive action comprises at least one of quarantining a process that submitted the observed system call sequence, blocking or filtering future system calls from the process that submitted the observed system call sequence, collecting data about the process that submitted the observed system call sequence, or terminating the process that submitted the observed system call sequence.

9. The method of claim 1, further comprising: initializing a system call feature embedding data structure to an initial state; initializing an argument call feature embedding data structure to an initial state; and training the RNN based on a training dataset comprising a plurality of system call sequences, wherein the training of the RNN comprises iteratively modifying embedding values in at least one of the system call feature embedding data structure or the argument call feature embedding data structure to generate trained embedding values in the system call feature embedding data structure and the argument call feature embedding data structure.

10. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a data processing system, causes the data processing system to specifically configure the data processing system to: receive, from a computing system resource of the monitored computing environment, a system call of an observed system call sequence for evaluation; process, by a trained recurrent neural network (RNN) of the data processing system, trained to predict system call sequences, the system call to generate a prediction of a subsequent system call in a predicted system call sequence; compare, by abnormal call sequence logic of the data processing system, the subsequent system calls in the predicted system call sequence to an observed system call in the observed system call sequence; identify, by the abnormal call sequence logic, a difference between the predicted system call sequence and the observed system call sequence based on results of the comparing; and generate, by the abnormal call sequence logic, an alert notification in response to identifying the difference, wherein the computer readable program further configures the data processing system to process the system call at least by converting the system call into a vector representation o
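
The tokenize, look up, and concatenate steps recited in claims 4 and 5 can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration: the mapping tables, the embedding widths, the `<unk>` fallback for unseen arguments, and the names `tokenize` and `vectorize` do not come from the patent. The embedding values are random initial values; per claim 9, training would adjust them.

```python
import random

# Hypothetical mapping data structures: feature -> token.
SYSCALL_TOKENS = {"open": 0, "read": 1, "write": 2, "close": 3}
ARG_TOKENS = {"<unk>": 0, "/etc/passwd": 1, "O_RDONLY": 2, "fd=3": 3}

# Assumed embedding widths, chosen only to keep the example small.
SYSCALL_DIM = 4
ARG_DIM = 3


def init_embedding_matrix(rows, dim, rng):
    """Build a rows x dim matrix in a random initial state."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


rng = random.Random(0)
SYSCALL_EMB = init_embedding_matrix(len(SYSCALL_TOKENS), SYSCALL_DIM, rng)
ARG_EMB = init_embedding_matrix(len(ARG_TOKENS), ARG_DIM, rng)


def tokenize(name, args):
    """Map the call name to a first token and each argument to a second token."""
    first_token = SYSCALL_TOKENS[name]
    # Unseen arguments fall back to "<unk>" (an assumption of this sketch).
    second_tokens = [ARG_TOKENS.get(arg, ARG_TOKENS["<unk>"]) for arg in args]
    return first_token, second_tokens


def vectorize(name, args):
    """Index into both embedding matrices and concatenate the rows."""
    first_token, second_tokens = tokenize(name, args)
    vector = list(SYSCALL_EMB[first_token])
    for token in second_tokens:
        vector.extend(ARG_EMB[token])
    return vector


vec = vectorize("open", ["/etc/passwd", "O_RDONLY"])
```

With these assumed widths, a call with two arguments yields a vector of 4 + 2 × 3 = 10 values. Keeping separate matrices for call names and arguments mirrors the "separate second embedding operation" language of claim 1.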

Assignees

IBM

Inventors

Classifications

  • Combinations of networks

  • Recurrent networks, e.g. Hopfield networks

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Supervised learning

  • Backpropagation, e.g. using gradient descent

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11301563B2 cover?
Mechanisms are provided for detecting abnormal system call sequences in a monitored computing environment. The mechanisms receive, from a computing system resource of the monitored computing environment, a system call of an observed system call sequence for evaluation. A trained recurrent neural network (RNN), trained to predict system call sequences, processes the system call to generate a pre…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F21/554. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 12, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).