Machine learning model for analysis of instruction sequences

US11074494B2 · US · B2

Patent metadata
Publication number: US-11074494-B2
Application number: US-201615345433-A
Country: US
Kind code: B2
Filing date: Nov 7, 2016
Priority date: Sep 9, 2016
Publication date: Jul 27, 2021
Grant date: Jul 27, 2021

Abstract

Official abstract text for this publication.

In one respect, there is provided a system for classifying an instruction sequence with a machine learning model. The system may include at least one processor and at least one memory. The memory may include program code that provides operations when executed by the at least one processor. The operations may include: processing an instruction sequence with a trained machine learning model configured to detect one or more interdependencies amongst a plurality of tokens in the instruction sequence and determine a classification for the instruction sequence based on the one or more interdependencies amongst the plurality of tokens; and providing, as an output, the classification of the instruction sequence. Related methods and articles of manufacture, including computer program products, are also provided.
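The flow in the abstract (tokens in, classification out) can be sketched as follows. This is a minimal illustration, not the patented implementation: the first claim on this page names a long short-term memory network, so the sketch runs a tiny LSTM forward pass over token vectors. The regex tokenizer, the dimensions, the 0.5 threshold, and the helper names (`tokenize`, `lstm_step`, `init_params`, `classify`) are all assumptions, and the weights are random placeholders rather than trained values.

```python
import math
import random
import re

def tokenize(script):
    """Split a script into identifier, number, and single-symbol tokens (assumed scheme)."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", script)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, params):
    """One LSTM step: every gate sees the current input x and the previous hidden state h."""
    z = x + h  # list concatenation [x; h]

    def gate(name, act):
        weights, bias = params[name]
        return [act(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(weights, bias)]

    i = gate("input", sigmoid)
    f = gate("forget", sigmoid)
    o = gate("output", sigmoid)
    g = gate("candidate", math.tanh)
    # The cell state carries information across tokens; this is how
    # dependencies between distant tokens can be retained.
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new

def init_params(embed_dim, hidden_dim, rng):
    """Random placeholder weights; a real model would learn these by minimizing a loss."""
    def matrix():
        return [[rng.uniform(-0.1, 0.1) for _ in range(embed_dim + hidden_dim)]
                for _ in range(hidden_dim)]
    return {name: (matrix(), [0.0] * hidden_dim)
            for name in ("input", "forget", "output", "candidate")}

def classify(script, embeddings, params, out_weights, threshold=0.5):
    """Run the tokens through the LSTM and map the final hidden state to a label."""
    hidden_dim = len(out_weights)
    embed_dim = len(next(iter(embeddings.values())))
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    for token in tokenize(script):
        x = embeddings.get(token, [0.0] * embed_dim)  # unseen tokens map to zeros
        h, c = lstm_step(x, h, c, params)
    score = sigmoid(sum(w * hk for w, hk in zip(out_weights, h)))
    return ("malicious" if score >= threshold else "benign"), score

# Hypothetical usage with untrained weights, so the label itself is meaningless.
rng = random.Random(0)
EMBED_DIM, HIDDEN_DIM = 4, 3
script = 'var x = eval("payload");'
embeddings = {tok: [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
              for tok in sorted(set(tokenize(script)))}
params = init_params(EMBED_DIM, HIDDEN_DIM, rng)
out_weights = [rng.uniform(-1, 1) for _ in range(HIDDEN_DIM)]
label, score = classify(script, embeddings, params, out_weights)
print(label, round(score, 3))
```

In a deployed system the returned label would gate whether the script is accessed or executed; here it only demonstrates the data flow.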

First claim

Opening claim text (preview).

What is claimed is:

1. A system for classifying code as malicious or benign to prevent the code from introducing undesirable and/or harmful behavior to a computing device, the system comprising:

at least one processor; and

at least one memory including program code which when executed by the at least one processor provides operations comprising:

processing an instruction sequence with at least two trained machine learning models configured to at least detect one or more interdependencies amongst a plurality of tokens in the instruction sequence and to determine a classification for the instruction sequence based on the one or more interdependencies amongst the plurality of tokens, the classification indicating whether the instruction sequence is malicious or benign, at least one of the trained machine learning models using encoding to vectorize the instruction sequence so as to preserve similarities between tokens; and

providing, as an output, the classification of the instruction sequence, the classification being used to determine whether to access, execute, or continue to execute the instruction sequence to prevent the undesirable and/or harmful behavior to the computing device;

wherein:

a first layer of the trained machine learning model encodes the tokens using one or more encoding techniques and generates vector representations of the tokens to pass to a next layer of the trained machine learning model;

the instruction sequence comprises a script that requires compilation prior to execution;

the one or more interdependencies indicate at least one function and/or behavior associated with the script;

the trained machine learning model comprises a trained long short-term memory neural network that is trained to classify instruction sequences by at least using the long short-term memory neural network to process a plurality of training data, the training including instruction sequences that includes tokens having predetermined interdependencies, the long short-term memory neural network is trained to detect the predetermined interdependencies amongst the tokens in the training data, the long short-term memory neural network is trained to minimize an error function or a loss function associated with a corresponding output of the long short-term memory neural network;

the encoding maximizes an objective function J(θ) in order to generate v vector representations that preserve similarities between tokens:

$$J(\theta) = \frac{1}{T} \sum_{t=1}^{T} \; \sum_{-c \le j \le c,\ j \ne 0} \log p(w_{t+j} \mid w_t),$$

wherein $T$ is a total number of tokens in a training corpus, $w_t$ is a current token, $c$ is a window size, $w_{t+j}$ represents a token in a window before or after $w_t$, and $p(w_{t+j} \mid w_t)$ represents a probability of $w_{t+j}$ given $w_t$, wherein $p(w_{t+j} \mid w_t)$ is:

$$p(w_{t+j} \mid w_t) = \frac{\exp\left({v'_{w_{t+j}}}^{T} v_{w_t}\right)}{\sum_{w=1}^{W} \exp\left({v'_{w}}^{T} v_{w_t}\right)},$$

whe…
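The objective J(θ) and the softmax probability in the claim can be evaluated directly on a small token sequence. The sketch below is an illustration under assumptions: the claim preview is cut off before it defines the vectors, so `v_in` (one vector per current token) and `v_out` (one primed vector per candidate context token) follow the usual skip-gram reading of the formula, and the function names and toy vocabulary are hypothetical. Window positions that fall outside the sequence are simply skipped.

```python
import math

def softmax_prob(context_id, center_id, v_in, v_out):
    """p(w_{t+j} | w_t): softmax over dot products of every output vector
    with the input vector of the current token."""
    center = v_in[center_id]
    scores = [sum(a * b for a, b in zip(out_vec, center)) for out_vec in v_out]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    return exps[context_id] / sum(exps)

def skipgram_objective(token_ids, v_in, v_out, c):
    """J(theta): sum of log p(w_{t+j} | w_t) over all positions t and all
    offsets -c <= j <= c with j != 0, divided by the number of tokens T."""
    T = len(token_ids)
    total = 0.0
    for t, center_id in enumerate(token_ids):
        for j in range(-c, c + 1):
            if j == 0 or not 0 <= t + j < T:
                continue  # skip the current token and off-the-end positions
            total += math.log(softmax_prob(token_ids[t + j], center_id, v_in, v_out))
    return total / T

# Hypothetical toy corpus: ids index into a six-token vocabulary.
vocab = ["var", "x", "=", "eval", "(", ")"]
token_ids = [0, 1, 2, 3, 4, 1, 5]  # var x = eval ( x )
dim = 3
# All-zero vectors make every token equally likely, a convenient sanity check.
v_in = [[0.0] * dim for _ in vocab]
v_out = [[0.0] * dim for _ in vocab]
J = skipgram_objective(token_ids, v_in, v_out, c=1)
print(J)
```

Training would adjust `v_in` and `v_out` to push J upward, so that tokens appearing in similar contexts end up with similar vectors.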

Assignees

Cylance Inc

Classifications

  • G06F21/563 (Primary): by source code analysis

  • Recurrent networks, e.g. Hopfield networks

  • G06N3/0442 (Primary): characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Supervised learning

  • Machine learning

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Cylance Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/563. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 27, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).