What technology area does this patent fall under?

Primary CPC classification G06F21/564. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jun 23, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Recurrent neural networks for malware analysis

US10691799B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10691799-B2
Application number	US-201615566687-A
Country	US
Kind code	B2
Filing date	Apr 15, 2016
Priority date	Apr 16, 2015
Publication date	Jun 23, 2020
Grant date	Jun 23, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Using a recurrent neural network (RNN) that has been trained to a satisfactory level of performance, highly discriminative features can be extracted by running a sample through the RNN, and then extracting a final hidden state h_i, where i is the number of instructions of the sample. This resulting feature vector may then be concatenated with the other hand-engineered features, and a larger classifier may then be trained on hand-engineered as well as automatically determined features. Related apparatus, systems, techniques and articles are also described.
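As an illustration of the feature-extraction step the abstract describes, here is a minimal sketch in plain Python. It assumes a simple Elman-style RNN with tanh activation and random weights standing in for trained ones; the function names, dimensions, byte scaling, and placeholder hand-engineered values are all hypothetical and are not taken from the patent.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Random weight matrix; a stand-in for weights learned during training."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def elman_final_hidden_state(trace, w_in, w_rec, bias):
    """Run an Elman-style RNN over the trace and return the last hidden state.

    trace: list of per-instruction input vectors, one per instruction.
    """
    hidden = [0.0] * len(bias)
    for x in trace:
        # The new state depends on the current input and the previous state.
        hidden = [
            math.tanh(
                sum(w * xv for w, xv in zip(w_in[j], x))
                + sum(w * hv for w, hv in zip(w_rec[j], hidden))
                + bias[j]
            )
            for j in range(len(bias))
        ]
    return hidden


def combined_features(trace, hand_engineered, w_in, w_rec, bias):
    """Concatenate the RNN feature vector with hand-engineered features."""
    return elman_final_hidden_state(trace, w_in, w_rec, bias) + list(hand_engineered)


rng = random.Random(0)
INPUT_DIM, HIDDEN_DIM = 4, 3  # assumed sizes, chosen only for the example
w_in = make_matrix(HIDDEN_DIM, INPUT_DIM, rng)
w_rec = make_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)
bias = [0.0] * HIDDEN_DIM

# Hypothetical trace: three instructions, each given as 4 bytes scaled to [0, 1].
raw_trace = [[0x55, 0x89, 0xE5, 0x00], [0x83, 0xEC, 0x10, 0x00], [0xC3, 0x00, 0x00, 0x00]]
trace = [[b / 255 for b in raw] for raw in raw_trace]
hand_engineered = [0.0, 1.0]  # placeholder values, not real static-analysis output

features = combined_features(trace, hand_engineered, w_in, w_rec, bias)
```

The resulting `features` list would then be the input to a separate classifier, which this sketch leaves out.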

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving or accessing executable code comprising instructions; disassembling the executable code to generate a trace of the instructions; applying a recurrent neural network (RNN) to the trace to generate a hidden state corresponding to each instruction to form a feature vector; generating a concatenation of the feature vector with hand-engineered features extracted from the executable code; determining, using a classifier and the concatenation, a likelihood that the executable code comprises malicious code; and disallowing, based on the determining, the code from executing; wherein the classifier is different from the RNN.

2. The method of claim 1, wherein the applying further comprises: dividing the trace into a plurality of regions; determining an entropy of each of the plurality of regions; and ignoring each region with a low entropy.

3. The method of claim 1, wherein the disassembling further comprises: determining an entry point of the executable code; and generating a time-based trace of the instructions based on the entry point.

4. The method of claim 1, wherein an input to the RNN is set to a fixed length of 4 or 8 bytes per instruction.

5. The method of claim 1, wherein an instruction set of the executable code comprises an x86 instruction set.

6. The method of claim 1, wherein the RNN is at least one of an Elman network, a long short-term memory network, a clockwork RNN, or an echo-state network.

7. The method of claim 1, wherein applying the recurrent neural network further comprises applying backpropagation through time (BPTT).

8. The method of claim 1, wherein applying the recurrent neural network further comprises deobfuscating or decompressing the trace.

9. A system comprising: one or more data processors having memory storing instructions, which when executed result in operations comprising: receiving or accessing executable code comprising instructions; disassembling the executable code to generate a trace of the instructions; applying a recurrent neural network (RNN) to the trace to generate a hidden state corresponding to each instruction to form a feature vector; generating a concatenation of the feature vector with hand-engineered features extracted from the executable code; determining, using a classifier and the concatenation, a likelihood that the executable code comprises malicious code; and disallowing, based on the determining, the code from executing; wherein the classifier is different from the RNN.

10. The system of claim 9, wherein the applying further comprises: dividing the trace into a plurality of regions; determining an entropy of each of the plurality of regions; and ignoring each region with a low entropy.

11. The system of claim 9, wherein the disassembling further comprises: determining an entry point of the executable code; and generating a time-based trace of the instructions based on the entry point.

12. The system of claim 9, wherein an input to the RNN is set to a fixed length of 4 or 8 bytes per instruction.

13. The system of claim 9, wherein an instruction set of the executable code comprises an x86 instruction set.

14. The system of claim 9, wherein the RNN is at least one of an Elman network, a long short-term memory network, a clockwork RNN, or an echo-state network.

15. The system of claim 9, wherein applying the recurrent neural network further comprises applying backpropagation through time (BPTT).

16. The system of claim 9, wherein applying the recurrent neural network further comprises deobfuscating or decompressing the trace.

17. A non-transitory computer readable storage medium storing one or more programs configured to be executed by one or more data processors, the one or more programs comprising instructions, the instructions comprising: receiving executable code; disassembling the executable code; generating a hidden state for each of a plurality of instructions by applying a recurrent neural network (RNN) to the disassembled executable code to generate a feature vector; and determining, using a classifier, a likelihood that the executable code comprises malicious code based on the feature vector; wherein the classifier is different from the RNN.

18. The non-transitory computer readable storage medium of claim 17, wherein the applying further comprises: dividing the trace into a plurality of regions; determining an entropy of each of the plurality of regions; and ignoring each region with a low entropy.

19. The non-transitory computer readable storage medium of claim 17, wherein the disassembling further comprises: determining an entry point of the executable code; and generating a time-based trace of the instructions based on the entry point.

20. The non-transitory computer readable storage medium of claim 17, wherein applying the recurrent neural network further comprises deobfuscating or decompressing the trace.
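Two of the dependent claims describe preprocessing of the trace: a fixed input length of 4 or 8 bytes per instruction, and ignoring low-entropy regions. The sketch below shows one possible reading of those steps. The zero-padding scheme, the use of Shannon entropy over bytes, the region size, and the threshold are all assumptions made for illustration; the claims do not specify them, and the function names are hypothetical.

```python
import math
from collections import Counter


def fixed_length(instruction: bytes, width: int = 4) -> bytes:
    """Truncate or zero-pad one instruction's bytes to exactly `width` bytes."""
    return instruction[:width].ljust(width, b"\x00")


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def keep_high_entropy_regions(trace, region_size=4, threshold=1.0):
    """Split the trace into regions of `region_size` instructions and drop
    every region whose byte entropy is below `threshold` (assumed cutoff)."""
    kept = []
    for start in range(0, len(trace), region_size):
        region = trace[start:start + region_size]
        if shannon_entropy(b"".join(region)) >= threshold:
            kept.extend(region)
    return kept


# Hypothetical trace: a run of identical NOP-like padding followed by varied code.
padding = [fixed_length(b"\x90\x90\x90\x90")] * 4
code = [fixed_length(b) for b in (b"\x55", b"\x89\xe5", b"\x83\xec\x10", b"\xc3")]
filtered = keep_high_entropy_regions(padding + code)
```

Here the all-padding region has zero entropy and is dropped, while the varied region is kept. A real system would likely choose the region size and cutoff empirically.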

Assignees

Cylance Inc

Inventors

Classifications

G06N3/044 · Recurrent networks, e.g. Hopfield networks
G06N3/08 · Learning methods
G06N3/0442 · Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
G06N3/09 · Supervised learning
G06F21/564 (Primary) · Static detection by virus signature recognition

Patent family

Related publications grouped by family.

Patent family 55861222

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10691799B2 cover?: Using a recurrent neural network (RNN) that has been trained to a satisfactory level of performance, highly discriminative features can be extracted by running a sample through the RNN, and then extracting a final hidden state h_i, where i is the number of instructions of the sample. This resulting feature vector may then be concatenated with the other hand-engineered features, and a larger class…
Who is the assignee on this patent?: Cylance Inc

Related patents

Wavelet decomposition of software entropy to identify malware

Static feature extraction from structured files

Application Execution Control Utilizing Ensemble Machine Learning For Discernment

Method and apparatus for constructing a neuroscience-inspired artificial neural network

Generation of API call graphs from static disassembly
