What technology area does this patent fall under?

Primary CPC classification G06F21/566. Mapped technology areas include Physics.

When was this patent published?

Published Nov 15, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Recurrent neural networks for malware analysis

US9495633B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9495633-B2
Application number	US-201514789914-A
Country	US
Kind code	B2
Filing date	Jul 1, 2015
Priority date	Apr 16, 2015
Publication date	Nov 15, 2016
Grant date	Nov 15, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Using a recurrent neural network (RNN) that has been trained to a satisfactory level of performance, highly discriminative features can be extracted by running a sample through the RNN, and then extracting a final hidden state h_i, where i is the number of instructions of the sample. This resulting feature vector may then be concatenated with the other hand-engineered features, and a larger classifier may then be trained on hand-engineered as well as automatically determined features. Related apparatus, systems, techniques and articles are also described.
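The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the tanh recurrence, the all-zero initial state, the toy weight values, and the names `final_hidden_state` and `combined_features` are all assumptions chosen for illustration, and the weights stand in for a network that would in practice already be trained.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def final_hidden_state(sequence, W1, R):
    """Run a simple recurrence over the sequence and return the last state h_i."""
    h = [0.0] * len(R)  # assumed initial state h_0 = 0
    for x in sequence:  # one step per element, e.g. per instruction
        h = [math.tanh(a + b) for a, b in zip(matvec(W1, x), matvec(R, h))]
    return h

def combined_features(sequence, hand_features, W1, R):
    """Concatenate the RNN-derived vector with hand-engineered features."""
    return final_hidden_state(sequence, W1, R) + list(hand_features)

# Toy placeholder weights: 3 hidden units, 2 input dimensions (hypothetical values).
W1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
R = [[0.1, 0.0, 0.2], [0.0, 0.1, 0.0], [0.3, 0.0, 0.1]]

sample = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]  # a 3-element encoded sample
hand = [0.7, 12.0]                             # hypothetical hand-engineered features

features = combined_features(sample, hand, W1, R)
print(len(features))  # 3 hidden units + 2 hand-engineered features = 5
```

The combined vector is what a larger downstream classifier would be trained on; that classifier is left out of the sketch.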

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: receiving or accessing data encapsulating a sample of at least a portion of one or more files; feeding at least a portion of the received or accessed data as a time-based sequence into a recurrent neural network (RNN) trained using historical data; extracting, by the RNN, a final hidden state h_i in a hidden layer of the RNN in which i is a number of elements of the sample; and determining, using the RNN and the final hidden state, whether at least a portion of the sample is likely to comprise malicious code.
2. The method of claim 1, wherein the received or accessed data forms at least part of a data stream.
3. The method of claim 1, wherein the at least a portion of the received or accessed data comprises a series of fixed-length encoded words.
4. The method of claim 1, wherein the elements comprises a series of instructions.
5. The method of claim 1, wherein the hidden state is defined by: h_t = f(x, h_{t-1}), wherein hidden state h_t is a time-dependent function of input x as well as a previous hidden state h_{t-1}.
6. The method of claim 1, wherein the RNN is an Elman network.
7. The method of claim 6, wherein the Elman network has deep transition or decoding functions.
8. The method of claim 6, wherein the Elman network parameterizes f(x, h_{t-1}) as h_t = g(W_1 x + R h_{t-1}); where hidden state h_t is a time-dependent function of input x as well as previous hidden state h_{t-1}, W_1 is a matrix defining input-to-hidden connections, R is a matrix defining the recurrent connections, and g(·) is a differentiable nonlinearity.
9. The method of claim 8 further comprising: adding an output layer on top of the hidden layer, such that o_t = σ(W_2 h_t) where o_t is output, W_2 defines a linear transformation of hidden activations, and σ(·) is a logistic function.
10. The method of claim 9 further comprising: applying backpropagation through time by which parameters of network W_2, W_1, and R are iteratively refined to drive the output o_t to a desired value as portions of the received or accessed data are passed through the RNN.
11. The method of claim 1, wherein the RNN is a long short term memory network.
12. The method of claim 1, wherein the RNN is a clockwork RNN.
13. The method of claim 1, wherein the RNN is a deep transition function.
14. The method of claim 1, wherein the RNN is an echo-state network.
15. The method of claim 1 further comprising: providing data characterizing the determination.
16. The method of claim 15, wherein providing data comprises at least one of: transmitting the data to a remote computing system, loading the data into memory, or storing the data.
17. The method of claim 1, wherein the files are binary files.
18. The method of claim 1, wherein the files are executable files.
19. A system comprising: at least one programmable data processor; and memory storing instructions which, when executed by the at least one programmable data processor, result in operations comprising: receiving or accessing data encapsulating a sample of at least a portion of one or more files; feeding at least a portion of the received or accessed data as a time-based sequence into a recurrent neural network (RNN) trained using historical data; extracting, by the RNN, a final hidden state h_i in a hidden layer of the RNN in which i is a number of elements of the sample; and determining, using the RNN and the final hidden state, whether at least a portion of the sample is likely to comprise malicious code.
20. A non-transitory computer program product storing instructions which, when executed by at least one programmable data processor forming part of at least one computing device, result in operations comprising: receiving or accessing data encapsulating a sample of at least a portion of one or more files; feeding at least a portion of the received or accessed data as a time-based sequence into a recurrent neural network (RNN) trained using historical data; extracting, by the RNN, a final hidden state h_i in a hidden layer of the RNN in which i is a number of elements of the sample; and determining, using the RNN and the final hidden state, whether at least a portion of the sample is likely to comprise malicious code.
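The Elman recurrence of claim 8 and the logistic output layer of claim 9 can be sketched as follows. This is a hedged illustration rather than the patented implementation. Several choices are assumptions not stated in the claims: tanh for g, a zero initial state, an 8-bit encoding of each byte as the "fixed-length encoded word", a single-row W_2 giving a scalar output, a 0.5 decision threshold, and all function names and weight values. Training by backpropagation through time (claim 10) is not shown.

```python
import math

def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def byte_to_bits(b):
    """Encode one byte as a fixed-length vector of 8 bits (assumed encoding)."""
    return [float((b >> k) & 1) for k in range(7, -1, -1)]

def elman_step(x, h_prev, W1, R):
    """h_t = g(W_1 x + R h_{t-1}), with g = tanh as one differentiable choice."""
    return [math.tanh(a + b) for a, b in zip(matvec(W1, x), matvec(R, h_prev))]

def score_sample(sequence, W1, R, W2):
    """Feed the sequence through the network and return sigma(W_2 h_i)."""
    h = [0.0] * len(R)  # assumed initial state h_0 = 0
    for x in sequence:
        h = elman_step(x, h, W1, R)
    z = sum(w * hj for w, hj in zip(W2, h))  # W2 is one row, so z is a scalar
    return 1.0 / (1.0 + math.exp(-z))        # logistic function

def is_likely_malicious(data, W1, R, W2, threshold=0.5):
    """Threshold the score; the threshold value is an assumption."""
    sequence = [byte_to_bits(b) for b in data]
    return score_sample(sequence, W1, R, W2) > threshold

# Hypothetical untrained weights: 2 hidden units, 8 input bits.
W1 = [[0.1 * ((r + c) % 3 - 1) for c in range(8)] for r in range(2)]
R = [[0.2, -0.1], [0.0, 0.3]]
W2 = [1.5, -0.5]

score = score_sample([byte_to_bits(b) for b in b"\x4d\x5a\x90"], W1, R, W2)
print(round(score, 3))
```

With untrained weights the score carries no meaning; the sketch only shows how the final hidden state feeds the output layer and the decision.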

Assignees

Cylance Inc

Inventors

Classifications

G06N3/084
Backpropagation, e.g. using gradient descent
G06F21/566 (Primary)
Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
G06N3/044
Recurrent networks, e.g. Hopfield networks
G06N3/08
Learning methods
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

Patent family

Related publications grouped by family.

Patent family 55861222

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9495633B2 cover? Using a recurrent neural network (RNN) that has been trained to a satisfactory level of performance, highly discriminative features can be extracted by running a sample through the RNN, and then extracting a final hidden state h_i, where i is the number of instructions of the sample. This resulting feature vector may then be concatenated with the other hand-engineered features, and a larger cl…
Who is the assignee on this patent? Cylance Inc

Related patents

Static feature extraction from structured files

Application Execution Control Utilizing Ensemble Machine Learning For Discernment

Method and apparatus for constructing a neuroscience-inspired artificial neural network

Generation of API call graphs from static disassembly
