Automatic threat detection of executable files based on static data analysis
US-2016335435-A1 · Nov 17, 2016 · US
US10635814B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10635814-B2 |
| Application number | US-201816183624-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 7, 2018 |
| Priority date | Jul 15, 2015 |
| Publication date | Apr 28, 2020 |
| Grant date | Apr 28, 2020 |
Official abstract text for this publication.
In one respect, there is provided a system for training a neural network adapted for classifying one or more scripts. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one memory provides operations including: receiving a disassembled binary file that includes a plurality of instructions; processing the disassembled binary file with a convolutional neural network configured to detect a presence of one or more sequences of instructions amongst the plurality of instructions and determine a classification for the disassembled binary file based at least in part on the presence of the one or more sequences of instructions; and providing, as an output, the classification of the disassembled binary file. Related computer-implemented methods are also disclosed.
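The processing step in the abstract can be illustrated with a small NumPy sketch. It is not the patented implementation: the ReLU activation, the global max pooling, the sigmoid output, and every name and shape below (`conv1d`, `classify`, `k1`, `k2`, `w_out`) are assumptions made for illustration. The input is assumed to be a matrix with one encoded instruction per row.

```python
import numpy as np

def conv1d(x, kernels):
    """Slide each kernel over consecutive rows of x, then apply ReLU.

    x:       (seq_len, channels) matrix, one row per instruction
    kernels: (num_kernels, width, channels) weights
    returns: (seq_len - width + 1, num_kernels) activations
    """
    num_kernels, width, _ = kernels.shape
    steps = x.shape[0] - width + 1
    out = np.empty((steps, num_kernels))
    for t in range(steps):
        window = x[t:t + width]  # `width` consecutive instructions
        # Dot product of every kernel with the current window.
        out[t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)

def classify(x, k1, k2, w_out, b_out=0.0):
    """Return a score in (0, 1) for one encoded file (assumed: higher = more suspicious)."""
    h1 = conv1d(x, k1)        # first kernels respond to short instruction sequences
    h2 = conv1d(h1, k2)       # second kernels respond to sequences of those sequences
    pooled = h2.max(axis=0)   # did each pattern occur anywhere in the file?
    logit = pooled @ w_out + b_out
    return 1.0 / (1.0 + np.exp(-logit))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.eye(4)[[0, 1, 2, 3, 0, 1]]        # six instructions, four possible mnemonics
    k1 = rng.normal(size=(3, 2, 4))          # untrained, random weights
    k2 = rng.normal(size=(2, 2, 3))
    w_out = rng.normal(size=2)
    print(classify(x, k1, k2, w_out))
```

With random weights the score is meaningless; the point is only the data flow from instruction matrix to a single classification value.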
Opening claim text (preview).
What is claimed is:
1. A system, comprising: at least one processor; and at least one memory including program code which when executed by the at least one memory provides operations comprising: receiving a disassembled binary file that includes a plurality of instructions, at least a portion of the instructions being variable in length; generating fixed length representations of the plurality of instructions by at least one of truncating or padding each of the plurality of instructions to a same length; encoding the generated fixed length representations for more efficient processing by a convolutional neural network; processing the disassembled binary file with a trained convolutional neural network configured to detect a presence of one or more sequences of instructions amongst the plurality of instructions and determine a classification for the disassembled binary file based at least in part on the presence of the one or more sequences of instructions; and providing, as an output, the classification of the disassembled binary file to determine whether to execute, open, or access a binary file corresponding to the disassembled binary file; wherein the convolutional neural network is configured to: apply a first plurality of kernels to the disassembled binary file, and wherein each of the first plurality of kernels is adapted to detect a different sequence of two or more instructions; and subsequently apply a second plurality of kernels to the disassembled binary file, and wherein each of the second plurality of kernels is adapted to detect a different sequence of two or more sequences of instructions.
2. The system of claim 1, wherein the fixed length representations of the plurality of instructions includes a mnemonic associated with each instruction.
3. The system of claim 1, wherein the encoding is based on one-hot encoding or binary encoding.
4. The system of claim 1, wherein applying the first plurality of kernels includes applying a first weight matrix to a matrix representation of the disassembled binary file, and wherein the matrix representation of the disassembled binary file comprises encoded fixed length representations of the plurality of instructions included in the disassembled binary file.
5. The system of claim 4, wherein the system is further configured to train the convolutional neural network by at least: receiving a plurality of training files, wherein the plurality of training files comprises a plurality of disassembled binary files; determining a classification of a first training file by at least processing the first training file with the convolutional neural network; back propagating an error associated with the classification of the first training file; and adjusting at least the first weight matrix to minimize the error associated with the classification of the first training file.
6. The system of claim 5, wherein training the convolutional neural network further comprises: determining a classification for a second training file by at least processing the second training file with the convolutional neural network; back propagating an error associated with the classification of the second training file; and readjusting at least the first weight matrix to minimize the error associated with the classification of the second training file.
7. A computer-implemented method, comprising: receiving a disassembled binary file that includes a plurality of instructions, at least a portion of the instructions being variable in length; generating fixed length representations of the plurality of instructions by at least one of truncating or padding each of the plurality of instructions to a same length; encoding the generated fixed length representations for more efficient processing by a convolutional neural network; processing the disassembled binary file with a trained convolutional neural network configured to detect a presence of one or more sequences of instructions amongst the plurality of instructions and determine a classification for the disassembled binary file based at least in part on the presence of the one or more sequences of instructions; and providing, as an output, the classification of the disassembled binary file to determine whether to execute, open, or access a binary file corresponding to the disassembled binary file; wherein the convolutional neural network is configured to: apply a first plurality of kernels to the disassembled binary file, and wherein each of the first plurality of kernels is adapted to detect a different sequence of two or more instructions; and subsequently apply a second plurality of kernels to the disassembled binary file, and wherein each of the second plurality of kernels is adapted to detect a different sequence of two or more sequences of instructions.
8. The method of claim 7, wherein the fixed length representations of the plurality of instructions includes a mnemonic associated with each instruction.
9. The method of claim 7, wherein the encoding is based on one-hot encoding or binary encoding.
10. The method of claim 7, wherein applying the first plurality of kernels includes applying a first weight matrix to a matrix representation of the disassembled binary file, and wherein the matrix representation of the disassembled binary file comprises encoded fixed length representations of the plurality of instructions included in the disassembled binary file.
11. The method of claim 10, further comprising training the convolutional neural network by at least: receiving a plurality of training files, wherein the plurality of training files comprises a plurality of disassembled binary files; determining a classification of a first training file by at least processing the first training file with the convolutional neural network; back propagating an error associated with the classification of the first training file; adjusting at least the first weight matrix to minimize the error associated with the classification of the first training file.
12. The method of claim 11, wherein training the convolutional neural network further comprises: determining a classification for a second training file by at least processing the second training file with the convolutional neural network; back propagating an error associated with the classification of the second training file; and readjusting at least the first weight matrix to minimize the error associated with the classification of the second training file.
13. A computer-implemented method, comprising: receiving a disassembled binary file that includes a plurality of instructions, at least a portion of the instructions being variable in length; generating fixed length representations of the plurality of instructions by at least one of truncating or padding each of the plurality of instructions to a same length; encoding the generated fixed length representations for more efficient processing by a convolutional neural network; processing the disassembled binary file with a trained convolutional neural network configured to apply two different pluralities of kernels in sequence to detect a presence of one or more sequences of instructions amongst the plurality of instructions and determine a classification for the disassembled binary file based at least in part on the presence of the one or more sequences of instructions, each kernel being configured to detect a specific, different sequence of instructions; and providing, as an output, the classification of the disassembled binary file to determine whether to execute, open, or access a binary file corresponding to the disassembled binary
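The claims recite truncating or padding each variable-length instruction to a same length and then encoding the result, for example with one-hot encoding. A minimal Python sketch of one way this could look follows. The token-level splitting, the fixed length of two tokens, the `<pad>` and `<unk>` tokens, and the helper names are all assumptions for illustration; the claim text shown here does not specify them.

```python
import numpy as np

PAD = "<pad>"  # hypothetical padding token
UNK = "<unk>"  # hypothetical token for anything outside the vocabulary

def to_fixed_length(instruction, length=2):
    """Truncate or pad one disassembled instruction to exactly `length` tokens.

    The mnemonic is always the first token, so it survives truncation.
    """
    tokens = instruction.replace(",", " ").split()
    return (tokens + [PAD] * length)[:length]

def build_vocab(rows):
    """Map every token seen in `rows` to an integer index; index 0 is UNK."""
    tokens = sorted({tok for row in rows for tok in row} - {UNK})
    return {tok: i for i, tok in enumerate([UNK] + tokens)}

def one_hot_encode(rows, vocab):
    """Encode fixed-length token rows as a (num_instructions, length * vocab_size) matrix."""
    length, size = len(rows[0]), len(vocab)
    out = np.zeros((len(rows), length * size))
    for r, row in enumerate(rows):
        for p, tok in enumerate(row):
            out[r, p * size + vocab.get(tok, vocab[UNK])] = 1.0
    return out

if __name__ == "__main__":
    listing = ["push ebp", "mov ebp, esp", "ret"]
    rows = [to_fixed_length(ins) for ins in listing]
    matrix = one_hot_encode(rows, build_vocab(rows))
    print(rows)
    print(matrix.shape)
```

Here `mov ebp, esp` is truncated to `mov ebp` and `ret` is padded to `ret <pad>`, so every row of the resulting matrix has the same width regardless of the original instruction length.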
Static detection · CPC title
Test or assess a computer or a system · CPC title
by checking file integrity · CPC title
Learning methods · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
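The training steps recited in claims 5, 6, 11, and 12 (classify a training file, back propagate the error, adjust the first weight matrix, repeat for the next file) can be sketched with a deliberately small model. Everything here is an assumption for illustration: a single convolutional layer with ReLU and global max pooling, a sigmoid output with cross-entropy error, a fixed output weight vector `w_out`, and plain gradient descent with learning rate `lr`. Only the first weight matrix `w1` is updated, which is the minimum the claims require.

```python
import numpy as np

def forward(x, w1, w_out):
    """Classify one encoded file; also return what the backward pass needs."""
    width = w1.shape[1]
    steps = x.shape[0] - width + 1
    windows = np.stack([x[t:t + width] for t in range(steps)])      # (steps, width, C)
    acts = np.maximum(np.einsum("twc,kwc->tk", windows, w1), 0.0)   # ReLU activations
    best = acts.argmax(axis=0)                   # strongest position per kernel
    pooled = acts[best, np.arange(w1.shape[0])]  # global max pooling
    prob = 1.0 / (1.0 + np.exp(-(pooled @ w_out)))
    return prob, pooled, windows[best]

def train_step(x, label, w1, w_out, lr=0.5):
    """One back-propagation step; returns the adjusted first weight matrix."""
    prob, pooled, best_windows = forward(x, w1, w_out)
    d_logit = prob - label                       # cross-entropy gradient at the output
    d_pooled = d_logit * w_out * (pooled > 0)    # pass back through ReLU and max pooling
    grad_w1 = d_pooled[:, None, None] * best_windows
    return w1 - lr * grad_w1, prob

if __name__ == "__main__":
    # Two toy "training files": (encoded instructions, label), label 1 = malicious.
    files = [(np.eye(3)[[0, 1, 2, 0]], 1.0), (np.eye(3)[[2, 2, 1, 1]], 0.0)]
    w1 = np.full((2, 2, 3), 0.1)
    w_out = np.array([1.0, -1.0])
    for x, label in files:                       # first file, then second file
        w1, prob = train_step(x, label, w1, w_out)
        print(label, float(prob))
```

A practical trainer would update every layer, iterate over many files for many epochs, and use a framework with automatic differentiation; the manual gradient above only makes the "back propagate, then adjust" loop explicit.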