Apparatus and method for generating prediction model based on artificial neural network
US-2017351948-A1 · Dec 7, 2017 · US
US10366233B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10366233-B1 |
| Application number | US-201615356526-A |
| Country | US |
| Kind code | B1 |
| Filing date | Nov 18, 2016 |
| Priority date | Nov 18, 2016 |
| Publication date | Jul 30, 2019 |
| Grant date | Jul 30, 2019 |
Official abstract text for this publication.
The disclosed computer-implemented method for trichotomous malware classification may include (1) identifying a sample potentially representing malware, (2) selecting a machine learning model trained on a set of samples to distinguish between malware samples and benign samples, (3) analyzing the sample using a plurality of stochastically altered versions of the machine learning model to produce a plurality of classification results, (4) calculating a variance of the plurality of classification results, and (5) classifying the sample based at least in part on the variance of the plurality of classification results. Various other methods, systems, and computer-readable media are also disclosed.
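The five steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the tiny one-hidden-layer network, its weights, the dropout keep probability, the number of passes, and both thresholds are hypothetical values chosen for the example. The "stochastically altered versions" of the model are produced here by drawing a fresh random dropout mask for each pass.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def stochastic_predictions(features, w_hidden, w_out,
                           n_passes=100, keep_prob=0.8, seed=0):
    """Score one sample with n_passes randomly masked copies of a small network.

    The network shape and every parameter here are hypothetical.
    """
    rng = np.random.default_rng(seed)
    scores = np.empty(n_passes)
    for i in range(n_passes):
        hidden = np.maximum(0.0, features @ w_hidden)   # ReLU hidden layer
        mask = rng.random(hidden.shape) < keep_prob     # fresh dropout mask per pass
        hidden = hidden * mask / keep_prob              # inverted-dropout scaling
        scores[i] = sigmoid(hidden @ w_out)             # malware probability for this pass
    return scores

def classify(scores, variance_threshold=0.02, decision_threshold=0.5):
    """Return (label, variance); high variance yields the third, 'uncertain' class."""
    variance = float(np.var(scores))
    if variance > variance_threshold:
        return "uncertain", variance
    label = "malware" if scores.mean() >= decision_threshold else "benign"
    return label, variance

# Usage with made-up features and weights.
x = np.array([1.0, 2.0])
scores = stochastic_predictions(x, np.eye(2), np.array([1.0, 1.0]))
label, variance = classify(scores)
```

Samples on which the masked copies of the model disagree strongly are routed to the "uncertain" class instead of being forced into a malware or benign verdict.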
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for trichotomous malware classification, at least a portion of the method being performed by one or more computing devices comprising at least one processor, the method comprising: identifying a sample potentially representing malware; selecting a machine learning model trained on a set of samples to distinguish between malware samples and benign samples, the machine learning model including one or more independent processing units; analyzing the sample using a plurality of stochastically altered versions of the machine learning model to produce a plurality of classification results, wherein analyzing the sample includes applying the selected machine learning model through a filter that modifies the operation of the processing units of the machine learning model dynamically as the processing units are applied to the sample; calculating a variance of the plurality of classification results; adjusting the calculated variance by accessing a precision value associated with the machine learning model and adding an inverse of the precision value to the calculated variance to derive a predictive variance of the machine learning model for the sample; and trichotomously classifying the sample based at least in part on the predictive variance of the plurality of classification results.
2. The computer-implemented method of claim 1, further comprising performing a security action to protect the one or more computing devices in response to classifying the sample.
3. The computer-implemented method of claim 1, wherein classifying the sample based at least in part on the variance of the plurality of classification results comprises classifying the sample as an uncertain sample rather than as a malware sample or a benign sample based on the variance exceeding a predetermined threshold.
4. The computer-implemented method of claim 1, wherein classifying the sample based at least in part on the variance of the plurality of classification results comprises: analyzing the sample using the machine learning model to produce a probability that the sample is a malware sample; and determining that the probability that the sample is a malware sample falls within a probability window that is defined at least in part based on the variance of the plurality of classification results.
5. The computer-implemented method of claim 1, wherein the machine learning model comprises a neural network.
6. The computer-implemented method of claim 5, wherein training the neural network comprises applying dropout regularization when training the neural network.
7. The computer-implemented method of claim 5, wherein analyzing the sample using the plurality of stochastically altered versions of the machine learning model to produce the plurality of classification results comprises generating the plurality of stochastically altered versions of the machine learning model by applying, for each stochastically altered version of the machine learning model within the plurality of stochastically altered versions of the machine learning model, a dropout mask randomly generated for the stochastically altered version of the machine learning model.
8. The computer-implemented method of claim 1, wherein the machine learning model comprises a gradient tree boosting model.
9. The computer-implemented method of claim 8, wherein analyzing the sample using the plurality of stochastically altered versions of the machine learning model to produce the plurality of classification results comprises generating the plurality of stochastically altered versions of the machine learning model by, for each stochastically altered version of the machine learning model within the plurality of stochastically altered versions of the machine learning model, randomly masking a subset of features within the stochastically altered version of the machine learning model.
10. The computer-implemented method of claim 1, wherein the machine learning model comprises a random forest model.
11. The computer-implemented method of claim 10, wherein analyzing the sample using the plurality of stochastically altered versions of the machine learning model to produce the plurality of classification results comprises: normalizing each feature within the random forest model to have zero mean and to have unit variance; and for each stochastically altered version of the machine learning model within the plurality of stochastically altered versions of the machine learning model, randomly determining, for at least one split, to replace the use of a feature at the split with the use of a different feature within the machine learning model.
12. A system for trichotomous malware classification, the system comprising: an identification module, stored in a memory, that identifies a sample potentially representing malware; a selection module, stored in the memory, that selects a machine learning model trained on a set of samples to distinguish between malware samples and benign samples, the machine learning model including one or more independent processing units; an analysis module, stored in the memory, that analyzes the sample using a plurality of stochastically altered versions of the machine learning model to produce a plurality of classification results, wherein analyzing the sample includes applying the selected machine learning model through a filter that modifies the operation of the processing units of the machine learning model dynamically as the processing units are applied to the sample; a calculation module, stored in the memory, that calculates a variance of the plurality of classification results and adjusts the calculated variance by accessing a precision value associated with the machine learning model and adding an inverse of the precision value to the calculated variance to derive a predictive variance of the machine learning model for the sample; a classifying module, stored in the memory, that trichotomously classifies the sample based at least in part on the predictive variance of the plurality of classification results; and at least one physical processor configured to execute the identification module, the selection module, the analysis module, the calculation module, and the classifying module.
13. The system of claim 12, further comprising a performing module, stored in memory, that performs a security action in response to classifying the sample.
14. The system of claim 12, wherein the classifying module classifies the sample based at least in part on the variance of the plurality of classification results by classifying the sample as an uncertain sample rather than as a malware sample or a benign sample based on the variance exceeding a predetermined threshold.
15. The system of claim 12, wherein the classifying module classifies the sample based at least in part on the variance of the plurality of classification results by: analyzing the sample using the machine learning model to produce a probability that the sample is a malware sample; and determining that the probability that the sample is a malware sample falls within a probability window that is defined at least in part based on the variance of the plurality of classification results.
16. The system of claim 12, wherein the machine learning model comprises a neural network.
17. The system of claim 16, wherein the selection module further trains the neural network by applying dropout regularization when training the neural network.
18. The system of claim 16, wherein the analysis module analyzes the sa
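The variance adjustment recited in claim 1 and the probability window recited in claim 4 can be illustrated with a short Python sketch. The claims say only that the inverse of a precision value is added to the calculated variance and that the window is defined "at least in part" by the variance. The specific window used below (centered on 0.5, with a half-width equal to a scale factor times the predictive standard deviation), the function names, and all numeric values are assumptions made for illustration.

```python
import numpy as np

def predictive_variance(scores, precision):
    """Variance of the stochastic classification results plus 1 / precision."""
    if precision <= 0:
        raise ValueError("precision must be positive")
    return float(np.var(scores)) + 1.0 / precision

def trichotomous_label(probability, pred_var, center=0.5, width_scale=1.0):
    """Label a sample malware, benign, or uncertain.

    Assumed window: [center - h, center + h] with h = width_scale * sqrt(pred_var).
    A probability inside the window is treated as uncertain.
    """
    half_width = width_scale * np.sqrt(pred_var)
    if abs(probability - center) <= half_width:
        return "uncertain"
    return "malware" if probability > center else "benign"

# Usage with made-up numbers: two stochastic results and a precision of 100.
pv = predictive_variance([0.2, 0.4], precision=100.0)
labels = [trichotomous_label(p, pv) for p in (0.9, 0.55, 0.1)]
```

Under this assumed window, a larger predictive variance widens the uncertain band, so fewer samples receive a definite malware or benign label.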
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
Probabilistic or stochastic networks · CPC title
Ensemble learning · CPC title
Static detection · CPC title