Computing platform security methods and apparatus
US-2016328562-A1 · Nov 10, 2016 · US
US10853489B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10853489-B2 |
| Application number | US-201816165051-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 19, 2018 |
| Priority date | Oct 19, 2018 |
| Publication date | Dec 1, 2020 |
| Grant date | Dec 1, 2020 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Techniques are provided for data-driven ensemble-based malware detection. An exemplary method comprises obtaining a file; extracting metadata from the file; obtaining a plurality of malware detection procedures; selecting a subset of the plurality of malware detection procedures to apply to the file utilizing a likelihood that each of the plurality of malware detection procedures will result in a malware detection for the file based on the extracted metadata; applying the selected subset of the malware detection procedures to the file; and processing results of the subset of malware detection procedures using a machine learning model to determine a probability of the file being malware.
Opening claim text (preview).
What is claimed is: 1. A method, comprising: obtaining a file; extracting metadata from the file; obtaining a plurality of malware detection procedures; selecting, using at least one processing device, a subset of the plurality of malware detection procedures to apply to the file utilizing a likelihood that each of the plurality of malware detection procedures will result in a malware detection for the file based on the extracted metadata; applying, using the at least one processing device, the selected subset of the malware detection procedures to the file; and processing, using the at least one processing device, results of the subset of malware detection procedures using a machine learning model to determine a probability of the file being malware. 2. The method of claim 1 , wherein the step of selecting the subset of the malware detection procedures to apply to the file employs a Bayesian model that determines a probability that a given malware detection procedure will detect malware in the given file based on one or more historical executions of the given malware detection procedure and characteristics of historical files on which the given malware detection procedure was previously executed. 3. The method of claim 2 , further comprising the step of updating the Bayesian model as new files are tested by the given malware detection procedure. 4. The method of claim 2 , further comprising the step of obtaining a configuration of one or more of a substantially maximum number of malware detection procedures to be executed for a given file, a detection probability threshold, and one or more metadata features to be used for training the Bayesian model. 5. The method of claim 1 , wherein the step of processing the results of the subset of the malware detection procedures using the machine learning model employs a supervised machine learning model that processes the results of the subset of the malware detection procedures as an input and models relationships within the results to generate a health score indicating whether the file is malware. 6. The method of claim 5 , wherein the supervised machine learning model is trained using a plurality of historical files classified as malware as positive examples and a plurality of historical files classified as non-malicious as negative examples. 7. The method of claim 1 , further comprising the step of generating one or more alerts for a detected malware based on one or more of a user configuration, at least one predefined rule and a predefined policy. 8. A system, comprising: a memory; and at least one processing device, coupled to the memory, operative to implement the following steps: obtaining a file; extracting metadata from the file; obtaining a plurality of malware detection procedures; selecting a subset of the plurality of malware detection procedures to apply to the file utilizing a likelihood that each of the plurality of malware detection procedures will result in a malware detection for the file based on the extracted metadata; applying the selected subset of the malware detection procedures to the file; and processing results of the subset of malware detection procedures using a machine learning model to determine a probability of the file being malware. 9. The system of claim 8 , wherein the step of selecting the subset of the malware detection procedures to apply to the file employs a Bayesian model that determines a probability that a given malware detection procedure will detect malware in the given file based on one or more historical executions of the given malware detection procedure and characteristics of historical files on which the given malware detection procedure was previously executed. 10. The system of claim 9 , further comprising the steps of updating the Bayesian model as new files are tested by the given malware detection procedure and obtaining a configuration of one or more of a substantially maximum number of malware detection procedures to be executed for a given file, a detection probability threshold, and one or more metadata features to be used for training the Bayesian model. 11. The system of claim 8 , wherein the step of processing the results of the subset of the malware detection procedures using the machine learning model employs a supervised machine learning model that processes the results of the subset of the malware detection procedures as an input and models relationships within the results to generate a health score indicating whether the file is malware. 12. The system of claim 11 , wherein the supervised machine learning model is trained using a plurality of historical files classified as malware as positive examples and a plurality of historical files classified as non-malicious as negative examples. 13. The system of claim 8 , further comprising the step of generating one or more alerts for a detected malware based on one or more of a user configuration, at least one predefined rule and a predefined policy. 14. A computer program product, comprising a non-transitory machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by at least one processing device perform the following steps: obtaining a file; extracting metadata from the file; obtaining a plurality of malware detection procedures; selecting a subset of the plurality of malware detection procedures to apply to the file utilizing a likelihood that each of the plurality of malware detection procedures will result in a malware detection for the file based on the extracted metadata; applying the selected subset of the malware detection procedures to the file; and processing results of the subset of malware detection procedures using a machine learning model to determine a probability of the file being malware. 15. The computer program product of claim 14 , wherein the step of selecting the subset of the malware detection procedures to apply to the file employs a Bayesian model that determines a probability that a given malware detection procedure will detect malware in the given file based on one or more historical executions of the given malware detection procedure and characteristics of historical files on which the given malware detection procedure was previously executed. 16. The computer program product of claim 15 , further comprising the step of updating the Bayesian model as new files are tested by the given malware detection procedure. 17. The computer program product of claim 15 , further comprising the step of obtaining a configuration of one or more of a substantially maximum number of malware detection procedures to be executed for a given file, a detection probability threshold, and one or more metadata features to be used for training the Bayesian model. 18. The computer program product of claim 14 , wherein the step of processing the results of the subset of the malware detection procedures using the machine learning model employs a supervised machine learning model that processes the results of the subset of the malware detection procedures as an input and models relationships within the results to generate a health score indicating whether the file is malware. 19. The computer program product of claim 18 , wherein the supervised machine learning model is trained using a plurality of historical files classified as malware as positive examples and a plurality of historical files classified as non-malicious as negative examples. 20. The
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Machine learning · CPC title
Test or assess software · CPC title
involving event detection and direct action · CPC title
eliminating virus, restoring damaged files · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.