Flexible fingerprint for detection of malware
US-2015007319-A1 · Jan 1, 2015 · US
US9959407B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9959407-B1 |
| Application number | US-201615071049-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 15, 2016 |
| Priority date | Mar 15, 2016 |
| Publication date | May 1, 2018 |
| Grant date | May 1, 2018 |
Official abstract text for this publication.
A computer-implemented method for identifying potentially malicious singleton files may include (1) identifying a set of benign singleton files and a set of malicious singleton files, (2) obtaining, for each singleton file in the sets of benign and malicious singleton files, file identification information that identifies the singleton file, (3) using the file identification information of the singleton files from the sets of benign and malicious singleton files to train a classifier to classify unknown singleton files, (4) detecting an unclassified singleton file, (5) analyzing, with the trained classifier, information that identifies the unclassified singleton file, (6) determining, based on the analysis of the information that identifies the unclassified singleton file, that the unclassified singleton file is suspicious, and (7) triggering a security action in response to determining that the unclassified singleton file is suspicious. Various other methods, systems, and computer-readable media are also disclosed.
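The seven steps in the abstract can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name a model, so a small naive Bayes classifier over filename trigrams and path components stands in for it. The sample records, the `tokens` helper, the `NaiveBayes` class, and the `security_action` function are all hypothetical.

```python
import math
from collections import Counter


def tokens(file_info):
    """Turn (hypothetical) file identification info into crude string tokens."""
    name = file_info["filename"].lower()
    path = file_info["path"].lower()
    grams = [name[i:i + 3] for i in range(max(len(name) - 2, 1))]
    return grams + ["dir:" + part for part in path.split("/") if part]


class NaiveBayes:
    """Tiny multinomial naive Bayes; a stand-in for whatever model is used."""

    def fit(self, samples, labels):
        self.priors = Counter(labels)
        self.counts = {label: Counter() for label in self.priors}
        for sample, label in zip(samples, labels):
            self.counts[label].update(tokens(sample))
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, sample):
        total = sum(self.priors.values())
        scores = {}
        for label, counter in self.counts.items():
            # Add-one smoothing, with one extra slot for unseen tokens.
            denom = sum(counter.values()) + len(self.vocab) + 1
            score = math.log(self.priors[label] / total)
            for tok in tokens(sample):
                score += math.log((counter[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def security_action(file_info):
    """Hypothetical security action: raise an alert for a suspicious file."""
    return "ALERT: suspicious singleton file " + file_info["filename"]


# Steps 1-2: made-up labeled singleton files and their identification info.
benign = [
    {"filename": "report_2015.docx", "path": "/home/alice/documents"},
    {"filename": "setup_notes.txt", "path": "/home/bob/documents"},
]
malicious = [
    {"filename": "xk3q9z.exe", "path": "/tmp/cache"},
    {"filename": "q9zxk3.exe", "path": "/tmp/cache"},
]

# Step 3: train the classifier on both sets.
classifier = NaiveBayes().fit(
    benign + malicious,
    ["benign"] * len(benign) + ["malicious"] * len(malicious),
)

# Steps 4-7: detect an unclassified file, analyze it, decide, and act.
unknown = {"filename": "zq9k3x.exe", "path": "/tmp/cache"}
verdict = classifier.predict(unknown)
alert = security_action(unknown) if verdict == "malicious" else None
```

With these toy records, the unknown file shares the `.exe` trigrams and the `/tmp/cache` path components with the malicious set, so it is flagged and an alert string is produced. A real deployment would need far more data and a properly evaluated model.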
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for identifying potentially malicious singleton files, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising: identifying a set of benign singleton files and a set of malicious singleton files; obtaining, for each singleton file in the sets of benign and malicious singleton files, file identification information that identifies the singleton file; using the file identification information of the singleton files from the sets of benign and malicious singleton files to train a classifier to classify unknown singleton files; detecting an unclassified singleton file; analyzing, with the trained classifier, information that identifies the unclassified singleton file; determining, based on the analysis of the information that identifies the unclassified singleton file, that the unclassified singleton file is suspicious; triggering a security action in response to determining that the unclassified singleton file is suspicious.
2. The method of claim 1, wherein identifying the sets of benign and malicious singleton files comprises at least one of: filtering representative samples of benign singleton files to obtain a comparable set size of benign singleton files to malicious singleton files; filtering the representative samples of benign singleton files to obtain a smaller set size of benign singleton files than malicious singleton files.
3. The method of claim 2, wherein the representative samples of benign singleton files comprise calculated centroid values of clusters of benign singleton files.
4. The method of claim 1, wherein the singleton file within the sets of benign and malicious singleton files comprises a file different from any other file stored within a plurality of computing devices.
5. The method of claim 1, wherein the file identification information that identifies the singleton file comprises at least one of: a filename; an inverse of the filename; a file path; a size of the file; a file header; a file entropy; an external library that the singleton file uses; a function imported by the singleton file; a computing device on which the singleton file resides.
6. The method of claim 1, wherein using the file identification information to train the classifier comprises: deriving features from the file identification information; using a machine learning model to classify the features derived from the file identification information.
7. The method of claim 1, wherein analyzing the information that identifies the unclassified singleton file comprises: converting the information that identifies the unclassified singleton file into at least one feature; using the trained classifier to classify the unclassified singleton file based on the feature.
8. The method of claim 1, wherein determining that the unclassified singleton file is suspicious comprises at least one of: determining that the unclassified singleton file is classified as malicious using the trained classifier; determining that the information that identifies the unclassified singleton file is similar to a malicious singleton file in the set of malicious singleton files.
9. The method of claim 1, wherein the security action comprises at least one of: triggering an alert that the unclassified singleton file is suspicious; confirming that the unclassified singleton file is malicious; removing the unclassified singleton file from a computing device on which the unclassified singleton file resides.
10. The method of claim 9, further comprising adding the unclassified singleton file to the set of malicious singleton files in response to confirming that the unclassified singleton file is malicious.
11. A system for identifying potentially malicious singleton files, the system comprising: at least one physical processor; and a system memory having stored therein one or more computer-executable instructions that, when executed by the at least one physical processor, cause the system to perform the following: identify a set of benign singleton files and a set of malicious singleton files; obtain, for each singleton file in the sets of benign and malicious singleton files, file identification information that identifies the singleton file; use the file identification information of the singleton files from the sets of benign and malicious singleton files to train a classifier to classify unknown singleton files; detect an unclassified singleton file; analyze, with the trained classifier, information that identifies the unclassified singleton file; determine, based on the analysis of the information that identifies the unclassified singleton file, that the unclassified singleton file is suspicious; and trigger a security action in response to determining that the unclassified singleton file is suspicious.
12. The system of claim 11, wherein the system identifies the sets of benign and malicious singleton files by at least one of: filtering representative samples of benign singleton files to obtain a comparable set size of benign singleton files to malicious singleton files; filtering the representative samples of benign singleton files to obtain a smaller set size of benign singleton files than malicious singleton files.
13. The system of claim 12, wherein the representative samples of benign singleton files comprise calculated centroid values of clusters of benign singleton files.
14. The system of claim 11, wherein the singleton file within the sets of benign and malicious singleton files comprises a file different from any other file stored within a plurality of computing devices.
15. The system of claim 11, wherein the file identification information that identifies the singleton file comprises at least one of: a filename; an inverse of the filename; a file path; a size of the file; a file header; a file entropy; an external library that the singleton file uses; a function imported by the singleton file; a computing device on which the singleton file resides.
16. The system of claim 11, wherein the system uses the file identification information to train the classifier by: deriving features from the file identification information; using a machine learning model to classify the features derived from the file identification information.
17. The system of claim 11, wherein the system analyzes the information that identifies the unclassified singleton file by: converting the information that identifies the unclassified singleton file into at least one feature; using the trained classifier to classify the unclassified singleton file based on the feature.
18. The system of claim 11, wherein the system determines that the unclassified singleton file is suspicious by at least one of: determining that the unclassified singleton file is classified as malicious using the trained classifier; determining that the information that identifies the unclassified singleton file is similar to a malicious singleton file in the set of malicious singleton files.
19. The system of claim 11, wherein the security action comprises at least one of: triggering an alert that the unclassified singleton file is suspicious; confirming that the unclassified singleton file is malicious; removing the unclassified singleton file from a computing device on which the unclassified singleton file resides.
20. A non-transitory computer-readable medium comprising o
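Claims 5, 6, 15, and 16 describe deriving features from file identification information such as the filename, its inverse, the file size, and the file entropy. The sketch below shows one way a few of those attributes might be computed. It rests on assumptions the claims do not settle: "an inverse of the filename" is read here as the reversed string, entropy is taken to be byte-level Shannon entropy in bits, and the function names and the dictionary layout are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def derive_features(filename: str, path: str, content: bytes) -> dict:
    """Collect a few of the claimed attributes into one feature record.

    Reading "inverse of the filename" as the reversed string is an
    assumption; the claim text does not define the term.
    """
    return {
        "filename": filename,
        "reversed_filename": filename[::-1],
        "path": path,
        "size": len(content),
        "entropy": shannon_entropy(content),
    }


# Made-up example input: a short, highly repetitive payload.
features = derive_features("a.exe", "/tmp", b"abab")
```

Packed or encrypted payloads tend to have entropy close to 8 bits per byte, which is one reason entropy is a common input for malware classifiers. The remaining claimed attributes (file header, imported functions, external libraries, host device) would require format-specific parsing and are omitted here.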
Computer malware detection or handling, e.g. anti-virus arrangements · CPC title
Physics · mapped topic
Machine learning · CPC title