Systems and methods for identifying suspicious singleton files using correlational predictors

US10073983B1 · US · B1

Patent metadata
Field | Value
Publication number | US-10073983-B1
Application number | US-201514966502-A
Country | US
Kind code | B1
Filing date | Dec 11, 2015
Priority date | Dec 11, 2015
Publication date | Sep 11, 2018
Grant date | Sep 11, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosed computer-implemented method for identifying suspicious singleton files using correlational predictors may include (1) identifying a set of known-clean computing devices that include no singleton files, (2) detecting at least one software component that is installed on a threshold number of the known-clean computing devices, (3) identifying an unvindicated computing device whose infection status is unknown, (4) determining that, in addition to being installed on the threshold number of known-clean computing devices, the software component is installed on the unvindicated computing device, (5) determining that the unvindicated computing device includes at least one singleton file, and then (6) classifying the singleton file as suspicious in response to determining that (A) the software component is installed on the unvindicated computing device and (B) the unvindicated computing device includes the singleton file. Various other methods, systems, and computer-readable media are also disclosed.
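The six steps in the abstract can be sketched in a few lines of Python. This is only an illustrative reading of the abstract, not the patented implementation: the `Device` record, the function names, the device and component names, and the threshold value are all hypothetical, and the security action the claims mention is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical per-device record (not defined in the patent)."""
    name: str
    components: set
    singleton_files: set = field(default_factory=set)


def predictor_components(clean_devices, threshold):
    """Steps 1-2: components installed on at least `threshold` known-clean
    devices that include no singleton files."""
    counts = {}
    for device in clean_devices:
        if device.singleton_files:
            continue  # only singleton-free clean devices are counted
        for component in device.components:
            counts[component] = counts.get(component, 0) + 1
    return {c for c, n in counts.items() if n >= threshold}


def suspicious_singletons(device, predictors):
    """Steps 3-6: if an unvindicated device has a predictor component
    installed, every singleton file on it is classified as suspicious."""
    if device.components & predictors:
        return set(device.singleton_files)
    return set()


# Illustrative data only; all names below are made up.
clean = [
    Device("pc-1", {"office_suite", "pdf_reader"}),
    Device("pc-2", {"office_suite"}),
    Device("pc-3", {"office_suite", "compiler"}, {"build_tmp.exe"}),
]
predictors = predictor_components(clean, threshold=2)
unknown = Device("pc-9", {"office_suite"}, {"x7f3.exe"})
flagged = suspicious_singletons(unknown, predictors)
```

In this sketch, `pc-3` is skipped when counting because it already holds a singleton file, so only `office_suite` reaches the threshold and acts as the correlational predictor.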

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for identifying suspicious singleton files using correlational predictors, at least a portion of the method being performed by a computing device comprising at least one processor, the method comprising: identifying a set of known-clean computing devices that: include no singleton files; and are not infected by malware; detecting at least one software component that is installed on a threshold number of the known-clean computing devices; correlating, based on an analysis of the set of known-clean computing devices, that computing devices are benign when the computing devices have at least one software component installed and do not include singleton files; identifying an unvindicated computing device whose infection status is unknown; determining that, in addition to being installed on the threshold number of the known-clean computing devices, the software component is installed on the unvindicated computing device; determining that the unvindicated computing device includes at least one singleton file; and classifying the singleton file as suspicious in response to determining that: the software component is installed on the unvindicated computing device; and the unvindicated computing device includes the singleton file; and in response to classifying the singleton file as suspicious, performing a security action with respect to the singleton file.

2. The method of claim 1, further comprising identifying an additional set of known-clean computing devices that include a plurality of singleton files; and wherein detecting the software component that is installed on the threshold number of the known-clean computing devices comprises: detecting whether the software component is installed on any of the known-clean computing devices that include the plurality of singleton files; calculating a ratio between the set of known-clean computing devices on which the software component is installed and the additional set of known-clean computing devices on which the software component is installed; and determining that the ratio has reached a certain threshold.

3. The method of claim 1, wherein: determining that the software component is installed on the unvindicated computing device comprises determining, based at least in part on the software component being installed on the unvindicated computing device, that the unvindicated computing device is not expected to include any benign or malicious singleton files; and determining that the unvindicated computing device includes the singleton file comprises determining that the unvindicated computing device includes the singleton file even though the unvindicated computing device is not expected to include any benign or malicious singleton files.

4. The method of claim 1, wherein classifying the singleton file as suspicious comprises subjecting the singleton file to an increased level of scrutiny due at least in part to the singleton file's classification as suspicious.

5. The method of claim 1, further comprising: identifying an additional set of known-clean computing devices that include a plurality of singleton files; detecting at least one additional software component that is installed on a threshold number of the known-clean computing devices that include the plurality of singleton files; identifying an additional unvindicated computing device whose infection status is unknown; determining that, in addition to being installed on the threshold number of known-clean computing devices that include the plurality of singleton files, the additional software component is installed on the additional unvindicated computing device; determining that the additional unvindicated computing device includes at least one additional singleton file; and in response to determining that the additional unvindicated computing device includes the additional singleton file, performing at least one additional suspicion analysis on the additional singleton file to determine whether the additional singleton file is suspicious.

6. The method of claim 5, wherein the additional suspicion analysis comprises: identifying at least one attribute of the additional singleton file included on the additional unvindicated computing device; and determining whether the attribute of the additional singleton file matches at least one attribute of a cluster of benign singleton files that have been linked to the additional software component.

7. The method of claim 6, wherein determining whether the attribute of the additional singleton file matches the attribute of the cluster of benign singleton files comprises: determining that the attribute of the additional singleton file does not match the attribute of the cluster of benign singleton files; and in response to determining that the attribute of the additional singleton file does not match the attribute of the cluster of benign singleton files, classifying the additional singleton file as suspicious.

8. The method of claim 7, wherein: determining that the additional software component is installed on the additional unvindicated computing device comprises determining, based at least in part on the additional software component being installed on the additional unvindicated computing device, that the additional unvindicated computing device is expected to include only singleton files that have the attribute of the cluster of benign singleton files; determining that the additional unvindicated computing device includes the additional singleton file comprises determining that the additional unvindicated computing device includes the additional singleton file even though the unvindicated computing device is expected to include only singleton files that have the attribute of the cluster of benign singleton files; and classifying the additional singleton file as suspicious comprises classifying the additional singleton file as suspicious due at least in part to: the unvindicated computing device being expected to include only singleton files that have the attribute of the cluster of benign singleton files; and the additional singleton file not having the attribute of the cluster of benign singleton files.

9. The method of claim 6, wherein determining whether the attribute of the additional singleton file matches the attribute of the cluster of benign singleton files comprises: determining that the attribute of the additional singleton file matches the attribute of the cluster of benign singleton files; and in response to determining that the attribute of the additional singleton file matches the attribute of the cluster of benign singleton files, classifying the additional singleton file as benign.

10. The method of claim 6, wherein the attribute of the cluster of benign singleton files comprises: a character string included in file names of the benign singleton files; a character string included in folder names in which the benign singleton files have been identified; and a file size metric that represents sizes of the benign singleton files.

11. A system for identifying suspicious singleton files using correlational predictors, the system comprising: an identification module, stored in memory, that: identifies a set of known-clean computing devices that: include no singleton files; and are not infected by malware; correlates, based on an analysis of the set of known-clean computing devices, that computing devices are benign when the computing devices have at least one software component installed and do not include singleton files; identifies an unvindicated computing device whose infection status is unknown; a detection mod
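As a rough illustration of two of the dependent claims, the sketch below shows one possible reading of the ratio test in claim 2 and the attribute matching in claims 6, 7, 9 and 10. Everything here is an assumption made for illustration: the function and class names, the representation of each device as a set of component names, the handling of a zero denominator, the choice to require all three attributes to match, and the use of a size range as the "file size metric". The patent text shown on this page does not specify any of these details.

```python
from dataclasses import dataclass


def install_ratio(component, singleton_free, singleton_bearing):
    """Claim 2 (one reading): number of singleton-free clean devices with
    the component, divided by the number of singleton-bearing clean devices
    with it. Each device is given as a set of component names."""
    free = sum(component in comps for comps in singleton_free)
    bearing = sum(component in comps for comps in singleton_bearing)
    if bearing == 0:
        # Assumption: treat "never seen alongside singletons" as infinite.
        return float("inf") if free else 0.0
    return free / bearing


@dataclass
class BenignCluster:
    """Hypothetical summary of benign singletons linked to a component,
    using the three attribute kinds listed in claim 10."""
    name_substring: str
    folder_substring: str
    min_size: int
    max_size: int


def classify_singleton(file_name, folder, size, cluster):
    """Claims 7 and 9 (one reading): a match means benign, a mismatch
    means suspicious. Requiring all three attributes is an assumption."""
    matches = (
        cluster.name_substring in file_name
        and cluster.folder_substring in folder
        and cluster.min_size <= size <= cluster.max_size
    )
    return "benign" if matches else "suspicious"


# Illustrative data only; all names and sizes below are made up.
ratio = install_ratio(
    "office_suite",
    singleton_free=[{"office_suite"}, {"office_suite", "pdf"}, {"pdf"}],
    singleton_bearing=[{"office_suite", "compiler"}],
)
cluster = BenignCluster("obj_", "build", 1_000, 50_000)
first = classify_singleton("obj_91ac.dll", "C:/proj/build/tmp", 20_000, cluster)
second = classify_singleton("x7f3.exe", "C:/Users/a/AppData", 20_000, cluster)
```

A caller would compare `ratio` against whatever threshold it has chosen before treating the component as a predictor; the claim only says that the ratio "has reached a certain threshold" and does not give a value.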

Assignees

Symantec Corp

Inventors

Classifications

  • to a system of files or objects, e.g. local or distributed file system or database · CPC title

  • G06F21/565 · Primary

    by checking file integrity · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10073983B1 cover?
The disclosed computer-implemented method for identifying suspicious singleton files using correlational predictors may include (1) identifying a set of known-clean computing devices that include no singleton files, (2) detecting at least one software component that is installed on a threshold number of the known-clean computing devices, (3) identifying an unvindicated computing device whose in…
Who is the assignee on this patent?
Symantec Corp
What technology area does this patent fall under?
Primary CPC classification G06F21/6218. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 11, 2018 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).