Malicious content analysis using simulated user interaction without user involvement
US-9104867-B1 · Aug 11, 2015 · US
US9294501B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9294501-B2 |
| Application number | US-201314042454-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 30, 2013 |
| Priority date | Sep 30, 2013 |
| Publication date | Mar 22, 2016 |
| Grant date | Mar 22, 2016 |
Official abstract text for this publication.
A computerized method is described in which a received object is analyzed by a malicious content detection (MCD) system to determine whether the object is malware or non-malware. The analysis may include the generation of a fuzzy hash based on a collection of behaviors for the received object. The fuzzy hash may be used by the MCD system to determine the similarity of the received object with one or more objects in previously classified/analyzed clusters. Upon detection of a “similar” object, the suspect object may be associated with the cluster and classified based on information attached to the cluster. This similarity matching provides 1) greater flexibility in analyzing potential malware objects, which may share multiple characteristics and behaviors but are also slightly different from previously classified objects and 2) a more efficient technique for classifying/assigning attributes to objects.
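The flow in the abstract (reduce the detected behaviors, hash them, compare against existing clusters, then join a cluster or start a new one) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the behavior-log format (`pid=<n> <operation> <target>`), the function names, the cluster dictionary layout, and the `0.7` threshold are all hypothetical. The "fuzzy hash" here is a stand-in (a set of short per-operation digests compared with Jaccard similarity) rather than a production fuzzy-hashing algorithm such as ssdeep.

```python
import hashlib
import re

# Hypothetical behavior-log format: "pid=<n> <operation> <target>".
PID_FIELD = re.compile(r"\bpid=\d+\s*")


def reduce_behaviors(log_lines):
    """Keep the operation text and drop process identifiers (metadata)."""
    return [PID_FIELD.sub("", line).strip() for line in log_lines]


def fuzzy_hash(reduced):
    """Stand-in fuzzy hash: a set of short digests, one per operation."""
    return frozenset(
        hashlib.sha256(op.encode("utf-8")).hexdigest()[:8] for op in reduced
    )


def similarity(a, b):
    """Jaccard similarity of two digest sets, in the range [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def assign(obj_hash, clusters, threshold=0.7):
    """Join the most similar cluster if above threshold, else create one.

    Returns the index of the cluster the object ended up in. A similarity
    exactly equal to the threshold starts a new cluster here; that choice
    is an assumption of this sketch.
    """
    best_idx, best_sim = None, 0.0
    for i, cluster in enumerate(clusters):
        sim = max(similarity(obj_hash, m) for m in cluster["members"])
        if sim > best_sim:
            best_idx, best_sim = i, sim
    if best_idx is not None and best_sim > threshold:
        clusters[best_idx]["members"].append(obj_hash)
        return best_idx
    clusters.append({"label": "unknown", "members": [obj_hash]})
    return len(clusters) - 1
```

Because process identifiers are stripped before hashing, two runs of the same sample that differ only in their PIDs produce the same digest set, which is the kind of tolerance to minor differences that the abstract describes.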
Opening claim text (preview).
What is claimed:

1. A computerized method for classifying objects in a malware system, comprising: receiving, by a malicious content detection (MCD) system from a client device, an object to be classified; detecting behaviors of the received object, wherein the behaviors are detected after processing the received object; generating a fuzzy hash for the received object based on the detected behaviors, the generating of the fuzzy hash comprises (i) obtaining a reduced amount of data associated with the detected behaviors by retaining a portion of the data associated with the detected behaviors that corresponds to one or more operations conducted during processing of the received object, and removing metadata associated with the one or more operations conducted during the processing of the received object, the metadata including at least one or more identifiers of processes called during the processing of the received object, and (ii) performing a hash operation on the reduced amount of data associated with the detected behaviors; comparing the fuzzy hash for the received object with a fuzzy hash of an object in a preexisting cluster to generate a similarity measure; associating the received object with the preexisting cluster in response to determining that the similarity measure is above a predefined threshold value; creating a new cluster for the received object in response to determining that the similarity measure is below the predefined threshold value; and reporting, by the MCD system, results of either (i) the associating of the received object with the preexisting cluster or (ii) the creating of the new cluster.

2. The computerized method of claim 1, wherein the received object is at least one of a file, a uniform resource locator, a web object, a capture of network traffic for a user over time, and an email message.

3. The computerized method of claim 1, wherein the removed metadata associated with the corresponding operations includes metadata associated with one or more of (1) network calls, (2) modifications to a registry, (3) modifications to a file system, or (4) an application program interface call.

4. The computerized method of claim 1, further comprising: generating a preliminary malware score for the received object based on a comparison of the reduced amount of data associated with the detected behaviors with data associated with known malware behaviors, wherein the preliminary malware score indicates the probability the received object is malware; and generating a final malware score for the received object based on the cluster the received object is associated, wherein the final malware score is greater than the preliminary malware score when the received object is associated with a cluster of objects classified as malware and the final malware score is less than the preliminary malware score when the received object is associated with a cluster of objects classified as non-malware.

5. The computerized method of claim 1, wherein the removing of the metadata associated with the one or more operations comprises removing data that does not identify the received object.

6. The computerized method of claim 5, wherein the removing of the metadata further comprises removing at least a portion of values written to a registry by the received object.

7. The computerized method of claim 1, further comprising: transmitting, by the MCD system, the new cluster or the preexisting cluster with the newly associated received object to another MCD system.

8. The computerized method of claim 1, further comprising: classifying the received object as malware, non-malware, or with an unknown status to match a classification of the preexisting cluster, when the received object is assigned to the preexisting cluster.

9. The computerized method of claim 1, further comprising: assigning a malware family name to the received object to match a malware family name of the preexisting cluster, when the received object is assigned to the preexisting cluster.

10. The computerized method of claim 1, wherein the generating of the fuzzy hash further comprises at least one of (a) retaining one or more image paths in an associated file system corresponding to a location of a file that is generated or modified during the processing of the received object or (b) removing a file name prior to performing the hash operation on the data associated with the detected behaviors.

11. The computerized method of claim 1, wherein the generating of the fuzzy hash comprises retaining only the one or more image paths corresponding to operations conducted during processing of the received object as part of the data associated with the detected behaviors.

12. A non-transitory storage medium including instructions that, when executed by one or more hardware processors, performs a plurality of operations, comprising: detecting behaviors of a received object, wherein the behaviors are detected after processing the received object; generating a fuzzy hash for the received object based on the detected behaviors, the generating of the fuzzy hash comprises (i) obtaining a reduced amount of data associated with the detected behaviors by retaining a portion of the data associated with the detected behaviors that corresponds to one or more operations conducted during processing of the received object, and removing metadata associated with the one or more operations conducted during the processing of the received object, the metadata including at least one or more identifiers of processes called during the processing of the received object metadata, and (ii) performing a hash operation on the reduced amount of data associated with the detected behaviors; comparing the fuzzy hash for the received object with a fuzzy hash of an object in a preexisting cluster to generate a similarity measure; associating the received object with the preexisting cluster in response to determining that the similarity measure is above a predefined threshold value; creating a new cluster for the received object in response to determining that the similarity measure is below the predefined threshold value; and reporting results of either (i) the associating of the received object with the preexisting cluster or (ii) the creating of the new cluster.

13. The non-transitory storage medium of claim 12, wherein the received object is one of a file, a uniform resource locator, a web object, a capture of network traffic for a user over time, and an email message.

14. The non-transitory storage medium of claim 12, wherein the removed metadata associated with the one or more operations includes metadata associated with one or more of (1) network calls, (2) modifications to a registry, (3) modifications to a file system, or (4) an application program interface call.

15. The non-transitory storage medium of claim 12 further includes instructions that, when executed by the one or more hardware processors, perform a plurality of operations comprising: generating a preliminary malware score for the received object based on a comparison of the reduced amount of data associated with the detected behaviors with data associated with known malware behaviors, wherein the preliminary malware score indicates the probability the received object is malware; and generating a final malware score for the received object based on the cluster the received object is associated, wherein the final malware score is greater than the preliminary malware score when the received object is associated with a cluster of objects classified as malware and the final malware score is less than the pr
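The score adjustment recited in claims 4 and 15 (a final score above the preliminary score for a malware cluster, below it for a non-malware cluster) might be sketched as below. The claims do not say how the adjustment is computed, so the function name, the label strings, and the `weight` parameter are hypothetical, and the interpolation formula is just one simple choice. The strict "greater than" and "less than" relationships hold in this sketch only when the preliminary score lies strictly between 0 and 1.

```python
def final_score(preliminary, cluster_label, weight=0.5):
    """Shift a preliminary malware probability toward the cluster's verdict.

    `weight` (hypothetical, in (0, 1]) controls how far the score moves.
    Objects in clusters with any other label keep their preliminary score.
    """
    if not 0.0 <= preliminary <= 1.0:
        raise ValueError("preliminary score must be a probability in [0, 1]")
    if cluster_label == "malware":
        # Move part of the way toward 1.0.
        return preliminary + (1.0 - preliminary) * weight
    if cluster_label == "non-malware":
        # Move part of the way toward 0.0.
        return preliminary * (1.0 - weight)
    return preliminary
```

Interpolating toward 0 or 1 keeps the result a valid probability without clamping, which is why it is used here instead of adding or subtracting a fixed offset.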
Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities · CPC title
the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms · CPC title
Computer malware detection or handling, e.g. anti-virus arrangements · CPC title
by monitoring network traffic (monitoring network traffic per se H04L43/00) · CPC title
Test or assess a computer or a system · CPC title