Malware detection using clustering with malware source information

US9710646B1 · US · B1

Patent metadata
Publication number: US-9710646-B1
Application number: US-201313777995-A
Country: US
Kind code: B1
Filing date: Feb 26, 2013
Priority date: Feb 26, 2013
Publication date: Jul 18, 2017
Grant date: Jul 18, 2017

Abstract

Official abstract text for this publication.

Techniques for malware detection using clustering with malware source information are disclosed. In some embodiments, malware detection using clustering with malware source information includes generating a first cluster of source information associated with a first malware sample, in which the first malware sample was determined to be malware, and the first malware sample was determined to be downloaded from a first source; and determining that a second source is associated with malware based on the first cluster.

First claim

Opening claim text (preview).

What is claimed is:

1. A system for malware detection using clustering with malware source information, comprising: a processor configured to: generate a first cluster of source information associated with a first malware sample, wherein the first malware sample was determined to be malware, wherein the first malware sample was determined to be downloaded from a first source, and wherein the generating of the first cluster of source information comprises: generate a directed graph associating a plurality of source information with the first malware sample to generate the first cluster of source information, the directed graph including a set of nodes and a set of edges, at least one node of the set of nodes relating to source information of a known malware, at least one edge of the set of edges representing a relationship based a first node of the set of nodes and a second node of the set of nodes, source information of the at least one node includes a domain, an IP address, or a combination thereof; the generating of the first cluster of source information being based at least in part on a clustering algorithm, the clustering algorithm including a recursive algorithm to find samples and domains that are correlated, the directed graph including: source information of the first malware sample including at least one of a) a source domain, a source Internet Protocol (IP) address, or a combination thereof, and b) a visiting domain, a visiting IP address, or a combination thereof, the visiting domain being a domain that the first malware sample attempted to send information thereto and/or receive information therefrom, and the visiting IP address being an IP address that the first malware sample attempted to send information thereto and/or receive information therefrom; obtain a second malware sample for analysis; determine whether a second source of the second malware sample is associated with malware based on the first cluster, comprising; traverse the directed graph to determine whether the second malware sample associated with the second source is associated with the first cluster, comprising to: determine whether the second source is associated with at least one source of the first cluster of source information, comprising to: determine whether an edge of the set of edges is connected to the second source of the second malware sample, the second source having an association with a) a source domain, a source Internet Protocol (IP) address, or a combination thereof, and b) a visiting domain, a visiting IP address, or a combination thereof; and in the event that the edge of the set of edges is connected to the second source, determine that the second source is associated with at least one source of the first cluster of source information; and in the event that the second malware sample is associated with the first cluster: extract a signature from the second malware sample; store the extracted signature in a database; and send the extracted signature to a security device; and a memory coupled to the processor and configured to provide the processor with instructions.

2. The system recited in claim 1, wherein the first cluster associates related Internet Protocol (IP) address information and related domain information with the first malware sample.

3. The system recited in claim 1, wherein the first cluster is generated using a searchable graph that associates related Internet Protocol (IP) address information and related domain information with the first malware sample.

4. The system recited in claim 1, wherein the processor is further configured to: determine a domain is associated with a malware family based on an association with the first cluster.

5. The system recited in claim 1, wherein the processor is further configured to: determine an Internet Protocol (IP) address is associated with a malware family based on an association with the first cluster.

6. The system recited in claim 1, wherein the directed graph further includes the first malware sample having an association with an IP address resolved from a domain associated with the directed graph.

7. A method of malware detection using clustering with malware source information, comprising: generating, using a hardware processor, a first cluster of source information associated with a first malware sample, wherein the first malware sample was determined to be malware, wherein the first malware sample was determined to be downloaded from a first source, and wherein the generating of the first cluster of source information comprises: generating a directed graph associating a plurality of source information with the first malware sample to generate the first cluster of source information, the directed graph including a set of nodes and a set of edges, at least one node of the set of nodes relating to source information of a known malware, at least one edge of the set of edges representing a relationship based a first node of the set of nodes and a second node of the set of nodes, source information of the at least one node includes a domain, an IP address, or a combination thereof, the generating of the first cluster of source information being based at least in part on a clustering algorithm, the clustering algorithm including a recursive algorithm to find samples and domains that are correlated, the directed graph including: source information of the first malware sample including at least one of a) a source domain, a source Internet Protocol (IP) address, or a combination thereof, and b) a visiting domain, a visiting IP address, or a combination thereof, the visiting domain being a domain that the first malware sample attempted to send information thereto and/or receive information therefrom, and the visiting IP address being an IP address that the first malware sample attempted to send information thereto and/or receive information therefrom; obtaining a second malware sample for analysis; determining, using the hardware processor, whether a second source of the second malware sample is associated with malware based on the first cluster, comprising; traversing the directed graph to determine whether the second malware sample with the second source is associated with the first cluster, comprising; determining whether the second source is associated with at least one source of the first cluster of source information, comprising: determining whether an edge of the set of edges is connected to the second source of the second malware sample, the second source having an association with a) a source domain, a source Internet Protocol (IP) address, or a combination thereof, and b) a visiting domain, a visiting IP address, or a combination thereof; and in the event that the edge of the set of edges is connected to the second source, determining that the second source is associated with at least one source of the first cluster of source information; and in the event that the second malware sample is associated with the first cluster: extracting a signature from the second malware sample; storing the extracted signature in a database; and sending the extracted signature to a security device.

8. A computer program product for malware detection using clustering with malware source information, the computer program product being embodied in a tangible non-transitory computer readable storage medium and comprising computer instructions for: generating a first cluster of source information associated with a first malware sample, wherein the first malware sample was determined to be malware, wherein the first malware sample was determined to be downloaded from a first source, and wherein the generating of the first cluster of source information comprises: generating a
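
The flow recited in the independent claims (build a directed graph of source information, cluster it recursively, then check whether a new sample's source connects to the cluster) can be illustrated with a small sketch. This is not the patented implementation: the class `SourceGraph`, the function `source_in_cluster`, the edge directions, and the example domains and IP address are all hypothetical choices made for illustration, and the claims do not specify them.

```python
from collections import defaultdict


class SourceGraph:
    """Illustrative directed graph linking samples to domains and IPs.

    Assumed edge convention (not taken from the patent text):
      source domain/IP -> sample   (sample was downloaded from the source)
      sample -> visited domain/IP  (sample tried to contact that host)
    """

    def __init__(self):
        self.out_edges = defaultdict(set)
        self.in_edges = defaultdict(set)

    def add_edge(self, src, dst):
        self.out_edges[src].add(dst)
        self.in_edges[dst].add(src)

    def add_sample(self, sample_id, downloaded_from=(), visited=()):
        for source in downloaded_from:
            self.add_edge(source, sample_id)
        for target in visited:
            self.add_edge(sample_id, target)

    def neighbors(self, node):
        # Follow edges in both directions so correlated nodes are reachable.
        return self.out_edges.get(node, set()) | self.in_edges.get(node, set())

    def cluster(self, seed, seen=None):
        """Recursively collect every node correlated with the seed node."""
        if seen is None:
            seen = set()
        if seed in seen:
            return seen
        seen.add(seed)
        for nxt in self.neighbors(seed):
            self.cluster(nxt, seen)
        return seen


def source_in_cluster(graph, cluster, second_source):
    """True if second_source is in the cluster or shares an edge with it."""
    if second_source in cluster:
        return True
    return any(n in cluster for n in graph.neighbors(second_source))


# Hypothetical data: two samples that contact the same visiting domain.
g = SourceGraph()
g.add_sample("sample-1", downloaded_from=["bad.example"],
             visited=["c2.example", "203.0.113.7"])
g.add_sample("sample-2", downloaded_from=["mirror.example"],
             visited=["c2.example"])

first_cluster = g.cluster("sample-1")
is_related = source_in_cluster(g, first_cluster, "mirror.example")
is_unrelated = source_in_cluster(g, first_cluster, "benign.example")
```

In this toy data, `mirror.example` joins the first cluster because `sample-2` visits the same domain as `sample-1`, while `benign.example` has no edge into the cluster. The signature extraction, database storage, and delivery to a security device recited in the claims are omitted here, since the preview text does not describe how they are performed. For large graphs an iterative traversal would avoid Python's recursion limit.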

Assignees

Palo Alto Networks Inc

Inventors

Classifications

  • for detecting or protecting against malicious traffic

  • G06F21/56 (Primary): Computer malware detection or handling, e.g. anti-virus arrangements

  • eliminating virus, restoring damaged files

  • Countermeasures against malicious traffic (countermeasures against attacks on cryptographic mechanisms H04L9/002)

  • by monitoring network traffic (monitoring network traffic per se H04L43/00)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9710646B1 cover?
Techniques for malware detection using clustering with malware source information are disclosed. In some embodiments, malware detection using clustering with malware source information includes generating a first cluster of source information associated with a first malware sample, in which the first malware sample was determined to be malware, and the first malware sample was determined to be …
Who is the assignee on this patent?
Palo Alto Networks Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/56. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 18, 2017 (B1). Legal status and post-grant events are not shown on this page.