Identifying malicious communication channels in network traffic by generating data based on adaptive sampling

US10440035B2 · US · B2

Patent metadata
Publication number: US-10440035-B2
Application number: US-201514955480-A
Country: US
Kind code: B2
Filing date: Dec 1, 2015
Priority date: Dec 1, 2015
Publication date: Oct 8, 2019
Grant date: Oct 8, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Identifying malicious communications by generating data representative of network traffic based on adaptive sampling includes, at a computing device having connectivity to a network, obtaining a set of data flows representing network traffic between one or more nodes in the network and one or more domains outside of the network, wherein each data flow in the set of data flows includes a plurality of data packets. One or more features are extracted from the set of data flows based on statistical measurements of the set of data flows. The set of data flows are adaptively sampled based on at least the one or more features. Then, data representative of the network traffic is generated based on the adaptively sampling to identify malicious communication channels in the network traffic.
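The pipeline in the abstract (extract features, then sample adaptively) can be illustrated with a minimal sketch. It is not the patented implementation: the `Flow` record, its field names, the inverse-frequency rarity weight, and the use of Efraimidis-Spirakis weighted sampling are all assumptions chosen for illustration. The count and entropy features follow the kinds of features named in the dependent claims.

```python
import math
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    # Hypothetical flow record; the field names are illustrative only.
    src_ip: str
    dst_domain: str
    dst_port: int
    packets: int


def count_feature(flows):
    """Count feature: number of flows that share each destination domain."""
    return Counter(f.dst_domain for f in flows)


def entropy_feature(values):
    """Entropy feature: Shannon entropy (in bits) of one measurement."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def adaptive_sample(flows, k, rng):
    """Select k whole flows without replacement, skewed toward rare domains.

    Assumed rarity weight: w = 1 / (flows to the same domain). With
    Efraimidis-Spirakis sampling the key is u ** (1 / w) = u ** count,
    and the k largest keys are kept.
    """
    counts = count_feature(flows)
    keyed = [(rng.random() ** counts[f.dst_domain], f) for f in flows]
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in keyed[:k]]


# Usage: 98 flows to a common domain plus one flow to a rare domain.
flows = [Flow("10.0.0.1", "cdn.example", 443, 20) for _ in range(98)]
flows.append(Flow("10.0.0.2", "rare.example", 8080, 3))
sample = adaptive_sample(flows, 10, random.Random(0))
port_entropy = entropy_feature(f.dst_port for f in flows)
```

A uniform random sample of 10 out of 99 flows would include the rare flow only about 10% of the time. Weighting by inverse frequency deliberately skews the sample so that the rare flow is almost always kept, while whole flows (not individual packets) remain the unit of selection.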

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: at a computing device having connectivity to a network, obtaining a set of data flows representing network traffic between one or more nodes in the network and one or more domains outside of the network, each data flow in the set of data flows including a plurality of data packets; extracting one or more features from the set of data flows based on statistical measurements of the set of data flows; analyzing the one or more features extracted from a plurality of data flows in the set of data flows to identify statistically rare data flows in the set of data flows; subsequent to the analyzing, adaptively sampling the set of data flows by selecting specific whole flows included in the set of data flows based on the analyzing, wherein the selecting generates an enriched random sample of the set of data flows by deliberately skewing a distribution of a random sample to cover the statistically rare data flows included in the set of data flows; and generating data representative of the network traffic based on the adaptively sampling to identify malicious communication channels in the network traffic.

2. The method of claim 1, wherein the malicious communication channels are associated with a command and control network.

3. The method of claim 1, wherein the generating data further comprises: generating a first set of communication mappings representative of network traffic for a first user in the network; generating a second set of communication mappings representative of network traffic for a second user in the network; and combining the first set of communication mappings and the second set of communication mappings when the first set of communication mappings is related to the second set of communication mappings.

4. The method of claim 3, wherein the first set of communication mappings is related to the second set of communication mappings when the first set of communication mappings and the second set of communication mappings have a predetermined number of the malicious communication channels in common.

5. The method of claim 1, wherein the statistical measurements comprise at least one of: source Internet Protocol (IP) address of the data flow, destination IP address of the data flow, source port of the data flow, destination port of the data flow, protocol of the data flow, number of data packets transferred in the data flow, and timestamp of the data flow.

6. The method of claim 1, wherein the one or more features comprise: one or more count features that indicate a number of data flows that are related based on the statistical measurements.

7. The method of claim 1, wherein the one or more features comprise: one or more entropy features that indicate entropy of a statistical measurement over the set of data flows.

8. A system comprising: a network including a plurality of nodes; and a computing device having connectivity to the network and configured to: obtain a set of data flows representing network traffic between one or more nodes in the network and one or more domains outside of the network, each data flow in the set of data flows including a plurality of data packets; extract one or more features from the set of data flows based on statistical measurements of the set of data flows; analyze the one or more features extracted from a plurality of data flows in the set of data flows to identify statistically rare data flows in the set of data flows; subsequent analysis of the one or more features, adaptively sample the set of data flows by selecting specific whole flows included in the set of data flows based on the analysis of the one or more features, wherein the selecting generates an enriched random sample of the set of data flows by deliberately skewing a distribution of a random sample to cover the statistically rare data flows included in the set of data flows; and generate data representative of the network traffic based on the adaptively sampling to identify malicious communication channels in the network traffic.

9. The system of claim 8, wherein the malicious communication channels are associated with a command and control network.

10. The system of claim 8, wherein the computing device is further configured to: generate a first set of communication mappings representative of network traffic for a first user in the network; generate a second set of communication mappings representative of network traffic for a second user in the network; and combine the first set of communication mappings and the second set of communication mappings when the first set of communication mappings is related to the second set of communication mappings.

11. The system of claim 10, wherein the first set of communication mappings is related to the second set of communication mappings when the first set of communication mappings and the second set of communication mappings share a predetermined number of the malicious communication channels.

12. The system of claim 8, wherein the statistical measurements comprise at least one of: source Internet Protocol (IP) address of the data flow, destination IP address of the data flow, source port of the data flow, destination port of the data flow, protocol of the data flow, number of data packets transferred in the data flow, and timestamp of the data flow.

13. The system of claim 8, wherein the one or more features comprise: one or more count features that indicate a number of data flows that are related based on the statistical measurements.

14. The system of claim 8, wherein the one or more features comprise: one or more entropy features that indicate entropy of a certain statistical measurement over the set of data flows.

15. A non-transitory computer-readable storage media encoded with software comprising computer executable instructions and when the software is executed operable to: obtain a set of data flows representing network traffic between one or more nodes in a network and one or more domains outside of the network, each data flow in the set of data flows including a plurality of data packets; extract one or more features from the set of data flows based on statistical measurements of the set of data flows; analyze the one or more features extracted from a plurality of data flows in the set of data flows to identify statistically rare data flows in the set of data flows; subsequent analysis of the one or more features, adaptively sample the set of data flows by selecting specific whole flows included in the set of data flows based on the analysis of the one or more features, wherein the selecting generates an enriched random sample of the set of data flows by deliberately skewing a distribution of a random sample to cover the statistically rare data flows included in the set of data flows; and generate data representative of the network traffic based on the adaptively sampling to identify malicious communication channels in the network traffic.

16. The non-transitory computer-readable storage media of claim 15, wherein the malicious communication channels are associated with a command and control network.

17. The non-transitory computer-readable storage media of claim 15, wherein the instructions operable to generate further comprise instructions operable to: generate a first set of communication mappings representative of network traffic for a first user in the network; generate a second set of communication mappings representative of network traffic for a second user in the network; and combine the first set of communication mappings
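Claims 3, 4, 10, and 11 describe combining two users' communication mappings when they have a predetermined number of malicious communication channels in common. A minimal sketch of that rule follows, assuming each mapping is modeled as a plain set of channel identifiers already flagged as malicious. The set representation, the function name `combine_if_related`, and the `min_shared` parameter are hypothetical and do not come from the patent text.

```python
def combine_if_related(first, second, min_shared):
    """Return the union of two mappings if they are related, else None.

    Assumption: each mapping is a set of malicious channel identifiers
    (for example, domain names) observed for one user. Two mappings are
    treated as related when they share at least `min_shared` identifiers.
    """
    shared = first & second
    if len(shared) >= min_shared:
        return first | second
    return None


# Usage: two users who both contacted the same two flagged domains.
user_a = {"c2-one.example", "c2-two.example", "drop.example"}
user_b = {"c2-one.example", "c2-two.example", "exfil.example"}
combined = combine_if_related(user_a, user_b, min_shared=2)
```

Returning `None` for unrelated mappings keeps the two cases distinct for the caller. Other representations, such as graphs keyed by user and destination, would fit the claim language equally well.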

Assignees

Inventors

Classifications

  • by adaptive sampling · CPC title

  • Event detection, e.g. attack signature detection · CPC title

  • Detection or countermeasures against botnets · CPC title

  • Filtering by address, protocol, port number or service, e.g. IP-address or URL · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10440035B2 cover?
Identifying malicious communications by generating data representative of network traffic based on adaptive sampling includes, at a computing device having connectivity to a network, obtaining a set of data flows representing network traffic between one or more nodes in the network and one or more domains outside of the network, wherein each data flow in the set of data flows includes a plurali…
Who is the assignee on this patent?
Cisco Tech Inc
What technology area does this patent fall under?
Primary CPC classification H04L63/1416. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Oct 8, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).