What technology area does this patent fall under?

Primary CPC classification H04L63/1425. Mapped technology areas include Electricity.

When was this patent published?

Publication date: Mar 20, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Learning detector of malicious network traffic from weak labels

US9923912B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9923912-B2
Application number	US-201514960086-A
Country	US
Kind code	B2
Filing date	Dec 4, 2015
Priority date	Aug 28, 2015
Publication date	Mar 20, 2018
Grant date	Mar 20, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are presented that identify malware network communications between a computing device and a server utilizing a detector process. Network traffic records are classified as either malware or legitimate network traffic records and divided into groups of classified network traffic records associated with network communications between the computing device and the server for a predetermined period of time. A group of classified network traffic records is labeled as malicious when at least one of the classified network traffic records in the group is malicious and as legitimate when none of the classified network traffic records in the group is malicious to obtain a labeled group of classified network traffic records. A detector process is trained on individual classified network traffic records in the labeled group of classified network traffic records and network communication between the computing device and the server is identified as malware network communication utilizing the detector process.
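As a rough illustration of the grouping and labeling step described in the abstract, the following Python sketch groups records per device, server, and time window, then labels a group malicious when at least one of its records is flagged. The `Record` fields, the `label_groups` helper, and the 300-second window are assumptions made for this example; the abstract only refers to "a predetermined period of time".

```python
from collections import defaultdict
from typing import NamedTuple


class Record(NamedTuple):
    device: str       # hypothetical field: client identifier
    server: str       # hypothetical field: server domain
    timestamp: float  # seconds since some epoch
    flagged: bool     # weak per-record classification (may be wrong)


def label_groups(records, window_seconds=300.0):
    """Group records by (device, server, time window) and label each group."""
    groups = defaultdict(list)
    for r in records:
        window = int(r.timestamp // window_seconds)
        groups[(r.device, r.server, window)].append(r)
    # A group is malicious if at least one record is flagged, else legitimate.
    return {
        key: "malicious" if any(r.flagged for r in recs) else "legitimate"
        for key, recs in groups.items()
    }


records = [
    Record("pc-1", "bad.example", 10.0, False),
    Record("pc-1", "bad.example", 20.0, True),
    Record("pc-1", "ok.example", 30.0, False),
    Record("pc-1", "bad.example", 400.0, False),
]
labels = label_groups(records)
```

In this toy input the first two `bad.example` records share a window and form one malicious group, while the later record at 400 seconds falls into a separate window with no flagged records.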

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: at a networking device, classifying network traffic records as either malware network traffic records or legitimate network traffic records, wherein a subset of the classified network traffic records is classified with flaws; dividing classified network traffic records into at least one group of classified network traffic records, the at least one group including classified network traffic records associated with network communications between a computing device and a server for a predetermined period of time; labeling the at least one group of classified network traffic records as malicious when at least one of the classified network traffic records in the at least one group is malicious or labeling the at least one group of classified network traffic records as legitimate when none of the classified network traffic records in the at least one group is malicious to obtain at least one labeled group of classified network traffic records; training a detector process on individual classified network traffic records in the at least one labeled group of classified network traffic records to learn a flow-level model based on the labeling of the at least one group of classified network traffic records, wherein the detector process is a Neyman-Pearson (NP) detector process combined with a Multi Instance Learning (MIL) algorithm; and identifying malware network communications between the computing device and the server utilizing the flow-level model of the detector process, wherein the NP detector process reduces a false negative rate of detection results to achieve a predetermined false positive rate of the detection results when identifying the malware network communication, and wherein the MIL algorithm reduces an impact of flawed classified network traffic records on an accuracy of the detector process in identifying malware network communication.

2. The method of claim 1, wherein the network traffic records include proxy logs, and wherein the classifying comprises analyzing proxy log domains of the proxy logs to classify the network traffic records.

3. The method of claim 1, wherein the classifying comprises classifying network traffic records based on blacklists, domain reputation, security reports and sandboxing analysis results.

4. The method of claim 3, further comprising: repeatedly retraining the detector process based on updated blacklists, domain reputation, security reports and sandboxing analysis results.

5. The method of claim 1, wherein a false positive rate of the detector process is determined by an instance that has a maximal distance from a malicious decision hyperplane.

6. The method of claim 1, wherein the MIL algorithm reduces a weighted sum of errors made by the detector process on the at least one labeled group of classified network traffic records and allows the subset of the classified network traffic records that is classified with flaws.

7. The method of claim 1, wherein training the detector process comprises: estimating a number of false positive detection results and a number of false negative detection results for results generated by the NP detector process; formulating a learning criterion for training the NP detector process and solving an optimization problem by using a parameter to weight the estimated numbers of the false positive detection results and the false negative detection results; randomly generating parameters for a stochastic gradient descent (SGD) function; repeatedly executing the SGD function using the randomly generated parameters thereby optimizing operating parameters of the NP detector process.

8. An apparatus comprising: one or more processors; one or more memory devices in communication with the one or more processors; and at least one network interface unit coupled to the one or more processors, wherein the one or more processors are configured to: classify network traffic records as either malware network traffic records or legitimate network traffic records, wherein a subset of the classified network traffic records is classified with flaws; divide classified network traffic records into at least one group of classified network traffic records, the at least one group including classified network traffic records associated with network communications between a computing device and a server for a predetermined period of time; label the at least one group of classified network traffic records as malicious when at least one of the classified network traffic records in the at least one group is malicious or label the at least one group of classified network traffic records as legitimate when none of the classified network traffic records in the at least one group is malicious to obtain at least one labeled group of classified network traffic records; train a detector process on individual classified network traffic records in the at least one labeled group of classified network traffic records to learn a flow-level model based on the labeling of the at least one group of classified network traffic records, wherein the detector process is a Neyman-Pearson (NP) detector process combined with a Multi Instance Learning (MIL) algorithm; and identify malware network communications between the computing device and the server utilizing the flow-level model of the detector process, wherein the NP detector process reduces a false negative rate of detection results to achieve a predetermined false positive rate of the detection results when identifying the malware network communication, and wherein the MIL algorithm reduces an impact of flawed classified network traffic records on an accuracy of the detector process in identifying malware network communication.

9. The apparatus of claim 8, wherein the network traffic records include proxy logs, and wherein the one or more processors are configured to classify network traffic records by analyzing proxy log domains of the proxy logs to classify the network traffic records.

10. The apparatus of claim 8, wherein the one or more processors are configured to classify network traffic records based on blacklists, domain reputation, security reports and sandboxing analysis results.

11. The apparatus of claim 10, wherein the one or more processors are configured to: repeatedly retrain the detector process based on updated blacklists, domain reputation, security reports and sandboxing analysis results.

12. The apparatus of claim 8, wherein a false positive rate of the detector process is determined by an instance that has a maximal distance from a malicious decision hyperplane.

13. The apparatus of claim 8, wherein the MIL algorithm reduces a weighted sum of errors made by the detector process on the at least one labeled group of classified network traffic records and allows the subset of the classified network traffic records that is classified with flaws.

14. The apparatus of claim 8, wherein the one or more processor is configured to train the detector process by: estimating a number of false positive detection results and a number of false negative detection results for results generated by the NP detector process; formulating a learning criterion for training the NP detector process and solving an optimization problem by using a parameter to weight the estimated numbers of the false positive detection results and the false negative detection results; randomly generating parameters for a stochastic gradient descent (SGD) function; repeatedly executing the SGD function using the randomly generated parameters thereby optimizing operating parameters of the
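One way to picture the training recited in claims 1 and 5 to 7 is a linear flow-level model whose group score is the maximum over the group's records, fitted by SGD on a weighted sum of false-positive and false-negative estimates. The Python below is a minimal sketch under those assumptions and is not taken from the patent: the hinge-style error estimate, the `fp_weight` parameter, the one-dimensional toy feature, and the names `bag_score` and `train` are all hypothetical.

```python
import random


def flow_score(w, b, x):
    """Linear flow-level score for one record's feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def bag_score(w, b, bag):
    """Score a group by its highest-scoring record (max over instances)."""
    return max(flow_score(w, b, x) for x in bag)


def train(bags, labels, fp_weight=0.7, lr=0.1, epochs=1000, seed=0):
    """SGD on a weighted sum of hinge-style FP and FN estimates per group."""
    rng = random.Random(seed)
    dim = len(bags[0][0])
    # Randomly generated starting parameters for SGD.
    w = [rng.uniform(-0.01, 0.01) for _ in range(dim)]
    b = rng.uniform(-0.01, 0.01)
    order = list(range(len(bags)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # The record with the maximal score decides the group's score.
            x = max(bags[i], key=lambda r: flow_score(w, b, r))
            score = flow_score(w, b, x)
            if labels[i] == 1 and score < 1.0:
                step = lr * (1.0 - fp_weight)   # missed malicious group (FN)
            elif labels[i] == 0 and score > -1.0:
                step = -lr * fp_weight          # alarm on legitimate group (FP)
            else:
                continue
            w = [wi + step * xi for wi, xi in zip(w, x)]
            b += step
    return w, b


# Toy groups: label 1 = malicious group, label 0 = legitimate group.
bags = [[[-1.0], [2.0]], [[1.5], [-0.5]], [[-1.0], [-0.8]], [[-0.5], [-1.2]]]
labels = [1, 1, 0, 0]
w, b = train(bags, labels)
scores = [bag_score(w, b, bag) for bag in bags]
```

Here `fp_weight` plays the role of the weighting parameter between the two error estimates; a Neyman-Pearson style procedure would presumably adjust it until the measured false positive rate meets the predetermined target, which this sketch does not attempt.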

Assignees

Cisco Tech Inc

Inventors

Classifications

H04L63/0281 · Proxies
G06F21/53 · by executing in a restricted environment, e.g. sandbox or secure virtual machine
H04L63/1425 (primary) · Traffic logging, e.g. anomaly detection

Patent family

Related publications grouped by family.

View patent family 58096262

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9923912B2 cover?

Techniques are presented that identify malware network communications between a computing device and a server utilizing a detector process. Network traffic records are classified as either malware or legitimate network traffic records and divided into groups of classified network traffic records associated with network communications between the computing device and the server for a predetermin…

Who is the assignee on this patent?

Cisco Tech Inc

Related patents

Dynamic feature selection for joint probabilistic recognition

Multi-Channel Change-Point Malware Detection

Expert antenna control system

Apparatus and method for classifying data and system for collecting data

Methods, systems, and computer readable media for rapid filtering of opaque data traffic

Method and apparatus for classifying applications using the collective properties of network traffic in a traffic activity graph
