US10187412B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10187412-B2 |
| Application number | US-201514946156-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 19, 2015 |
| Priority date | Aug 28, 2015 |
| Publication date | Jan 22, 2019 |
| Grant date | Jan 22, 2019 |
Official abstract text for this publication.
Techniques are presented that identify malware network communications between a computing device and a server based on a cumulative feature vector generated from a group of network traffic records associated with communications between computing devices and servers. Feature vectors are generated, each vector including features extracted from the network traffic records in the group. A self-similarity matrix is computed for each feature which is a representation of the feature that is invariant to an increase or a decrease of feature values across all feature vectors in the group. Each self-similarity matrix is transformed into corresponding histograms to be invariant to a number of network traffic records in the group. The cumulative feature vector is a cumulative representation of the predefined set of features of all network traffic records included in the at least one group of network traffic records and is generated based on the corresponding histograms.
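The abstract's pipeline (per-feature self-similarity matrix, then histogram, then cumulative feature vector) can be sketched as below. This is a minimal illustration, not the patented implementation: the absolute difference as the distance, division by the largest distance to reach [0, 1], the bin count of 4, and all function names are assumptions chosen for the example.

```python
from typing import List, Sequence


def self_similarity(values: Sequence[float]) -> List[List[float]]:
    """(i, j)-th element is the distance between the feature values of
    records i and j. Absolute difference is an assumed distance."""
    return [[abs(a - b) for b in values] for a in values]


def scale_unit(matrix: List[List[float]]) -> List[List[float]]:
    """Scale all entries into [0, 1] by dividing by the largest distance
    (assumed scaling; an all-zero matrix is returned unchanged)."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [row[:] for row in matrix]
    return [[v / peak for v in row] for row in matrix]


def histogram(matrix: List[List[float]], bins: int = 4) -> List[float]:
    """Normalized histogram of matrix entries over [0, 1]. Its length
    depends only on `bins`, not on the number of records."""
    counts = [0] * bins
    total = 0
    for row in matrix:
        for v in row:
            counts[min(int(v * bins), bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def cumulative_feature_vector(group: List[List[float]], bins: int = 4) -> List[float]:
    """Concatenate one histogram per feature. `group` holds one feature
    vector per network traffic record, all of equal length."""
    out: List[float] = []
    for feature_values in zip(*group):  # iterate over features, not records
        out.extend(histogram(scale_unit(self_similarity(feature_values)), bins))
    return out


# Hypothetical group: three records, two features each.
group = [[100.0, 2.0], [200.0, 2.0], [400.0, 5.0]]
vector = cumulative_feature_vector(group)
```

Because the matrix holds pairwise differences, adding a constant to every value of a feature leaves the result unchanged; dividing by the largest distance removes a common multiplicative factor; and the histogram discards record order while keeping the output length fixed.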
Opening claim text (preview).
What is claimed is:
1. A method comprising: at a networking device, dividing network traffic records to create at least one group of network traffic records, the at least one group including network traffic records being associated with network communications between a computing device and a server for a predetermined period of time; generating a set of feature vectors, each feature vector of the set of features vectors representing one of the network traffic records of the network communications included in the at least one group of network traffic records, wherein each feature vector comprises a predefined set of features extracted from one of the network traffic records; computing a self-similarity matrix for each feature of the predefined set of features using all feature vectors generated for the at least one group, each self-similarity matrix being a representation of one feature of the predefined set of features that is invariant to an increase or a decrease of values of the one feature across all of the feature vectors generated for the at least one group of network traffic records, each self-similarity matrix including a plurality of elements in rows and columns, wherein an (i, j)-th element of a self-similarity matrix corresponds to a distance between a feature value of an i-th network traffic record and a feature value of a j-th network traffic record; transforming each self-similarity matrix into a corresponding histogram to form a set of histograms, each histogram being a representation of the one feature that is invariant to a number of network traffic records in the at least one group of network traffic records; generating a cumulative feature vector based on the set of histograms, the cumulative feature vector being a cumulative representation of the predefined set of features of all network traffic records included in the at least one group of network traffic records; training a classifier based on the cumulative feature vector to produce a trained classifier; classifying, by the trained classifier, the at least one group as being malicious; and identifying a malware network communication between the computing device and the server utilizing the at least one classified group, wherein the cumulative feature vector enables detection of variations and modifications of the malware network communication.
2. The method of claim 1, further comprising: transforming each self-similarity matrix into a locally-scaled self-similarity matrix, each locally-scaled self-similarity matrix being a representation of the one feature of the predefined set of features that is invariant to values of the one feature across all of the feature vectors being multiplied by a common factor.
3. The method of claim 1, wherein generating the cumulative feature vector comprises concatenating the histograms in the set of histograms to form the cumulative feature vector.
4. The method of claim 1, wherein the variations and modifications of the malware network communication include a variation in one or more of: a shift of the flow-based features, a scale of the flow-based features, a permutation of the flow-based features, a number of the flow-based features, or in a size of the at least one group of network traffic records, and further comprising transforming a representation of the at least one group of network traffic records to be invariant against the variations and modifications of the malware network communication.
5. The method of claim 1, wherein the network traffic records include proxy logs and network flow reports, and wherein the predefined set of flow-based feature values includes values describing a structure of a Uniform Resource Locator (URL), a number of bytes transferred from the server to the computing device, a status of a user agent, a Hypertext Transfer Protocol (HTTP) status, a Multipurpose Internet Mail Extension (MIME) type, and a port value.
6. The method of claim 1, wherein the self-similarity matrix is a symmetric positive semidefinite matrix in which the rows and columns represent individual network communications between the computing device and the server.
7. The method of claim 6, further comprising: scaling all values in the self-similarity matrix into an interval [0,1] to produce scale invariance.
8. An apparatus comprising: one or more processors; one or more memory devices in communication with the one or more processors; and at least one network interface unit coupled to the one or more processors, wherein the one or more processors are configured to: divide network traffic records to create at least one group of network traffic records, the at least one group including network traffic records being associated with network communications between a computing device and a server for a predetermined period of time; generate a set of feature vectors, each feature vector of the set of feature vectors representing one of the network traffic records of the network communications included in the at least one group of network traffic records, wherein each feature vector comprises a predefined set of features extracted from one of the network traffic records; compute a self-similarity matrix for each feature of the predefined set of features using all feature vectors generated for the at least one group, each self-similarity matrix being a representation of one feature of the predefined set of features that is invariant to an increase or a decrease of values of the one feature across all of the feature vectors generated for the at least one group of network traffic records, each self-similarity matrix including a plurality of elements in rows and columns, wherein an (i, j)-th element of a self-similarity matrix corresponds to a distance between a feature value of an i-th network traffic record and a feature value of a j-th network traffic record; transform each self-similarity matrix into a corresponding histogram to form a set of histograms, each histogram being a representation of the one feature that is invariant to a number of network traffic records in the at least one group of network traffic records; generate a cumulative feature vector based on the set of histograms, the cumulative feature vector being a cumulative representation of the predefined set of features of all network traffic records included in the at least one group of network traffic records; train a classifier based on the cumulative feature vector to produce a trained classifier; classify, by the trained classifier, the at least one group as being malicious; and identify a malware network communication between the computing device and the server utilizing the at least one classified group, wherein the cumulative feature vector enables detection of variations and modifications of the malware network communication.
9. The apparatus of claim 8, wherein the one or more processors are configured to: transform each self-similarity matrix into a locally-scaled self-similarity matrix, each locally-scaled self-similarity matrix being a representation of the one feature of the predefined set of features that is invariant to values of the one feature across all of the feature vectors being multiplied by a common factor.
10. The apparatus of claim 8, wherein the one or more processors generate the cumulative feature vector by concatenating the histograms in the set of histograms to form the cumulative feature vector.
11. The apparatus of claim 8, wherein the variations and modifications of the malware network communication include a variation in one or more of: a shift of the flow-based features, a scale of the flow-based features, a permutation of the flow-based features, a number of the
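The first step recited in claims 1 and 8, dividing records into groups tied to one computing device, one server, and a predetermined period of time, might look like the following sketch. The `Record` fields, the `group_records` name, and the 300-second window are hypothetical choices for illustration; the claims do not fix a record layout or a window length.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Record(NamedTuple):
    """Hypothetical network traffic record (for example, one proxy log line)."""
    timestamp: float   # seconds since some epoch
    client: str        # computing device identifier
    server: str        # server hostname or address
    bytes_down: int    # bytes transferred from the server to the client


GroupKey = Tuple[str, str, int]


def group_records(records: List[Record],
                  window_seconds: float = 300.0) -> Dict[GroupKey, List[Record]]:
    """Bucket records by (client, server, time-window index)."""
    groups: Dict[GroupKey, List[Record]] = defaultdict(list)
    for record in records:
        window = int(record.timestamp // window_seconds)
        groups[(record.client, record.server, window)].append(record)
    return dict(groups)


# Made-up example traffic: two clients talking to the same server.
records = [
    Record(0.0, "10.0.0.1", "a.example", 100),
    Record(120.0, "10.0.0.1", "a.example", 200),
    Record(400.0, "10.0.0.1", "a.example", 50),
    Record(10.0, "10.0.0.2", "a.example", 70),
]
groups = group_records(records, 300.0)
```

Each resulting list is one group in the sense of the claims, and would be the unit from which feature vectors are extracted and later classified.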
Traffic logging, e.g. anomaly detection · CPC title