Network traffic classification method and system based on improved K-means algorithm

US11570069B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11570069-B2
Application number  US-202217846908-A
Country             US
Kind code           B2
Filing date         Jun 22, 2022
Priority date       Jun 22, 2020
Publication date    Jan 31, 2023
Grant date          Jan 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are a network traffic classification method and system based on an improved K-means algorithm. The method comprises: judging whether a total number NIC of network traffic data points in an initial clustering center set reaches an expected number k of network traffic clusters, if the k is not reached, calculating candidate metric values of network traffic data points in a high-density network traffic data point set, selecting a network traffic data point having the maximum candidate metric value, adding same into an initial clustering center set, removing the network traffic data point from the high-density network traffic data point set, then repeating the step until the total number NIC of network traffic data points in the initial clustering center set reaches the k, and ending the step. The method and system can ensure high network traffic classification accuracy.
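
The selection loop described in the abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `select_initial_centers`, the `first_center` seed argument, and the tuple representation of data points are hypothetical, and the candidate metric is taken to be the minimum Euclidean distance to the centers chosen so far, as defined in claim 5.

```python
import math


def select_initial_centers(high_density, first_center, k):
    """Grow the initial cluster center set until it holds k points.

    high_density: list of points (tuples of floats), assumed to be the
                  high-density network traffic data point set.
    first_center: seed point for the center set (an assumption of this sketch).
    k:            expected number of network traffic clusters.
    """
    candidates = list(high_density)
    centers = [first_center]
    if first_center in candidates:
        candidates.remove(first_center)

    # Repeat until the number of centers (NIC) reaches k.
    while len(centers) < k and candidates:
        # Candidate metric of a point: distance to its nearest chosen center.
        best = max(
            candidates,
            key=lambda p: min(math.dist(p, c) for c in centers),
        )
        centers.append(best)      # add the point to the initial center set
        candidates.remove(best)   # and remove it from the high-density set
    return centers
```

Picking the point with the largest minimum distance tends to spread the initial centers apart, while restricting candidates to the high-density set keeps isolated outliers from being chosen.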

First claim

Opening claim text (preview).

The invention claimed is:

1. A network traffic classification method based on an improved K-means algorithm, comprising the following steps of:
  defining a number of network traffic data points as N;
  defining a Euclidean distance between a network traffic data point which is n-th closest to an i-th network traffic data point and the i-th network traffic data point as an n-th distance of the i-th network traffic data point, wherein i = 1, 2, 3, ..., N;
  defining a distribution density of all network traffic data points from the closest to the n-th closest to the i-th network traffic data point in a multi-dimensional hypersphere with the i-th network traffic data point as a spherical center and the n-th distance of the i-th network traffic data point as a radius r as an n-th density $D_{in}$ of the i-th network traffic data point, wherein one and only one network traffic data point which is closest to the i-th network traffic data point and one and only one network traffic data point which is n-th closest to the i-th network traffic data point are provided respectively, and all network traffic data points comprise the network traffic data point closest to the i-th network traffic data point and the network traffic data point n-th closest to the i-th network traffic data point;
  inputting a network traffic data point set to be clustered and an expected number k of network traffic clusters;
  determining a specific value of n in the n-th distance of the i-th network traffic data point;
  calculating an average value avg of n-th densities of all network traffic data points;
  adding the network traffic data points in all network traffic data points with the n-th density greater than the average value avg of the n-th densities into a high-density network traffic data point set, wherein the high-density is greater than the average value avg of the n-th densities; and
  selecting network traffic data points having the maximum n-th density in the high-density network traffic data point set, adding the network traffic data points having the maximum n-th density into an initial cluster center set, and removing the network traffic data points from the high-density network traffic data point set until a total number NIC of network traffic data points in the initial cluster center set reaches the expected number k of network traffic clusters;
  clustering the initial cluster center set;
  establishing a mapping relation between a network traffic cluster obtained by the clustering and a network application type; and
  classifying the network traffic data point based on the mapping relation.

2. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a calculation formula of the n-th density $D_{in}$ of the i-th network traffic data point is that $D_{in} = \frac{n - 0.5}{r + 1}$.

3. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a calculation formula of n in the n-th distance of the i-th network traffic data point is that $n = \frac{N}{k \times 8}$.

4. The network traffic classification method based on the improved K-means algorithm according to claim 2, wherein a calculation formula of the average value avg of the n-th densities of all network traffic data points is that $avg = \frac{1}{N} \sum_{i=1}^{N} D_{in}$.

5. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a candidate metric value of a j-th network traffic data point in the high-density network traffic data point set is recorded as $cd_j$, and a calculation formula thereof is that $cd_j = \min(\langle A_j, ic_1 \rangle, \langle A_j, ic_2 \rangle, \ldots, \langle A_j, ic_{NIC} \rangle)$, wherein $A_j$ is the j-th network traffic data point in the high-density network traffic data point set, j = 1, 2, 3, ..., NHD, and NHD is a total number of network traffic data points in the high-density network traffic data point set, $ic_1, ic_2, \ldots, ic_{NIC}$ are respectively first, second, ..., NIC-th network traffic data points in the initial cluster center set, $\langle A_j, ic_1 \rangle$ is a Euclidean distance between the j-th network traffic data point in the high-density network traffic data point set and the first network traffic data point in the initial cluster center set, and so on, $\langle A_j, ic_{NIC} \rangle$ is a Euclidean distance between the j-th network traffic data point in the high-density network traffic data point set and the NIC-th network traffic data point in the initial cluster center set.

6. A network traffic classification system based on an improved K-means algorithm, comprising a processor and a storage medium, wherein: the storage medium is configured for storage instructions; and the processor is configured for operating according to the instructions to perform a network traffic classification method based on an improved K-means algorithm, the method comprising the following steps of:
  defining a number of network traffic data points as N;
  defining a Euclidean distance between a network traffic data point which is n-th closest to an i-th network traffic data point and the i-th network traffic data point as an n-th distance of the i-th network traffic data point, wherein i = 1, 2, 3, ..., N;
  defining a distribution density of all network traffic data points from the closest to the n-th closest to the i-th network traffic data point in a multi-dimensional hypersphere with the i-th network traffic data point as a spherical center and the n-th distance of the i-th network traffic data point as a radius r as an n-th density $D_{in}$ of the i-th network traffic data point, wherein one and only one network traffic data point which is closest to the i-th network traffic data point and one and only one network traffic data point which is n-th closest to the i-th network traffic data point are provided respectively, and all network traffic data points comprise the network traffic data point closest to the i-th network traffic data point and the network traffic data point n-th closest to the i-th network traffic data point;
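
The density steps in claims 1 to 4 can be sketched in Python as follows. This is an illustration only, under assumptions that are not in the claim text: the formula in claim 2 is read as D_in = (n - 0.5) / (r + 1), the value N / (k × 8) from claim 3 is rounded down to an integer with a floor of 1, points are plain tuples, and the function names `nth_densities` and `high_density_set` are hypothetical.

```python
import math


def nth_densities(points, k):
    """Return (n, densities) for a list of at least two points.

    n follows claim 3, n = N / (k * 8); integer rounding and the floor
    of 1 are assumptions of this sketch.
    """
    N = len(points)
    n = max(1, N // (k * 8))
    densities = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, ascending.
        dists = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )
        r = dists[n - 1]                       # n-th distance of point i
        densities.append((n - 0.5) / (r + 1))  # assumed reading of claim 2
    return n, densities


def high_density_set(points, k):
    """Keep the points whose n-th density exceeds the average (claim 4)."""
    _, densities = nth_densities(points, k)
    avg = sum(densities) / len(densities)
    return [p for p, d in zip(points, densities) if d > avg]
```

For example, with four points packed into a unit square and four distant points spread along a diagonal, only the four packed points exceed the average density and enter the high-density set.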

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11570069B2 cover?
It covers a network traffic classification method and system based on an improved K-means algorithm. Initial cluster centers are drawn from a set of high-density network traffic data points, one at a time by maximum candidate metric value, until the expected number k of network traffic clusters is reached.
Who is the assignee on this patent?
Univ Nanjing Posts & Telecommunications, Nanjing Univ Of Posts And Telecommunications
What technology area does this patent fall under?
Primary CPC classification H04L43/062. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jan 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).