Network traffic classification method and system based on improved k-means algorithm

US2022329504A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2022329504-A1
Application number: US-202217846908-A
Country: US
Kind code: A1
Filing date: Jun 22, 2022
Priority date: Jun 22, 2020
Publication date: Oct 13, 2022
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed are a network traffic classification method and system based on an improved K-means algorithm. The method comprises: judging whether a total number NIC of network traffic data points in an initial clustering center set reaches an expected number k of network traffic clusters, if the k is not reached, calculating candidate metric values of network traffic data points in a high-density network traffic data point set, selecting a network traffic data point having the maximum candidate metric value, adding same into an initial clustering center set, removing the network traffic data point from the high-density network traffic data point set, then repeating the step until the total number NIC of network traffic data points in the initial clustering center set reaches the k, and ending the step. The method and system can ensure high network traffic classification accuracy.
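
The selection loop described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `pick_initial_centers` and the tuple representation of data points are hypothetical, and the candidate metric value is taken to be the minimum Euclidean distance from a point to the centers chosen so far, as claim 5 defines it. The sketch also assumes at least one center has already been selected.

```python
import math

def pick_initial_centers(high_density, centers, k):
    """Grow `centers` to k points by repeatedly taking the high-density
    point whose minimum distance to the current centers is largest.

    Assumes `centers` is non-empty (the first center is chosen beforehand).
    """
    pool = list(high_density)   # copy so the caller's list is untouched
    centers = list(centers)
    while len(centers) < k and pool:
        # Candidate metric value: distance to the nearest chosen center.
        best = max(pool, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(best)    # add to the initial clustering center set
        pool.remove(best)       # remove from the high-density set
    return centers

# Hypothetical 2-D feature vectors standing in for traffic data points.
pool = [(0.0, 1.0), (5.0, 5.0), (9.0, 0.0), (1.0, 0.0)]
result = pick_initial_centers(pool, [(0.0, 0.0)], k=3)
print(result)
```

Choosing the point farthest from its nearest existing center tends to spread the initial centers apart, while restricting candidates to the high-density set keeps sparse outliers from being selected.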

First claim

Opening claim text (preview).

1. A network traffic classification method based on an improved K-means algorithm, comprising the following steps of:

step 1: defining a number of network traffic data points as N; defining a Euclidean distance between a network traffic data point which is n-th closest to an i-th network traffic data point and the i-th network traffic data point as an n-th distance of the i-th network traffic data point, wherein i=1, 2, 3, ..., N; defining a distribution density of all network traffic data points from the closest to the n-th closest to the i-th network traffic data point in a multi-dimensional hypersphere with the i-th network traffic data point as a spherical center and the n-th distance of the i-th network traffic data point as a radius r as an n-th density $D_{in}$ of the i-th network traffic data point, wherein one and only one network traffic data point which is closest to the i-th network traffic data point and one and only one network traffic data point which is n-th closest to the i-th network traffic data point are provided respectively, and all network traffic data points comprise the network traffic data point closest to the i-th network traffic data point and the network traffic data point n-th closest to the i-th network traffic data point; inputting a network traffic data point set to be clustered and an expected number k of network traffic clusters; determining a specific value of n in the n-th distance of the i-th network traffic data point; calculating an average value avg of n-th densities of all network traffic data points; adding the network traffic data points in all network traffic data points with the n-th density greater than avg into a high-density network traffic data point set, wherein the high-density is greater than the average density avg; and selecting a network traffic data point having the maximum n-th density in the high-density network traffic data point set, adding same into an initial cluster center set, and removing the network traffic data point from the high-density network traffic data point set; and

step 2: judging whether a total number NIC of network traffic data points in the initial clustering center set reaches the expected number k of network traffic clusters, if the k is not reached, calculating candidate metric values of the network traffic data points in the high-density network traffic data point set, selecting a network traffic data point having the maximum candidate metric value, adding same into the initial clustering center set, removing the network traffic data point from the high-density network traffic data point set, then repeating step 2 until the total number NIC of network traffic data points in the initial clustering center set reaches the k, and ending the step.

2. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a calculation formula of the n-th density $D_{in}$ of the i-th network traffic data point is that $D_{in} = \frac{n - 0.5}{r + 1}$.

3. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a calculation formula of n in the n-th distance of the i-th network traffic data point is that $n = \frac{N}{k \times 8}$.

4. The network traffic classification method based on the improved K-means algorithm according to claim 2, wherein a calculation formula of the average value avg of the n-th densities of all network traffic data points is that $avg = \frac{1}{N} \sum_{i=1}^{N} D_{in}$.

5. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a candidate metric value of a j-th network traffic data point in the high-density network traffic data point set is recorded as $cd_j$, and a calculation formula thereof is that $cd_j = \min(\langle A_j, ic_1 \rangle, \langle A_j, ic_2 \rangle, \ldots, \langle A_j, ic_{NIC} \rangle)$, wherein $A_j$ is the j-th network traffic data point in the high-density network traffic data point set, j=1, 2, 3, ..., NHD, and NHD is a total number of network traffic data points in the high-density network traffic data point set, $ic_1, ic_2, \ldots, ic_{NIC}$ are respectively first, second, ..., NIC-th network traffic data points in the initial clustering center set, $\langle A_j, ic_1 \rangle$ is a Euclidean distance between the j-th network traffic data point in the high-density network traffic data point set and the first network traffic data point in the initial clustering center set, and so on, $\langle A_j, ic_{NIC} \rangle$ is a Euclidean distance between the j-th network traffic data point in the high-density network traffic data point set and the NIC-th network traffic data point in the initial clustering center set.

6. A network traffic classification system based on an improved K-means algorithm, comprising a processor and a storage medium, wherein: the storage medium is configured for storage instructions; and the processor is configured for operating according to the instructions to perform a network traffic classification method based on an improved K-means algorithm, the method comprising the following steps of:

step 1: defining a number of network traffic data points as N; defining a Euclidean distance between a network traffic data point which is n-th closest to an i-th network traffic data point and the i-th network traffic data point as an n-th distance of the i-th network traffic data point, wherein i=1, 2, 3, ..., N; defining a distribution density of all network traffic data points from the closest to the n-th closest to the i-th network traffic data point in a multi-dimensional hypersphere with the i-th network traffic data point as a spherical center and the n-th distance of the i-th network traffic data point as a radius r as an n-th density $D_{in}$ of the i-th network traffic data point, wherein one and only one network traffic data point which is closest to the i-th network traffic data point and one and only one network traffic data point which is n-th closest to the i-th network traffic data point are provided respectively, and all network traffic data points compris…
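
Step 1 of claim 1 (the n-th density, the average density, the high-density set, and the first center) can be sketched in Python as below. This is an illustration under assumptions, not the patent's implementation: the function names `nth_densities` and `high_density_seed` are hypothetical; the density formula is read from the flattened text of claim 2 as (n - 0.5) / (r + 1); and n from claim 3, N / (k × 8), is floored and clamped to at least 1 so that small inputs still work, which the claims do not specify.

```python
import math

def nth_densities(points, n):
    """Return the n-th density of every point.

    r is the distance from a point to its n-th nearest neighbour.
    The formula (n - 0.5) / (r + 1) is an assumed reading of claim 2.
    """
    densities = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        r = dists[n - 1]                      # n-th distance of point i
        densities.append((n - 0.5) / (r + 1))
    return densities

def high_density_seed(points, k):
    """Build the high-density set and pick the first initial center."""
    # Assumption: floor N / (k * 8) and clamp to 1 for small N.
    n = max(1, len(points) // (k * 8))
    dens = nth_densities(points, n)
    avg = sum(dens) / len(dens)               # average n-th density
    high = [i for i, d in enumerate(dens) if d > avg]
    # Raises ValueError if no point exceeds the average (all densities equal).
    first = max(high, key=lambda i: dens[i])  # densest point becomes center 1
    high.remove(first)
    return points[first], [points[i] for i in high]

# Hypothetical 2-D feature vectors: five clustered points, three outliers.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11),
          (30, 30), (50, 0), (0, 50)]
first, rest = high_density_seed(points, k=2)
print(first, rest)
```

In this toy input the three isolated points fall below the average density and never enter the high-density set, so they cannot become initial centers.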

Assignees

Univ Nanjing Posts & Telecommunications

Inventors

Classifications

  • with fixed number of clusters, e.g. K-means clustering · CPC title

  • Classification techniques · CPC title

  • Network utilisation, e.g. volume of load or congestion level · CPC title

  • H04L43/062 · Primary

    related to network traffic · CPC title

  • H04L41/142 · Primary

    using statistical or mathematical methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2022329504A1 cover?
Disclosed are a network traffic classification method and system based on an improved K-means algorithm. The method comprises: judging whether a total number NIC of network traffic data points in an initial clustering center set reaches an expected number k of network traffic clusters, if the k is not reached, calculating candidate metric values of network traffic data points in a high-density …
Who is the assignee on this patent?
Univ Nanjing Posts & Telecommunications
What technology area does this patent fall under?
Primary CPC classification H04L43/062. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Oct 13, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).