System, method, and computer program for detecting and measuring changes in network behavior of communication networks utilizing real-time clustering algorithms
US-9729571-B1 · Aug 8, 2017 · US
US2022329504A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2022329504-A1 |
| Application number | US-202217846908-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 22, 2022 |
| Priority date | Jun 22, 2020 |
| Publication date | Oct 13, 2022 |
| Grant date | — |
Official abstract text for this publication.
Disclosed are a network traffic classification method and system based on an improved K-means algorithm. The method comprises: judging whether a total number NIC of network traffic data points in an initial clustering center set reaches an expected number k of network traffic clusters, if the k is not reached, calculating candidate metric values of network traffic data points in a high-density network traffic data point set, selecting a network traffic data point having the maximum candidate metric value, adding same into an initial clustering center set, removing the network traffic data point from the high-density network traffic data point set, then repeating the step until the total number NIC of network traffic data points in the initial clustering center set reaches the k, and ending the step. The method and system can ensure high network traffic classification accuracy.
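The repeated selection step in the abstract can be sketched as a greedy loop. This is a minimal illustration, not the applicant's implementation: it assumes the candidate metric value is the one defined in claim 5 (the smallest Euclidean distance from a candidate to any already chosen center), and the function name `select_remaining_centers` and the early stop when candidates run out are hypothetical additions.

```python
import math


def select_remaining_centers(high_density, centers, k):
    """Grow `centers` to k points by repeatedly taking the candidate
    whose nearest chosen center is farthest away (max of min distances)."""
    candidates = list(high_density)  # copy so the caller's list is untouched
    centers = list(centers)
    # Assumption: stop early if the high-density set is exhausted before k.
    while len(centers) < k and candidates:
        # Candidate metric value: distance to the closest chosen center.
        metrics = [min(math.dist(p, c) for c in centers) for p in candidates]
        best = max(range(len(candidates)), key=metrics.__getitem__)
        # Add the winner to the center set and remove it from the candidates.
        centers.append(candidates.pop(best))
    return centers


if __name__ == "__main__":
    seeds = select_remaining_centers([(1, 0), (10, 0), (5, 0)], [(0, 0)], 3)
    print(seeds)
```

Choosing the candidate with the largest minimum distance spreads the initial centers apart, which is the usual motivation for this kind of seeding before running standard K-means iterations.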
Opening claim text (preview).
1. A network traffic classification method based on an improved K-means algorithm, comprising the following steps of:

step 1: defining a number of network traffic data points as $N$; defining a Euclidean distance between a network traffic data point which is n-th closest to an i-th network traffic data point and the i-th network traffic data point as an n-th distance of the i-th network traffic data point, wherein $i = 1, 2, 3, \ldots, N$; defining a distribution density of all network traffic data points from the closest to the n-th closest to the i-th network traffic data point in a multi-dimensional hypersphere with the i-th network traffic data point as a spherical center and the n-th distance of the i-th network traffic data point as a radius $r$ as an n-th density $D_i^n$ of the i-th network traffic data point, wherein one and only one network traffic data point which is closest to the i-th network traffic data point and one and only one network traffic data point which is n-th closest to the i-th network traffic data point are provided respectively, and all network traffic data points comprise the network traffic data point closest to the i-th network traffic data point and the network traffic data point n-th closest to the i-th network traffic data point; inputting a network traffic data point set to be clustered and an expected number $k$ of network traffic clusters; determining a specific value of $n$ in the n-th distance of the i-th network traffic data point; calculating an average value $avg$ of n-th densities of all network traffic data points; adding the network traffic data points in all network traffic data points with the n-th density greater than $avg$ into a high-density network traffic data point set, wherein the high-density is greater than the average density $avg$; and selecting a network traffic data point having the maximum n-th density in the high-density network traffic data point set, adding same into an initial cluster center set, and removing the network traffic data point from the high-density network traffic data point set; and

step 2: judging whether a total number $NIC$ of network traffic data points in the initial clustering center set reaches the expected number $k$ of network traffic clusters, if the $k$ is not reached, calculating candidate metric values of the network traffic data points in the high-density network traffic data point set, selecting a network traffic data point having the maximum candidate metric value, adding same into the initial clustering center set, removing the network traffic data point from the high-density network traffic data point set, then repeating step 2 until the total number $NIC$ of network traffic data points in the initial clustering center set reaches the $k$, and ending the step.

2. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a calculation formula of the n-th density $D_i^n$ of the i-th network traffic data point is that

$$D_i^n = \frac{n - 0.5}{r + 1}.$$

3. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a calculation formula of $n$ in the n-th distance of the i-th network traffic data point is that

$$n = \frac{N}{k \times 8}.$$

4. The network traffic classification method based on the improved K-means algorithm according to claim 2, wherein a calculation formula of the average value $avg$ of the n-th densities of all network traffic data points is that

$$avg = \frac{1}{N} \sum_{i=1}^{N} D_i^n.$$

5. The network traffic classification method based on the improved K-means algorithm according to claim 1, wherein a candidate metric value of a j-th network traffic data point in the high-density network traffic data point set is recorded as $cd_j$, and a calculation formula thereof is that

$$cd_j = \min\left(\langle A_j, ic_1 \rangle, \langle A_j, ic_2 \rangle, \ldots, \langle A_j, ic_{NIC} \rangle\right),$$

wherein $A_j$ is the j-th network traffic data point in the high-density network traffic data point set, $j = 1, 2, 3, \ldots, NHD$, and $NHD$ is a total number of network traffic data points in the high-density network traffic data point set, $ic_1, ic_2, \ldots, ic_{NIC}$ are respectively first, second, ..., NIC-th network traffic data points in the initial clustering center set, $\langle A_j, ic_1 \rangle$ is a Euclidean distance between the j-th network traffic data point in the high-density network traffic data point set and the first network traffic data point in the initial clustering center set, and so on, $\langle A_j, ic_{NIC} \rangle$ is a Euclidean distance between the j-th network traffic data point in the high-density network traffic data point set and the NIC-th network traffic data point in the initial clustering center set.

6. A network traffic classification system based on an improved K-means algorithm, comprising a processor and a storage medium, wherein: the storage medium is configured for storage instructions; and the processor is configured for operating according to the instructions to perform a network traffic classification method based on an improved K-means algorithm, the method comprising the following steps of:

step 1: defining a number of network traffic data points as $N$; defining a Euclidean distance between a network traffic data point which is n-th closest to an i-th network traffic data point and the i-th network traffic data point as an n-th distance of the i-th network traffic data point, wherein $i = 1, 2, 3, \ldots, N$; defining a distribution density of all network traffic data points from the closest to the n-th closest to the i-th network traffic data point in a multi-dimensional hypersphere with the i-th network traffic data point as a spherical center and the n-th distance of the i-th network traffic data point as a radius $r$ as an n-th density $D_i^n$ of the i-th network traffic data point, wherein one and only one network traffic data point which is closest to the i-th network traffic data point and one and only one network traffic data point which is n-th closest to the i-th network traffic data point are provided respectively, and all network traffic data points compris
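Step 1 of claim 1, together with the formulas in claims 2 to 4, can be sketched as follows. This is a hedged illustration only. It assumes the flattened formula in claim 2 reads as (n - 0.5) / (r + 1) and the one in claim 3 as N / (k × 8); rounding n down to an integer with a floor of 1 is a further assumption, and all function names are hypothetical. The input is assumed to contain at least two points.

```python
import math


def nth_distance(points, i, n):
    """Euclidean distance from points[i] to its n-th closest other point."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    return dists[n - 1]


def nth_density(r, n):
    # Assumed reading of claim 2: D = (n - 0.5) / (r + 1).
    return (n - 0.5) / (r + 1)


def initial_seed(points, k):
    """Return (first_center, high_density_points) as described in step 1."""
    N = len(points)
    # Assumed reading of claim 3: n = N / (k * 8); integer floor and minimum
    # of 1 are assumptions made so that n is a usable neighbor rank.
    n = max(1, N // (k * 8))
    densities = [nth_density(nth_distance(points, i, n), n) for i in range(N)]
    avg = sum(densities) / N  # claim 4: mean of all n-th densities
    # High-density set: points whose n-th density exceeds the average.
    high = [i for i in range(N) if densities[i] > avg]
    if not high:
        raise ValueError("no point has above-average density")
    # The first center is the densest point; it leaves the high-density set.
    first = max(high, key=lambda i: densities[i])
    high.remove(first)
    return points[first], [points[i] for i in high]


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10), (20, 20), (30, 30)]
    center, dense = initial_seed(pts, 1)
    print(center, dense)
```

In this toy input the five tightly packed points form the high-density set and the three isolated points are excluded, so later center selection draws only from dense regions rather than from outliers.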
with fixed number of clusters, e.g. K-means clustering · CPC title
Classification techniques · CPC title
Network utilisation, e.g. volume of load or congestion level · CPC title
related to network traffic · CPC title
using statistical or mathematical methods · CPC title