Method and system for detecting intrusion in parallel based on unbalanced data Deep Belief Network

US11977634B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11977634-B2
Application number: US-202117626684-A
Country: US
Kind code: B2
Filing date: May 17, 2021
Priority date: Jul 17, 2020
Publication date: May 7, 2024
Grant date: May 7, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The disclosure discloses a method for detecting an intrusion in parallel based on an unbalanced data Deep Belief Network, which reads an unbalanced data set DS; under-samples the unbalanced data set using the improved NCR algorithm to reduce the ratio of the majority type samples and make the data distribution of the data set balanced; the improved differential evolution algorithm is used on the distributed memory computing platform Spark to optimize the parameters of the deep belief network model to obtain the optimal model parameters; extract the feature of data of the data set, and then classify the intrusion detection by the weighted nuclear extreme learning machine, and finally train multiple weighted nuclear extreme learning machines of different structures in parallel by multithreading as the base classifier, and establish a multi-classifier intrusion detection model based on adaptive weighted voting for detecting the intrusion in parallel.
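
The adaptive weighted voting step named in the abstract can be illustrated with a minimal Python sketch. The weight formula used by the patent is not shown on this page, so the sketch assumes each base classifier's weight is its share of the total validation accuracy. The names `accuracy_weights` and `weighted_vote` are hypothetical, and the base classifiers are represented only by their predicted labels rather than by actual weighted kernel extreme learning machines.

```python
from collections import defaultdict


def accuracy_weights(val_predictions, val_labels):
    """Assumed weighting rule (not taken from the patent): each classifier's
    weight is its validation accuracy divided by the sum of all accuracies."""
    accuracies = [
        sum(p == y for p, y in zip(preds, val_labels)) / len(val_labels)
        for preds in val_predictions
    ]
    total = sum(accuracies)
    if total == 0:
        # Fall back to equal weights if every classifier is always wrong.
        return [1.0 / len(accuracies)] * len(accuracies)
    return [a / total for a in accuracies]


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the largest total weight."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


# Three hypothetical base classifiers scored on four validation samples.
val_labels = ["normal", "dos", "probe", "dos"]
val_predictions = [
    ["normal", "dos", "probe", "dos"],        # always right
    ["normal", "normal", "probe", "normal"],  # right half the time
    ["dos", "normal", "normal", "normal"],    # always wrong
]
weights = accuracy_weights(val_predictions, val_labels)

# For a new sample the classifiers disagree; the weights decide the outcome.
decision = weighted_vote(["dos", "normal", "normal"], weights)
```

With these inputs the single accurate classifier outweighs the two weaker ones that agree with each other, which is the behaviour that distinguishes weighted voting from a plain majority vote.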

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for detecting an intrusion in parallel based on an unbalanced data Deep Belief Network (DBN), comprising:
(1) obtaining an unbalanced data set, under-sampling the unbalanced data set using the Neighborhood Cleaning Rule algorithm, and clustering the under-sampled unbalanced data set using the Gravity-based Clustering Approach algorithm to obtain the clustered unbalanced data set; and
(2) inputting the clustered unbalanced data obtained in step (1) into a trained Deep Belief Network model to extract a feature, inputting the extracted feature into multiple DBN-Weighted Kernel Extreme Learning Machine (DBN-WKELM) base classifiers in a trained DBN-WKELM multiclass classifier model to obtain multiple initial classification results, calculating a weight of each DBN-WKELM base classifier by the self-adaptive weighted voting method, and obtaining a final classification results and an intrusion behavioral type corresponding to the final classification results according to multiple weights and initial classification results.

2. The method for detecting the intrusion in parallel according to claim 1, wherein step (1) comprises:
(1-1) obtaining an unbalanced data set DS;
(1-2) obtaining a sample point x and nearest neighbor data D_k of the sample point x from the unbalanced data set DS obtained in step (1-1); wherein k represents a nearest neighbor parameter;
(1-3) obtaining a set N_k formed by all samples which have different type with the sample point x in the k-nearest neighbor data D_k obtained in step (1-2) and a sample number num of the set N_k;
(1-4) determining whether the sample number num obtained in step (1-3) is greater than or equal to k−1, proceeding to step (1-5) if yes, and otherwise, proceeding to step (1-6);
(1-5) determining whether the type of the sample point x is a majority sample, updating the unbalanced data set DS to DS=DS−x and then proceeding to step (1-6) if yes, and otherwise, updating the unbalanced data set DS to DS=DS−N_k and then proceeding to step (1-6);
(1-6) repeating the above steps (1-2) to (1-5) for the remaining sample points in the unbalanced data set DS until all the sample points in the unbalanced data set DS has been processed, thereby obtaining the updated unbalanced data set DS;
(1-7) setting a counter i=1;
(1-8) determining whether i is equal to a total number of the sample points in the unbalanced data set DS, proceeding to step (1-14) if yes, and otherwise, proceeding to step (1-9);
(1-9) reading the i-th new sample point d_i = (d_1^i, d_2^i, ..., d_n^i) from the unbalanced data set DS updated in step (1-6), determining whether a cluster set S of preferred setting is empty, proceeding to step (1-10) if yes, and otherwise, proceeding to step (1-11); wherein e2∈[1, n], d_e2^i represents the e2-th feature attribute value in the i-th sample;
(1-10) initialing a sample point d_i as a new cluster C_new = {d_i}, setting a centroid μ to d_i, adding the C_new into the cluster set S, and proceeding to step (1-13);
(1-11) calculating a gravitation for d_i of each cluster in the cluster set S to obtain a gravitation set G = {g_1, g_2, ..., g_ng}, and obtaining a maximum gravitation g_max and its corresponding cluster C_max from the gravitation set G, wherein n_g represents a total number of the clusters in the cluster set S;
(1-12) determining whether the maximum gravitation g_max is less than a determined threshold r, returning to step (1-10) if yes, and otherwise, merging the sample point d_i into the cluster C_max, updating a centroid μ_max of the cluster C_max which has merged the sample point d_i, and then proceeding to step (1-13);
(1-13) setting the counter to i=i+1, and returning to step (1-8); and
(1-14) traversing all the clusters in the cluster set S, and determining whether the types of all the sample points in each cluster is a majority sample, saving the majority samples in the cluster randomly according to a sampling rate sr and repeating the traversing process for the remaining clusters if yes, and otherwise, repeating the traversing process for the remaining clusters.

3. The method for detecting the intrusion in parallel according to claim 1, wherein the gravitation g between the sample points d_i and the clusters C is calculated according to the following formula:

$$g = \frac{\ln(\ln(C_{num} + 1)) \cdot \ln(\ln 2)}{\left(\frac{1}{n} \sum_{e2=1}^{n} \left| d_{e2}^{i} - \mu_{e2} \right|\right)^{2}};$$

wherein C_num is a number of the sample points in the cluster C, and μ_e2 represents the e2-th feature attribute value in the centroid μ of the cluster C; a formula for updating a centroid μ_max of a cluster C_max is as follows: μ_max =
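
The under-sampling loop of steps (1-2) to (1-6) and the incremental clustering loop of steps (1-7) to (1-13) in claim 2 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation. The function names are hypothetical, the set of majority labels is supplied by the caller, and Euclidean distance is assumed for the nearest-neighbor search. The gravitation function is passed in as a parameter; `inverse_square_pull` is a simple stand-in and is not the formula of claim 3. The running-mean centroid update is also an assumption, because the centroid update formula is cut off in this preview.

```python
import math


def ncr_clean(samples, labels, majority_labels, k=3):
    """Steps (1-2) to (1-6): drop majority points, or the neighbors of
    minority points, when at least k-1 of the k neighbors differ in type."""
    alive = set(range(len(samples)))
    for i in range(len(samples)):
        if i not in alive:
            continue  # already removed from DS by an earlier iteration
        # D_k: the k nearest remaining neighbors of sample i
        neighbors = sorted(
            (j for j in alive if j != i),
            key=lambda j: math.dist(samples[i], samples[j]),
        )[:k]
        # N_k: neighbors whose type differs from that of sample i
        differing = [j for j in neighbors if labels[j] != labels[i]]
        if len(differing) >= k - 1:
            if labels[i] in majority_labels:
                alive.discard(i)                    # DS = DS - x
            else:
                alive.difference_update(differing)  # DS = DS - N_k
    keep = sorted(alive)
    return [samples[j] for j in keep], [labels[j] for j in keep]


def inverse_square_pull(point, centroid, size):
    """Stand-in gravitation (assumption): log cluster size over squared mean distance."""
    mean_abs = sum(abs(x - m) for x, m in zip(point, centroid)) / len(point)
    return math.log(size + 1) / (mean_abs ** 2 + 1e-12)


def gravity_cluster(points, r, gravitation=inverse_square_pull):
    """Steps (1-7) to (1-13): merge each point into the cluster with the
    strongest pull, or start a new cluster when that pull is below r."""
    clusters = []  # each cluster: {"members": [...], "centroid": [...]}
    for p in points:
        if clusters:
            pulls = [gravitation(p, c["centroid"], len(c["members"])) for c in clusters]
            best = max(range(len(clusters)), key=pulls.__getitem__)
            if pulls[best] >= r:
                c = clusters[best]
                c["members"].append(p)
                m = len(c["members"])
                # Assumed centroid update: running mean over the cluster members
                c["centroid"] = [mu + (x - mu) / m for mu, x in zip(c["centroid"], p)]
                continue
        clusters.append({"members": [p], "centroid": list(p)})
    return clusters
```

Passing the gravitation function as an argument keeps the clustering loop independent of any particular formula, so a different definition can be substituted without touching the loop itself. Step (1-14), the random retention of majority-only clusters at sampling rate sr, is left out of the sketch.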

Assignees

Univ Hunan

Inventors

Classifications

  • G06F21/566 (Primary)

    Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities · CPC title

  • with fixed number of clusters, e.g. K-means clustering · CPC title

  • Combinations of networks · CPC title

  • Machine learning · CPC title

  • Test or assess a computer or a system · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11977634B2 cover?
The disclosure discloses a method for detecting an intrusion in parallel based on an unbalanced data Deep Belief Network, which reads an unbalanced data set DS; under-samples the unbalanced data set using the improved NCR algorithm to reduce the ratio of the majority type samples and make the data distribution of the data set balanced; the improved differential evolution algorithm is used on th…
Who is the assignee on this patent?
Univ Hunan
What technology area does this patent fall under?
Primary CPC classification G06F21/566. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 7, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).