Method and device for detecting security based on machine learning in combination with rule matching

US12184672B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12184672-B2
Application number	US-202017761861-A
Country	US
Kind code	B2
Filing date	Mar 18, 2020
Priority date	Oct 28, 2019
Publication date	Dec 31, 2024
Grant date	Dec 31, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for detecting security based on machine learning in combination with rule matching is provided, including: establishing a machine learning model; training the machine learning model by using a labeled legal traffic and a labeled malicious traffic; collecting a network traffic; preprocessing the collected network traffic; detecting a malicious traffic from the preprocessed network traffic by using a rule-matching-based method; identifying a malicious traffic from the preprocessed network traffic by using the trained machine learning model, including: extracting a feature of the preprocessed network traffic, and identifying the malicious traffic based on the extracted feature by using the trained machine learning model; and integrating the malicious traffic detected by the rule-matching-based method and the malicious traffic identified by the trained machine learning model.
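
The abstract describes two detectors that run over the same preprocessed traffic, with their results integrated at the end. The following is a minimal Python sketch of that integration step only. It assumes the results are merged as a union keyed by a flow identifier, which the abstract does not specify. `Flow`, `SIGNATURES`, `rule_match`, `ml_identify`, `integrate`, and `toy_model` are hypothetical names; the signature list and the length threshold stand in for a real rule set and a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    flow_id: str
    payload: bytes


# Hypothetical signatures standing in for an intrusion-detection rule set.
SIGNATURES = (b"/etc/passwd", b"<script>")


def rule_match(flows):
    """Rule-matching path: flag flows whose payload contains any signature."""
    return {f.flow_id for f in flows if any(s in f.payload for s in SIGNATURES)}


def extract_features(flow):
    """Toy feature vector (payload length only); an assumption for illustration."""
    return (len(flow.payload),)


def ml_identify(flows, model):
    """Learning path: flag flows that the supplied model classifies as malicious."""
    return {f.flow_id for f in flows if model(extract_features(f))}


def integrate(rule_hits, ml_hits):
    """Assumed integration: union of both result sets, noting which detector fired."""
    detectors = (("rule", rule_hits), ("ml", ml_hits))
    return {
        fid: sorted(name for name, hits in detectors if fid in hits)
        for fid in rule_hits | ml_hits
    }


def toy_model(features):
    # Stand-in for a trained classifier: unusually long payloads are suspicious.
    return features[0] > 40


flows = [
    Flow("a", b"GET /index.html"),
    Flow("b", b"GET /../../etc/passwd"),
    Flow("c", b"A" * 64),
    Flow("d", b"<script>" + b"B" * 64),
]
report = integrate(rule_match(flows), ml_identify(flows, toy_model))
```

Keeping the detector names next to each flagged flow is one possible design choice: it lets a later display step show whether a flow was caught by a rule, by the model, or by both.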

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting security based on machine learning in combination with rule matching, comprising: establishing a machine learning model; training the machine learning model by using a labeled legal traffic and a labeled malicious traffic; collecting a network traffic; preprocessing the collected network traffic; detecting a malicious traffic from the preprocessed network traffic by using a rule-matching-based method; identifying a malicious traffic from the preprocessed network traffic by using the trained machine learning model, comprising: extracting a feature of the preprocessed network traffic, and identifying the malicious traffic based on the extracted feature by using the trained machine learning model; and integrating the malicious traffic detected by the rule-matching-based method and the malicious traffic identified by the trained machine learning model, wherein the method further comprises: sampling the collected network traffic according to a specified sampling rule, and the preprocessing the collected network traffic further comprises: preprocessing the sampled network traffic; and wherein the method further comprises: displaying an integrated result through a visualization technology, wherein the sampling the collected network traffic comprises: sampling the collected network traffic by using a flexible sampling algorithm; wherein the flexible sampling algorithm is a data stream record selection algorithm depending on a size of the data stream; for a data stream set $S=\{X_1,\dots,X_n\}$ with a size $n$, a data stream $x_i'$ with a size $x_i$ is selected from each $X_i$ through the flexible sampling algorithm with a probability $P(x_i)$, $i=1,\dots,n$, so as to form a new data stream set $S'=\{x_1',\dots,x_n'\}$; the flexible sampling algorithm aims to make a total number of byte $X'=\sum_{x_i'\in S'} x_i'/P(x_i)$ calculated by sampling approach a total number of byte $X=\sum_{X_i\in S} X_i$ of a real traffic; where $i=1,\dots,n$.

2. The method according to claim 1, wherein the training the machine learning model by using a labeled legal traffic and a labeled malicious traffic comprises: extracting a time-based feature, a network-layer-based feature and a TTL-based feature from the labeled legal traffic and the labeled malicious traffic; and training the machine learning model based on the extracted features; and wherein the method further comprises: verifying the trained machine learning model by using a verification data set.

3. The method according to claim 1, wherein the network traffic is collected by a GPU, and a data packet in the network traffic is directly copied from a cache queue of a network card to a user space based on a zero copy technology by using a direct memory access structure.

4. The method according to claim 1, wherein the preprocessing the collected network traffic further comprises: performing a data packet reassembling, a protocol decoding and/or an anomaly detection on a data packet in the collected network traffic; wherein the data packet reassembling comprises a stream reassembling and a fragment reassembling, the protocol decoding is to decode a protocol of the data packet into a unified format, and the anomaly detection at least comprises a port scanning; and wherein a result of the preprocessing is a data after the packet reassembling and the protocol decoding in response to the data packet passing the anomaly detection; and an alarm is generated in response to the data packet failing the anomaly detection.

5. The method according to claim 1, wherein the detecting a malicious traffic from the preprocessed network traffic by using a rule-matching-based method comprises: detecting the malicious traffic by using a PFAC algorithm; wherein a separate thread is created for each byte of an input data stream through the PFAC algorithm, so as to identify a mode starting from a starting position of the thread, and the number of created thread equals to a length of the input data stream; wherein each thread of the PFAC algorithm is only responsible for identifying the mode starting from the starting position of the thread, and terminating in response to the thread failing to find any mode located at the starting position of the thread, without a fault transition by a backtracking state machine; each final state of the PFAC algorithm represents a specified mode, so that a uniqueness of the each final state in the PFAC is maintained without processing a plurality of outputs; wherein a payload of the data stream is matched and verified with a plurality of rules in a rule set of an intrusion detection at the same time and in parallel through the PFAC algorithm, and in response to a match existing, the data stream is marked as the malicious traffic and an alarm is triggered.

6. The method according to claim 1, wherein the extracting a feature of the preprocessed network traffic comprises: extracting a source port, a source address, a destination port, a destination address, an ICMP type, a protocol identifier, an original data length and an original data.

7. The method according to claim 6, wherein the extracting a feature of the preprocessed network traffic comprises: implementing a hash table in a GPU, wherein the hash table is used to maintain and track an index of a feature data of each active traffic in the network traffic, and a specified hash value for each data unit is used to determine a specified data stream; wherein an atomic lock is used on each mutually exclusive hash entry, so that only one thread is allowed to update a hash entry of the thread at each moment; a data stream corresponding to a feature data becomes inactive in response to the feature data being transmitted, so as to trigger an operation of deleting the feature data corresponding to the data stream from the hash table; and for each data stream in the network traffic, a moment of a last-arrived packet is recorded in the hash table, wherein a threshold-based method is used to determine an inactive data stream, the threshold-based method comprises: determining that the feature data corresponding to the data stream is inactive in response to a time interval exceeding a threshold; wherein a feature data of the inactive data stream is output by providing a timing task, and the trained machine learning model is used for classifying based on the feature data.

8. The method according to claim 1, wherein the steps of establishing and training the machine learning model are performed offline, and the steps of the collecting, preprocessing, detecting, identifying and integrating are performed online.

9. A device for detecting security based on machine learning in combination with rule matching, comprising: a processor; and a memory storing instructions, wherein the instructions, when executed by the processor, cause the processor to: establish a machine learning model; train the machine learning model by using a labeled legal traffic and a labeled malicious traffic; collect a network traffic; preprocess the collected network traffic; detect a malicious traffic from the preprocessed network traffic by using a rule-matching-based method; identify a malicious traffic from the preprocessed network traffic by using the trained machine learning model, comprising: extract a feature of the preprocessed network traffic, and identify the malicious traffic based on the extracted feature by using the trained machine learning model; and integrate the malicious traffic detected by the rule-matching-based method and the malicious traffic identified by the trained machine learning model, wherein the instructions, when executed by the processor, further cause the processor to: sampl…
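
Claim 1 describes size-dependent sampling in which each selected stream is weighted by the inverse of its selection probability, so that the estimated byte total approaches the real one. The sketch below illustrates that estimator in Python. The claim does not state the form of the probability, so the threshold rule min(1, size / z) used here is an assumption, as are the names `selection_probability`, `flexible_sample`, `estimate_total_bytes`, the threshold `z`, and the synthetic flow sizes.

```python
import random


def selection_probability(size, z):
    """Assumed size-dependent probability: flows of at least z bytes are always kept."""
    return min(1.0, size / z)


def flexible_sample(sizes, z, rng):
    """Keep each flow independently with its size-dependent probability."""
    return [x for x in sizes if rng.random() < selection_probability(x, z)]


def estimate_total_bytes(sampled, z):
    """Estimate the byte total as the sum of size / P(size) over sampled flows."""
    return sum(x / selection_probability(x, z) for x in sampled)


rng = random.Random(0)

# Synthetic flow sizes in bytes: many small flows and a few large ones.
small = [rng.randint(40, 1500) for _ in range(20000)]
large = [rng.randint(10**5, 10**6) for _ in range(200)]
sizes = small + large
true_total = sum(sizes)

z = 5000  # hypothetical threshold in bytes
sampled = flexible_sample(sizes, z, rng)
estimate = estimate_total_bytes(sampled, z)
relative_error = abs(estimate - true_total) / true_total
```

Under this assumed rule, every large flow is kept and small flows are kept rarely but counted with a larger weight, so far fewer records are processed while the estimated byte total stays close to the true one.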

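Claim 5 describes PFAC matching: one thread per input byte, each thread matching only patterns that begin at its own position and stopping at the first missing transition, with no failure links. The Python sketch below simulates those threads sequentially to show the control flow; it is not GPU code and is not taken from the patent. `build_trie`, `pfac_thread`, `pfac_match`, and the example rules are hypothetical names and data.

```python
def build_trie(patterns):
    """Build a goto-only trie: no failure links, one final state per pattern."""
    transitions = [{}]  # transitions[state][byte] -> next state
    final = {}          # final state -> index of the single pattern it represents
    for idx, pattern in enumerate(patterns):
        state = 0
        for byte in pattern:
            nxt = transitions[state].get(byte)
            if nxt is None:
                transitions.append({})
                nxt = len(transitions) - 1
                transitions[state][byte] = nxt
            state = nxt
        final[state] = idx
    return transitions, final


def pfac_thread(data, start, transitions, final):
    """Work of one logical thread: match only patterns beginning at `start`."""
    state, hits = 0, []
    for pos in range(start, len(data)):
        state = transitions[state].get(data[pos])
        if state is None:  # no valid transition: terminate, never backtrack
            break
        if state in final:
            hits.append((start, final[state]))
    return hits


def pfac_match(data, patterns):
    """Run one logical thread per input byte (sequential here, parallel on a GPU)."""
    transitions, final = build_trie(patterns)
    hits = []
    for start in range(len(data)):
        hits.extend(pfac_thread(data, start, transitions, final))
    return hits


rules = [b"he", b"hers", b"she"]
payload = b"ushers"
matches = pfac_match(payload, rules)  # list of (start offset, rule index)
is_malicious = bool(matches)
```

Each thread reports at most one pattern per final state, so overlapping matches such as "he" and "hers" at the same offset are found by the same thread as it walks deeper into the trie.
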
Assignees

Univ Science & Technology China

Inventors

Classifications

G06N20/00: Machine learning
G06N5/04: Inference or reasoning models
G06N20/20: Ensemble learning
H04L63/166: at the transport layer
H04L41/145: involving simulating, designing, planning or modelling of a network

Patent family

Related publications grouped by family.

Patent family ID: 69280495

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12184672B2 cover?: A method for detecting security based on machine learning in combination with rule matching is provided, including: establishing a machine learning model; training the machine learning model by using a labeled legal traffic and a labeled malicious traffic; collecting a network traffic; preprocessing the collected network traffic; detecting a malicious traffic from the preprocessed network traff…
Who is the assignee on this patent?: Univ Science & Technology China
What technology area does this patent fall under?: Primary CPC classification H04L63/1416. Mapped technology areas include Electricity.
When was this patent published?: Publication date Dec 31, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Threat detection and mitigation in a virtualized computing environment

Systems and methods for network monitoring

Apparatus and method for detecting malicious domain cluster