US12039422B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12039422-B2 |
| Application number | US-202117162669-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 29, 2021 |
| Priority date | Jul 30, 2018 |
| Publication date | Jul 16, 2024 |
| Grant date | Jul 16, 2024 |
Official abstract text for this publication.
Embodiments of this application disclose a method and an apparatus for generating an application identification model. The method includes: obtaining Y data packets, where the Y data packets correspond to P applications, an i-th application in the P applications corresponds to M(i) data packets, and $Y=\sum_{i=1}^{P} M(i)$, $1 \le i \le P$; extracting a target parameter of each of the M(i) data packets of each of the P applications to obtain M(i) samples, where the target parameter indicates information about a session connection established between the i-th application and a server that provides the i-th application; and training an initial identification model based on the M(i) samples of each of the P applications, to obtain a first application identification model, where the first application identification model is used to determine, based on a target parameter of a data packet, an application corresponding to the data packet.
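The flow in the abstract (group packets by application, extract one target parameter per packet, train, then identify) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not say which bytes form the target parameter or what kind of model is trained, so `extract_target_parameter` (the first 8 bytes of the packet) and `NearestSampleModel` (a 1-nearest-neighbour lookup) are hypothetical stand-ins, and the packet contents are made up.

```python
from collections import defaultdict


def extract_target_parameter(packet: bytes, length: int = 8) -> bytes:
    """Hypothetical extractor: first `length` bytes, zero-padded (an assumption)."""
    return packet[:length].ljust(length, b"\x00")


def build_samples(labeled_packets):
    """Turn (application, packet) pairs into per-application sample lists."""
    samples = defaultdict(list)
    for app, packet in labeled_packets:
        samples[app].append(extract_target_parameter(packet))
    return dict(samples)


def hamming(a: bytes, b: bytes) -> int:
    """Bit-level Hamming distance between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


class NearestSampleModel:
    """Stand-in for the identification model: 1-nearest-neighbour lookup."""

    def fit(self, samples):
        # Flatten {app: [sample, ...]} into (sample, app) pairs.
        self._train = [(s, app) for app, group in samples.items() for s in group]
        return self

    def predict(self, target_parameter: bytes) -> str:
        # Return the application of the closest training sample.
        return min(self._train, key=lambda pair: hamming(pair[0], target_parameter))[1]


# Made-up labeled packets for P = 2 applications.
labeled_packets = [
    ("app_a", b"AAAA-session-1"),
    ("app_a", b"AAAB-session-2"),
    ("app_b", b"zzzz-session-1"),
    ("app_b", b"zzzy-session-2"),
    ("app_b", b"zzzx-session-3"),
]

samples = build_samples(labeled_packets)
Y = len(labeled_packets)                                  # total packet count
M = {app: len(group) for app, group in samples.items()}   # M(i) per application
assert Y == sum(M.values())                               # Y = sum of M(i)

model = NearestSampleModel().fit(samples)
predicted = model.predict(extract_target_parameter(b"AAAC-session-9"))
```

The nearest-neighbour lookup is used only because it needs no third-party library; any trainable classifier could take its place in this sketch.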
Opening claim text (preview).
What is claimed is:

1. A method, comprising: obtaining Y data packets, wherein the Y data packets correspond to P applications, for each integer value of i from 1 to P, an i-th application in the P applications corresponds to M(i) data packets of the Y data packets, and $Y=\sum_{i=1}^{P} M(i)$; extracting a first target parameter of each of the M(i) data packets of each of the P applications, to obtain M(i) samples of each of the P applications, wherein for each sample of the M(i) samples of each of the P applications, the first target parameter of the respective sample indicates information about a session connection established between the i-th application corresponding to the respective sample and a server that provides the i-th application corresponding to the respective sample; and training an initial identification model based on the M(i) samples of each of the P applications, to obtain a first application identification model, wherein the first application identification model is usable to determine, based on a second target parameter of a data packet, an application corresponding to the data packet.

2. The method according to claim 1, wherein after training the initial identification model based on the M(i) samples of each of the P applications, to obtain the first application identification model, the method further comprises: for each integer value of i from 1 to P, collecting N(i) data packets corresponding to the i-th application; extracting a third target parameter of each of the N(i) data packets of each of the P applications, to obtain N(i) samples of each of the P applications; identifying the N(i) samples of each of the P applications using the first application identification model, to obtain an identification result of each of the P applications, wherein, for each integer value of i from 1 to P, the identification result of the i-th application indicates that I(i) samples correspond to the i-th application, I(i) is a positive integer, and I(i) is less than or equal to N; for each integer value of i from 1 to P, determining whether a ratio of I(i) to N(i) is greater than a first threshold; when each ratio of I(i) to N(i) of the P applications is greater than the first threshold, storing the first application identification model; and when at least one ratio of I(i) to N(i) of the P applications is less than the first threshold, adjusting the initial identification model, and training the adjusted initial identification model based on the M(i) samples of each of the P applications, to generate a second application identification model, wherein identification accuracy of the second application identification model is greater than identification accuracy of the first application identification model.

3. The method according to claim 2, wherein after extracting the first target parameter of each of the M(i) data packets of each of the P applications, to obtain the M(i) samples of each of the P applications, the method further comprises: for each of the P applications, determining whether the M(i) samples of the respective application are less than a second threshold, wherein the second threshold is used to determine whether a new sample needs to be added to the respective application; and wherein training the initial identification model based on the M(i) samples of each of the P applications, to obtain the first application identification model, comprises, for each integer value of i from 1 to P: when M(i) of the i-th application is less than the second threshold, obtaining a quantity X(i) of to-be-added samples of the i-th application, wherein X(i) is a positive integer; generating X(i) new samples of the i-th application based on the M(i) samples of the i-th application; and training the initial recognition model based on the M(i) samples of the i-th application and the X(i) new samples, to obtain the first application recognition model; or when M(i) of the i-th application is greater than the second threshold, training the initial identification model based on the M(i) samples of the i-th application, to obtain the first application identification model.

4. The method according to claim 3, wherein generating the X(i) new samples of the i-th application based on the M(i) samples of the i-th application comprises: generating X(i)/M(i) new samples for each sample of the M(i) samples of the i-th application, to obtain the X(i) new samples of the i-th application; wherein generating the X(i)/M(i) new samples for each sample of the M(i) samples of the i-th application comprises: using each sample of the M(i) samples of the i-th application as a reference sample; obtaining, from the M(i) samples of the i-th application, X(i)/M(i) samples having a minimum hamming distance from a reference sample of the reference samples; and generating, based on the reference samples and the X(i)/M(i) samples, X(i)/M(i) new samples corresponding to the reference samples.

5. The method according to claim 3, wherein obtaining the quantity X(i) of to-be-added samples of the i-th application comprises: obtaining a preset quantity X(i) of to-be-added samples of the i-th application; obtaining a preset third threshold, and calculating a difference between the third threshold and M(i) of the i-th application, to obtain the quantity X(i) of to-be-added samples of the i-th application, wherein the third threshold indicates a minimum quantity of required samples for generating the application identification model; or obtaining the quantity X(i) of to-be-added samples of the i-th application by: determining an average quantity Y/P of data packets of each of the P applications and an expected ratio R(i) of the i-th application, wherein the expected ratio R(i) is used to indicate a ratio of an expected quantity E(i) of samples of the i-th application to the average quantity Y/P of the data packets of each application; calculating the expected quantity E(i) of samples of the i-th application based on the average quantity Y/P of the data packets of each of the P applications and the expected ratio R(i) of the i-th application; and calculating a difference between E(i) and M(i) to obtain the quantity X(i) of to-be-added samples of the i-th application.

6. The method according to claim 1, wherein after training the initial identification model based on the M(i) samples of each of the P applications, to obtain the first application identification model, the method further comprises: obtaining a target data packet; extracting a third target parameter of the target data packet; and identifying the third target parameter of the target data packet by using the first application identification model, to obtain a target identification result, wherein the target identification result indicates an application corresponding to the target data packet.

7. The method according to claim 1, wherein after extracting the first target parameter of each of the M(i) data packets of each of the P applications to obtain M(i) samples of each of the P applications, the method further comprises: for each of the P applications, determining whether M(i) of the respective application is less than a second threshold, wherein the second threshold is used to determine whether a new sample needs to be added to the respective application; and wherein training the initial identification model based on the M(i) samples of each of the P applications, to obtain the first application identification model, comprises, for each integer value of i from 1 to P: when M(i) of the i-th application is less than the second threshold, obtaining a quantity X(i) of to-be-added samples of the i-th application, wherein X(i) is a positive integer; generating X(i) new samples of the i-th app
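The counting and sample-generation steps recited in claims 2 to 5 can be sketched as follows. All function names are hypothetical, and several details are assumptions because the claims leave them open: E(i) is rounded to an integer, X(i) is assumed divisible by M(i), a reference sample is not treated as its own neighbour, and a new sample is formed by picking each byte at random from the reference or from its neighbour. The claims do not state how a new sample is derived from the pair, so that mixing rule is a placeholder.

```python
import random


def hamming(a: bytes, b: bytes) -> int:
    """Bit-level Hamming distance between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def quantity_to_add(m_i, *, preset=None, third_threshold=None,
                    average=None, expected_ratio=None):
    """X(i) under one of the three alternatives listed in claim 5."""
    if preset is not None:
        x_i = preset                              # preset quantity
    elif third_threshold is not None:
        x_i = third_threshold - m_i               # third threshold minus M(i)
    elif average is not None and expected_ratio is not None:
        expected = round(average * expected_ratio)  # E(i) = (Y/P) * R(i), rounded (assumption)
        x_i = expected - m_i                      # E(i) minus M(i)
    else:
        raise ValueError("no rule supplied for X(i)")
    if x_i < 1:
        raise ValueError("X(i) must be a positive integer")
    return x_i


def generate_new_samples(samples, x_i, rng):
    """Claim 4: X(i)/M(i) new samples per reference sample."""
    m_i = len(samples)
    k = x_i // m_i                                # assumes X(i) is divisible by M(i)
    if k < 1 or k > m_i - 1:
        raise ValueError("need 1 <= X(i)/M(i) <= M(i) - 1 in this sketch")
    new_samples = []
    for idx, ref in enumerate(samples):
        others = samples[:idx] + samples[idx + 1:]
        # The k samples with the smallest Hamming distance to the reference.
        neighbours = sorted(others, key=lambda s: hamming(ref, s))[:k]
        for nb in neighbours:
            # Placeholder mixing rule: each byte comes from ref or neighbour.
            new_samples.append(bytes(rng.choice((r, n)) for r, n in zip(ref, nb)))
    return new_samples


def all_ratios_pass(identified, collected, first_threshold):
    """Claim 2: every I(i)/N(i) must exceed the first threshold."""
    return all(identified[app] / collected[app] > first_threshold for app in collected)


# Usage with made-up two-byte samples for one application, M(i) = 3.
app_samples = [b"\x00\x00", b"\x00\x01", b"\xff\xff"]
x_i = quantity_to_add(len(app_samples), third_threshold=6)
augmented = app_samples + generate_new_samples(app_samples, x_i, random.Random(0))
```

Passing a seeded `random.Random` keeps the sketch reproducible. In the flow of claim 2, a `False` result from `all_ratios_pass` corresponds to adjusting the initial model and retraining.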
| CPC title |
|---|
| involving simulating, designing, planning or modelling of a network |
| by sampling |
| Filtering by address, protocol, port number or service, e.g. IP-address or URL |
| Parsing or analysis of headers |
| Machine learning |