Apparatus and method of constructing neural network translation model
US-2019114545-A1 · Apr 18, 2019 · US
US10764246B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10764246-B2 |
| Application number | US-201816220360-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 14, 2018 |
| Priority date | Aug 14, 2018 |
| Publication date | Sep 1, 2020 |
| Grant date | Sep 1, 2020 |
Official abstract text for this publication.
A computer-implemented method for domain analysis comprises: obtaining, by a computing device, a domain; and inputting, by the computing device, the obtained domain to a trained detection model to determine if the obtained domain was generated by one or more domain generation algorithms. The detection model comprises a neural network model, a n-gram-based machine learning model, and an ensemble layer. Inputting the obtained domain to the detection model comprises inputting the obtained domain to each of the neural network model and the n-gram-based machine learning model. The neural network model and the n-gram-based machine learning model both output to the ensemble layer. The ensemble layer outputs a probability that the obtained domain was generated by the domain generation algorithms.
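The data flow in the abstract (two models scoring the same domain, with an ensemble layer turning their outputs into one probability) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the ensemble layer is a logistic combination, as claim 6 recites for one embodiment, and the weights, the bias, and both `toy_*` scorers are hypothetical placeholders standing in for trained models.

```python
import math
from typing import Callable

# A scorer maps a domain string to a score in [0, 1].
Scorer = Callable[[str], float]


def sigmoid(z: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ensemble_probability(
    domain: str,
    nn_model: Scorer,
    ngram_model: Scorer,
    w_nn: float = 1.0,      # hypothetical ensemble coefficient
    w_ngram: float = 1.0,   # hypothetical ensemble coefficient
    bias: float = -1.0,     # hypothetical intercept
) -> float:
    """Feed the same domain to both models and combine their outputs."""
    z = w_nn * nn_model(domain) + w_ngram * ngram_model(domain) + bias
    return sigmoid(z)


def toy_nn_model(domain: str) -> float:
    """Placeholder for the neural network model: fraction of digit characters."""
    return sum(ch.isdigit() for ch in domain) / max(len(domain), 1)


def toy_ngram_model(domain: str) -> float:
    """Placeholder for the n-gram model: share of distinct adjacent character pairs."""
    pairs = [domain[i:i + 2] for i in range(len(domain) - 1)]
    return len(set(pairs)) / max(len(pairs), 1)


if __name__ == "__main__":
    for d in ("example.com", "x7f9q2kd81.net"):
        p = ensemble_probability(d, toy_nn_model, toy_ngram_model)
        print(f"{d}: {p:.3f}")
```

In a real system the two scorers would be the separately trained models, and the coefficients would be fitted rather than fixed by hand.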
Opening claim text (preview).
The invention claimed is:
1. A computer-implemented method for domain analysis, comprising: obtaining, by a computing device, a domain; and inputting, by the computing device, the obtained domain to a trained detection model to determine if the obtained domain was generated by one or more domain generation algorithms, wherein: the detection model comprises a neural network model, a n-gram-based machine learning model, and an ensemble layer; inputting the obtained domain to the detection model comprises inputting the obtained domain to each of the neural network model and the n-gram-based machine learning model; the neural network model and the n-gram-based machine learning model both output to the ensemble layer; and the ensemble layer outputs a probability that the obtained domain was generated by the domain generation algorithms.
2. The method of claim 1, wherein: obtaining, by the computing device, the domain comprises obtaining, by the computing device, the domain from a log of a local Domain Name Service (DNS) server; and the method further comprises forwarding, by the computing device, the determination to the local DNS server to block queries of the domain.
3. The method of claim 1, wherein: obtaining, by the computing device, the domain comprises obtaining, by the computing device, the domain from an agent software installed on a client device; and the method further comprises forwarding, by the computing device, the determination to the agent software to block communications with an Internet Protocol (IP) address of the domain.
4. The method of claim 1, wherein: obtaining, by the computing device, the domain comprises obtaining, by the computing device, the domain from a log of a network monitoring server; and the method further comprises forwarding, by the computing device, the determination to the network monitoring server to block queries of the domain.
5. The method of claim 1, wherein: the detection model comprises an extra feature layer; inputting the obtained domain to the detection model comprises inputting the obtained domain to the extra feature layer; the extra feature layer outputs to the ensemble layer; the domain is associated with a domain name and a top-level domain (TLD); and the extra feature layer comprises at least one of the following features: a length of the domain name, a length of the TLD, whether the length of the domain name exceeds a domain name threshold, whether the length of the TLD exceeds a TLD threshold, a number of numerical characters in the domain name, whether the TLD contains any numerical character, a number of special characters contained in the domain name, or whether the TLD contains any special character.
6. The method of claim 5, wherein: the ensemble layer comprises a top logistic regression model outputting the probability; the top logistic regression model comprises a plurality of ensemble coefficients respectively associated with the features, the output from the neural network model, and the output from the n-gram-based machine learning model; and the detection model is trained by: training the neural network model and the n-gram-based machine learning model separately; and inputting outputs of the trained neural network model and the trained n-gram-based machine learning model to the top logistic regression model to solve the ensemble coefficients.
7. The method of claim 1, wherein: the neural network model comprises a probability network; the domain is associated with a domain name, a top-level domain (TLD), and a domain length as separate inputs to the probability network; the domain name is inputted to a one-hot encoding layer and a recurrent neural network layer, before being inputted to a dense and batch normalization layer; the TLD is inputted to an embedding and batch normalization layer, before being inputted to the dense and batch normalization layer; the domain length is inputted to the dense and batch normalization layer; and the dense and batch normalization layer outputs a predicted probability that the obtained domain was generated by the domain generation algorithms.
8. The method of claim 7, wherein: the recurrent neural network layer comprises long-short term memory (LSTM) units.
9. The method of claim 1, wherein: the neural network model comprises a representation network; the domain is associated with a domain name and a top-level domain (TLD) as separate inputs to the representation network; the domain name is inputted to an embedding and batch normalization layer and a recurrent neural network layer, before being inputted to a dense and batch normalization layer; the TLD is inputted to an embedding and batch normalization layer, before being inputted to the dense and batch normalization layer; and the dense and batch normalization layer outputs a dense representation of the domain.
10. The method of claim 9, wherein: the recurrent neural network layer comprises gated recurrent units (GRU).
11. The method of claim 1, wherein: the n-gram-based machine learning model comprises a gradient boosting based classifier based on bigram features.
12. The method of claim 1, wherein: the obtained domain comprises one or more Chinese Pinyin elements.
13. A system for domain analysis, comprising a processor and a non-transitory computer-readable storage medium storing instructions that, when executed by the processor, cause the processor to perform a method for domain analysis, the method comprising: obtaining a domain; and inputting the obtained domain to a trained detection model to determine if the obtained domain was generated by one or more domain generation algorithms, wherein: the detection model comprises a neural network model, a n-gram-based machine learning model, and an ensemble layer; inputting the obtained domain to the detection model comprises inputting the obtained domain to each of the neural network model and the n-gram-based machine learning model; the neural network model and the n-gram-based machine learning model both output to the ensemble layer; and the ensemble layer outputs a probability that the obtained domain was generated by the domain generation algorithms.
14. The system of claim 13, wherein: obtaining the domain comprises obtaining the domain from a log of a local Domain Name Service (DNS) server; and the method further comprises forwarding the determination to the local DNS server to block queries of the domain.
15. The system of claim 13, wherein: obtaining the domain comprises obtaining the domain from an agent software installed on a client device; and the method further comprises forwarding the determination to the agent software to block communications with an Internet Protocol (IP) address of the domain.
16. The system of claim 13, wherein: obtaining the domain comprises obtaining the domain from a log of a network monitoring server; and the method further comprises forwarding the determination to the network monitoring server to block queries of the domain.
17. The system of claim 13, wherein: the detection model comprises an extra feature layer; inputting the obtained domain to the detection model comprises inputting the obtained domain to the extra feature layer; the extra feature layer outputs to the ensemble layer; the domain is associated with a domain name and a top-level domain (TLD); and the extra feature layer comprises at least one of the following features: a length of the domain name, a length of the TLD, whether the length of the domain name exceeds a domai
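The hand-crafted inputs recited in the claims (the extra feature layer of claims 5 and 17, and the bigram features of claim 11) can be illustrated with a short sketch. Several choices here are assumptions rather than anything the claims specify: the threshold defaults are hypothetical, the TLD is taken to be the text after the last dot, and a "special character" is taken to mean any non-alphanumeric character.

```python
from collections import Counter
from typing import Dict, Tuple, Union


def split_domain(domain: str) -> Tuple[str, str]:
    """Split into (domain name, TLD) at the last dot (an assumed convention)."""
    domain = domain.strip().lower().rstrip(".")
    if "." not in domain:
        return domain, ""
    name, tld = domain.rsplit(".", 1)
    return name, tld


def extra_features(
    domain: str,
    name_threshold: int = 45,  # hypothetical domain name threshold
    tld_threshold: int = 7,    # hypothetical TLD threshold
) -> Dict[str, Union[int, bool]]:
    """Compute the eight features listed in claim 5 for one domain."""
    name, tld = split_domain(domain)
    return {
        "name_length": len(name),
        "tld_length": len(tld),
        "name_exceeds_threshold": len(name) > name_threshold,
        "tld_exceeds_threshold": len(tld) > tld_threshold,
        "name_digit_count": sum(ch.isdigit() for ch in name),
        "tld_has_digit": any(ch.isdigit() for ch in tld),
        # Dots inside a multi-label name also count as special here.
        "name_special_count": sum(not ch.isalnum() for ch in name),
        "tld_has_special": any(not ch.isalnum() for ch in tld),
    }


def bigram_counts(name: str) -> Counter:
    """Count adjacent character pairs, the raw material for bigram features."""
    return Counter(name[i:i + 2] for i in range(len(name) - 1))


if __name__ == "__main__":
    print(extra_features("abc123-x.com"))
    print(bigram_counts("banana"))
```

Feature vectors of this kind would feed the ensemble layer, while bigram counts would feed a classifier such as the gradient boosting model named in claim 11.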
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Domain name generation or assignment · CPC title
Probabilistic or stochastic networks · CPC title
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title