Detection and classification of exploit kits
US-9825976-B1 · Nov 21, 2017 · US
US10205704B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10205704-B2 |
| Application number | US-201615200530-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 1, 2016 |
| Priority date | Jul 1, 2016 |
| Publication date | Feb 12, 2019 |
| Grant date | Feb 12, 2019 |
Official abstract text for this publication.
Methods and systems for classifying malicious locators. A processor is trained on a set of known malicious locators using a non-supervised learning procedure. Once trained, the processor may classify new locators as being generated by a particular generation kit.
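Before any training can happen, each locator has to be turned into numeric features. The sketch below illustrates a few of the lexical features named in the claims (string length, character frequency distribution, domain levels, number of directories, vowels, and consonants). It is a minimal illustration and not the patented implementation: the function name `extract_features`, the dictionary keys, and the choice to count every non-empty path segment as a directory are all assumptions made here.

```python
from collections import Counter
from urllib.parse import urlsplit

VOWELS = set("aeiou")


def extract_features(url: str) -> dict:
    """Compute simple lexical features of a URL string (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Assumption: every non-empty path segment counts as one "directory".
    segments = [s for s in parts.path.split("/") if s]
    letters = [c for c in url.lower() if c.isalpha()]
    vowels = sum(1 for c in letters if c in VOWELS)
    return {
        "length": len(url),
        "domain_levels": len(host.split(".")) if host else 0,
        "num_directories": len(segments),
        "num_vowels": vowels,
        "num_consonants": len(letters) - vowels,
        # Character frequency distribution over the lower-cased string.
        "char_freq": Counter(url.lower()),
    }


features = extract_features("http://a.example.com/foo/bar/index.php?id=1")
print(features["length"], features["domain_levels"], features["num_directories"])
```

The word-based features in the claims (number of words, number of words from a predetermined list) are left out because they depend on a tokenization and word list that the text shown here does not specify.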
Opening claim text (preview).
What is claimed is:
1. A method for classifying malicious locators accessible through a network, the method comprising: accessing, through an interface to a computer-readable medium, a plurality of locators, wherein each locator comprises the location of a malicious network-accessible resource; extracting at least one feature from each of the plurality of locators, wherein the at least one extracted feature is organized into a binary tree and selected based on a minimum calculated gini entropy; assigning a membership probability to each of the plurality of locators, the membership probability representing a probability a locator was generated by a specific family, wherein different families are generated by different kits; labeling each of the plurality of locators as being generated by a specific family and kit combination based on the at least one extracted feature and the assigned membership probability; providing the at least one extracted feature and the family and kit combination label for each of the plurality of locators to a classification module to train the classification module; and applying the classification module to a second locator to determine a family and kit source of the second locator.
2. The method of claim 1, wherein at least one locator is a uniform resource locator (URL).
3. The method of claim 1, wherein labeling each of the plurality of locators as being generated by a specific family and kit combination includes labeling each of the plurality of locators as being generated by a specific URL-generation kit.
4. The method of claim 1, wherein the label assigned to each of the plurality of locators is based on a highest membership probability for each of the plurality of locators.
5. The method of claim 1, wherein the at least one feature includes one or more of locator string length, character frequency distribution, domain levels, number of directories, number of words, number of words from a predetermined list of words, number of vowels, and number of consonants in the locator.
6. The method of claim 1, further comprising producing weights from the classification module related to each of the at least one feature to assist in determining a family and kit combination for the second locator.
7. The method of claim 1, further comprising issuing a message indicating the family and kit combination of the second locator.
8. The method of claim 1, further comprising classifying the second locator as malicious or non-malicious.
9. A system for classifying malicious locators accessible through a network, the system comprising: an interface to a computer-readable medium configured to access a plurality of locators, each of the plurality of locators comprising the location of a malicious network-accessible resource; a network interface; and a processor in communication with the medium interface and the network interface, the processor configured to: extract at least one feature from each of the plurality of locators, wherein the at least one extracted feature is organized into a binary tree and selected based on a minimum calculated gini entropy; assign a membership probability to each of the plurality of locators, the membership probability representing a probability a locator was generated by a specific family, wherein different families are generated by different kits; label each of the plurality of locators as being generated by a specific family and kit combination based on the at least one extracted feature and the assigned membership probability; and provide the at least one extracted feature and the family and kit combination label for each of the plurality of locators to a classification module to train the classification module so the classification module can determine a family and kit source of a second locator.
10. The system of claim 9, wherein the locator is a uniform resource locator (URL).
11. The system of claim 9, wherein the processor is configured to label each of the plurality of locators as being generated by a specific URL-generation kit.
12. The system of claim 9, wherein the label assigned to each of the plurality of locators is based on a highest membership probability for each of the plurality of locators.
13. The system of claim 9, wherein the at least one feature includes one or more of locator string length, character frequency distribution, domain levels, number of directories, number of words, number of words from a predetermined list of words, number of vowels, and number of consonants in the locator.
14. The system of claim 9, wherein the processor is configured to produce weights related to each of the at least one feature to assist in determining a family and kit combination for the second locator.
15. The system of claim 9, wherein the processor is configured to issue a message indicating the family and kit combination of the second locator.
16. The system of claim 9, wherein the processor is configured to classify the second locator as malicious or non-malicious.
17. The system of claim 9, wherein the processor is further configured to assign weights to the second locator to determine a family the second locator belongs to and further configured to determine a locator generation kit that generated the second locator based on the family.
18. A computer readable medium containing computer-executable instructions for performing a method for classifying malicious locators accessible through a network, the medium comprising: computer-executable instructions for accessing, through an interface to a computer-readable medium, a plurality of locators, wherein each locator comprises the location of a malicious network-accessible resource; computer-executable instructions for extracting at least one feature from each of the plurality of locators, wherein the at least one extracted feature is organized into a binary tree and selected based on a minimum calculated gini entropy; computer-executable instructions for assigning a membership probability to each of the plurality of locators, the membership probability representing a probability a locator was generated by a specific family, wherein different families are generated by different kits; computer-executable instructions for labeling each of the plurality of locators as being generated by a specific family and kit combination based on the at least one extracted feature and the assigned membership probability; computer-executable instructions for providing the at least one extracted feature and the family and kit combination label for each of the plurality of locators to a classification module to train the classification module; and computer-executable instructions for applying the classification module to a second locator to determine a family and kit source of the second locator.
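Two steps in the independent claims lend themselves to a small worked example: selecting a feature by a minimum Gini criterion on a binary split, and labeling a locator by its highest membership probability. The sketch below reads the claims' "gini entropy" as the standard Gini impurity, which is an assumption, as are the function names, the toy feature values, and the family and kit names. How the membership probabilities are produced is not shown here; they are simply given as input.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_impurity(values, labels, threshold):
    """Weighted Gini impurity of the binary split value <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(labels)


def best_feature(rows, labels):
    """Return (feature, threshold, impurity) with the lowest weighted impurity."""
    best = None
    for name in rows[0]:
        values = [r[name] for r in rows]
        # Try each distinct value except the largest as a split threshold.
        for t in sorted(set(values))[:-1]:
            score = split_impurity(values, labels, t)
            if best is None or score < best[2]:
                best = (name, t, score)
    return best


def label_by_membership(probs):
    """Pick the family with the highest membership probability."""
    return max(probs, key=probs.get)


# Toy data: hypothetical feature values and kit labels, for illustration only.
rows = [
    {"length": 40, "num_vowels": 9},
    {"length": 42, "num_vowels": 15},
    {"length": 90, "num_vowels": 10},
    {"length": 95, "num_vowels": 14},
]
labels = ["kit_a", "kit_a", "kit_b", "kit_b"]

print(best_feature(rows, labels))  # ('length', 42, 0.0)
print(label_by_membership({"family_1": 0.2, "family_2": 0.7, "family_3": 0.1}))  # family_2
```

In this toy data, splitting on `length` at 42 separates the two kits perfectly, so it wins over every split on `num_vowels`. A full decision tree would apply the same selection recursively to each side of the split.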
Filtering by address, protocol, port number or service, e.g. IP-address or URL · CPC title
Active attacks involving interception, injection, modification, spoofing of data unit addresses, e.g. hijacking, packet injection or TCP sequence number attacks · CPC title
Traffic logging, e.g. anomaly detection · CPC title