Identifying algorithmically generated domains

US2016352679A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016352679-A1
Application number  US-201514723102-A
Country             US
Kind code           A1
Filing date         May 27, 2015
Priority date       May 27, 2015
Publication date    Dec 1, 2016
Grant date          (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Examples relate to identifying algorithmically generated domains. In one example, a computing device may: receive a query domain name; provide the query domain name as input to a predictive model that has been trained to determine whether the query domain name is an algorithmically generated domain name, the determination being based on syntactic features of the query domain name, the syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one; and receive, as output from the predictive model, data indicating whether the query domain name is algorithmically generated.
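
The syntactic feature named in the abstract, a count of particular character n-grams in a portion of the domain name, can be illustrated with a short sketch. This is not the patent's implementation: the function names char_ngrams and ngram_features, the choice of the leftmost label as the "portion", and the example vocabulary are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Sequence


def char_ngrams(text: str, n: int = 2) -> Counter:
    """Count overlapping character n-grams in text, for n greater than one."""
    if n < 2:
        raise ValueError("n must be a positive integer greater than one")
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def ngram_features(domain: str, vocabulary: Sequence[str], n: int = 2) -> List[int]:
    """Return one count per n-gram in a fixed vocabulary.

    Assumption: the "portion" of the domain name is its leftmost label,
    so "banana.com" contributes only "banana".
    """
    label = domain.lower().rstrip(".").split(".")[0]
    counts = char_ngrams(label, n)
    # Counter returns 0 for n-grams that never occur in the label.
    return [counts[gram] for gram in vocabulary]


# Hypothetical vocabulary of three bigrams.
features = ngram_features("banana.com", ["an", "na", "zz"])
print(features)
```

A vector like this, one count per chosen n-gram, is the kind of input a predictive model could be trained on.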

First claim

Opening claim text (preview).

We claim:

1. A non-transitory machine-readable storage medium encoded with instructions executable by a hardware processor of a computing device for identifying algorithmically generated domains, the machine-readable storage medium comprising instructions to cause the hardware processor to: receive a first set of domain names, each domain name in the first set being a domain name that was previously identified as valid; receive a second set of domain names, each domain name in the second set being a domain name that was previously identified as algorithmically generated; and train, using the first set and second set, a predictive model to identify a given domain name as one of a valid domain name or an algorithmically generated domain name based on a plurality of syntactic features, the plurality of syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one.

2. The storage medium of claim 1, wherein the instructions further cause the hardware processor to: receive a query domain name; provide the query domain name to the predictive model as input; and receive, as output from the predictive model, a prediction specifying that the query domain name is one of i) a valid domain name, or ii) an algorithmically generated domain name.

3. The storage medium of claim 2, wherein, responsive to the output specifying that the query domain name is an algorithmically generated domain name, the instructions further cause the hardware processor to: provide a third party computing device with data indicating that the query domain name is algorithmically generated.

4. The storage medium of claim 2, wherein each query domain name is received by: receiving a DNS query packet; and extracting, from the DNS query packet, the query domain name.

5. The storage medium of claim 1, wherein: each domain name in the second set was previously identified as being generated by one of a plurality of domain name generation algorithms; and identifying the given domain name as an algorithmically generated domain name includes identifying the given domain name as being generated by one of the plurality of domain name generation algorithms.

6. The storage medium of claim 1, wherein the instructions further cause the hardware processor to: obtain a third set of domain names, each domain name in the third set being a domain name that was previously identified as being generated by a particular domain name generation algorithm; and train, using the third set, a second predictive model to determine whether a particular domain name was generated by the particular domain name generation algorithm, the determination being based on at least one of the plurality of syntactic features.

7. A computing device for identifying algorithmically generated domains, the computing device comprising: a hardware processor; and a data storage device storing instructions that, when executed by the hardware processor, cause the hardware processor to: receive a query domain name; provide the query domain name as input to a predictive model that has been trained to determine whether the query domain name is an algorithmically generated domain name, the determination being based on syntactic features of the query domain name, the syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one; and receive, as output from the predictive model, data indicating whether the query domain name is algorithmically generated.

8. The computing device of claim 7, wherein the instructions further cause the hardware processor to: provide the query domain name as input to a second predictive model that has been trained to determine whether the query domain name was generated by a particular domain name generation algorithm, the determination being based on at least one of the syntactic features of the query domain name; and receive, as output from the second predictive model, data indicating whether the query domain name was generated by the particular domain name generation algorithm.

9. The computing device of claim 7, wherein the data indicating whether the query domain name is algorithmically generated specifies that the query domain name was generated by a particular domain name generation algorithm.

10. The computing device of claim 7, wherein the instructions further cause the hardware processor to: provide the query domain name as input to a second predictive model that has been trained to determine which domain name generation algorithm of a plurality of domain name generation algorithms was used to generate the query domain name, the determination being based on at least one of the syntactic features of the query domain name; and receive, as output from the second predictive model, data indicating one of the plurality of domain name generation algorithms.

11. The computing device of claim 7, wherein each query domain name is received by: receiving a DNS query packet; and extracting, from the DNS query packet, the query domain name.

12. The computing device of claim 7, wherein the data indicating whether the query domain name is algorithmically generated specifies a measure of likelihood that the domain name is algorithmically generated.

13. The computing device of claim 10, wherein the data indicating one of the plurality of domain name generation algorithms specifies, for each of at least one of the plurality of domain name generation algorithms, a measure of likelihood that the query domain name was generated by the domain name generation algorithm.

14. A method for identifying algorithmically generated domains, implemented by a hardware processor, the method comprising: receiving, from a client device, a domain name system (DNS) query packet, the DNS query packet including a query domain name; determining whether the query domain name is an algorithmically generated domain name, the determination being based on syntactic features of the query domain name, the syntactic features including a count of particular character n-grams included in at least a portion of the query domain name, where n is a positive integer greater than one; and providing output indicating whether the query domain name is algorithmically generated.

15. The method of claim 14, wherein: determining whether the query domain name is an algorithmically generated domain name comprises determining which domain generation algorithm (DGA) of a plurality of DGAs was used to generate the query domain name, and the output indicates which of the plurality of DGAs was used to generate the query domain name.
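
Claims 1, 2, and 12 describe training a predictive model on a set of valid names and a set of algorithmically generated names, then returning a prediction or a measure of likelihood for a query name. The sketch below shows one way such a flow could look. The claim text here does not name a model type, so a multinomial naive Bayes classifier over bigram counts is substituted purely as an illustration. The class NgramNaiveBayes, its methods, and the toy training strings are all hypothetical and are not taken from the patent.

```python
import math
from collections import Counter
from typing import List, Sequence


def bigrams(domain: str) -> List[str]:
    """Character bigrams of the leftmost label (an assumed choice of 'portion')."""
    label = domain.lower().rstrip(".").split(".")[0]
    return [label[i:i + 2] for i in range(len(label) - 1)]


class NgramNaiveBayes:
    """Illustrative two-class model: 'valid' versus 'dga'."""

    def fit(self, valid: Sequence[str], generated: Sequence[str]) -> "NgramNaiveBayes":
        self.counts = {"valid": Counter(), "dga": Counter()}
        for name in valid:
            self.counts["valid"].update(bigrams(name))
        for name in generated:
            self.counts["dga"].update(bigrams(name))
        self.vocab = set(self.counts["valid"]) | set(self.counts["dga"])
        self.totals = {c: sum(k.values()) for c, k in self.counts.items()}
        n = len(valid) + len(generated)
        self.log_prior = {
            "valid": math.log(len(valid) / n),
            "dga": math.log(len(generated) / n),
        }
        return self

    def _log_score(self, domain: str, cls: str) -> float:
        # Laplace smoothing, with one extra slot for bigrams unseen in training.
        v = len(self.vocab) + 1
        score = self.log_prior[cls]
        for gram in bigrams(domain):
            score += math.log((self.counts[cls][gram] + 1) / (self.totals[cls] + v))
        return score

    def likelihood_dga(self, domain: str) -> float:
        """Posterior probability that the domain is algorithmically generated."""
        a = self._log_score(domain, "dga")
        b = self._log_score(domain, "valid")
        m = max(a, b)  # subtract the max before exponentiating for stability
        ea, eb = math.exp(a - m), math.exp(b - m)
        return ea / (ea + eb)


# Toy, made-up training sets for illustration only; not real labeled data.
valid_set = ["example.com", "weather.org", "library.net", "university.edu"]
dga_set = ["xkqzvbtw.com", "qpwzjxkv.net", "zxqvkjwp.org", "vbqxzkwj.info"]

model = NgramNaiveBayes().fit(valid_set, dga_set)
for query in ["example.com", "xkqzvbtw.com"]:
    label = "dga" if model.likelihood_dga(query) > 0.5 else "valid"
    print(query, label)
```

A per-algorithm variant in the spirit of claims 5, 10, and 15 could keep one class per generation algorithm instead of a single "dga" class; that extension is not shown here.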

Assignees

Hewlett Packard Development Co Lp

Inventors

Classifications

  • Electricity · mapped topic

  • Event detection, e.g. attack signature detection · CPC title

  • using domain name system [DNS] · CPC title

  • Detection or countermeasures against botnets · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016352679A1 cover?
Examples relate to identifying algorithmically generated domains. In one example, a computing device may: receive a query domain name; provide the query domain name as input to a predictive model that has been trained to determine whether the query domain name is an algorithmically generated domain name, the determination being based on syntactic features of the query domain name, the syntactic…
Who is the assignee on this patent?
Hewlett Packard Development Co Lp
What technology area does this patent fall under?
Primary CPC classification H04L61/1511. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Dec 1, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).