What technology area does this patent fall under?

Primary CPC classification G06N20/00. Mapped technology areas include Physics.

When was this patent published?

Publication date Mar 29, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Domain classification

US11288594B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11288594-B2
Application number	US-201815892088-A
Country	US
Kind code	B2
Filing date	Feb 8, 2018
Priority date	Aug 31, 2015
Publication date	Mar 29, 2022
Grant date	Mar 29, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one example in accordance with the present disclosure, a method for domain classification includes sorting a set of sample domains into leaves based on syntactical features of the domains. Each sample domain belongs to a family of domains. The method also includes identifying, for each leaf, a regular expression for each family with at least one domain in the leaf. The method also includes determining, for each leaf, at least one lobe with a set of domains in the leaf that matches the regular expression for a first family with at least one domain in the leaf, and that does not match the regular expression for the other families with at least one domain in the leaf. The method also includes creating a classifier for the domains in each lobe by using the set of domains from each family in the lobe as training classes for machine learning.
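The lobe step in the abstract can be illustrated with a minimal sketch. It assumes that the per-family regular expressions are already available (the abstract does not say how they are derived), and it groups the domains of one leaf by the set of family expressions each domain matches. The function name `split_into_lobes`, the family names, the sample domains, and the regular expressions are all hypothetical and are not taken from the patent.

```python
import re
from collections import defaultdict


def split_into_lobes(leaf, family_regexes):
    """Group (domain, family) pairs by the set of family regexes each domain matches.

    Each distinct set of matching families is treated as one lobe (an assumption
    made for this sketch, not the patent's stated procedure).
    """
    compiled = {fam: re.compile(rx) for fam, rx in family_regexes.items()}
    lobes = defaultdict(list)
    for domain, family in leaf:
        # The signature records which family expressions accept this domain.
        signature = frozenset(
            fam for fam, rx in compiled.items() if rx.fullmatch(domain)
        )
        lobes[signature].append((domain, family))
    return dict(lobes)


# Hypothetical leaf: every domain has TLD "com" and a 6-character first label.
leaf = [
    ("abc123.com", "family_a"),
    ("qwerty.com", "family_b"),
    ("x9y8z7.com", "family_a"),
    ("zzzzzz.com", "family_b"),
    ("mnbvcx.com", "family_a"),
]

# Hypothetical per-family regular expressions.
family_regexes = {
    "family_a": r"[a-z0-9]{6}\.com",
    "family_b": r"[a-z]{6}\.com",
}

lobes = split_into_lobes(leaf, family_regexes)
for signature, members in lobes.items():
    print(sorted(signature), [domain for domain, _ in members])
```

In this sketch the lobe whose signature contains only `family_a` corresponds to domains that match one family's expression and no other. The lobe whose signature contains both families is the case where the regular expressions alone cannot separate the families; there, the domains from each family would serve as training classes for a classifier, which is not shown here.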

First claim

Opening claim text (preview).

The invention claimed is:
1. A method for domain classification, the method comprising: sorting, by a processor, a set of sample domains into a plurality of leaves based on syntactical features of the sample domains, wherein each sample domain belongs to a family of domains; identifying, for each leaf of the plurality of leaves, a regular expression for each family of domains with at least one domain in the leaf; determining a plurality of lobes for each leaf of the plurality of leaves, at least one lobe of the plurality of lobes having a set of domains in the leaf that matches a regular expression for a first family of domains with at least one domain in the leaf, and that does not match a regular expression for other families of domains with at least one domain in the leaf; creating, by the processor, a classifier for each lobe of the plurality of lobes by using domains from each family of domains in the lobe as training classes for machine learning; receiving network traffic over a computer network; and analyzing the network traffic using a classifier for at least one lobe of the plurality of lobes to identify an algorithmically-generated domain employed by malware of an infected host on the computer network.
2. The method of claim 1, wherein the syntactical features are defined by a 4-tuple of a top level domain, a length of a first private domain, a length of a prefix and a total number of levels below the top level domain.
3. The method of claim 1, wherein, for each leaf of the plurality of leaves, a regular expression for each family of domains with at least one domain in the leaf codifies domains within the leaf that are from a particular family of domains.
4. The method of claim 1 further comprising: receiving, by the processor, an unclassified domain from the network traffic; determining, by the processor, a leaf that matches the unclassified domain; determining, by the processor, the at least one lobe that matches the unclassified domain; and applying, by the processor, the classifier for the at least one lobe to the unclassified domain.
5. The method of claim 1, further comprising: calculating, by the processor, a probability that an unclassified domain from the network traffic belongs to a family of domains used to train the classifier for the at least one lobe.
6. The method of claim 1, wherein at least one family of sample domains of the set of sample domains is designated as one of a malicious family or a benign family of domains.
7. The method of claim 1, wherein at least one domain from the network traffic is classified as being benign.
8. The method of claim 1, further comprising: determining for each leaf of the plurality of leaves, a union and an intersection of regular expressions of families of domains with at least one domain in the leaf.
9. A system for domain classification, the system comprising at least one processor and a memory, the memory storing instructions that when executed by the at least one processor cause the system to: determine a value for each domain in a set of sample domains based on syntactical features of the sample domains; create at least one leaf of domains, wherein all domains in the leaf have a same value; identify, for each leaf, a regular expression for each family of domains containing at least one domain in the leaf; determine, for each leaf, a plurality of lobes, at least one lobe of the plurality of lobes having of possible combinations of the regular expressions and a complement of regular expressions for families of domains compatible with at least one domain in the leaf; and create a classifier for each lobe of the plurality of lobes by using domains from each family of domains in the lobe as training classes for machine learning of the classifier to classify an unclassified domain as an algorithmically-generated domain used by a malware of an infected host on a computer network.
10. The system of claim 9, wherein each family of sample domains of the set of sample domains has a set of possible values and each leaf consists of domains with values that are possible for the domains in the leaf.
11. The system of claim 9, wherein the syntactical features are defined by a 4-tuple of a top level domain, a length of a first private domain, a length of a prefix and a total number of levels below the top level domain.
12. A non-transitory machine-readable storage medium comprising instructions executable by a processor of a computing device, the machine-readable storage medium comprising instructions to: sort a set of domains into a plurality of leaves based on syntactical features of the domains; identify each family of domains in each leaf, wherein at least one family of domains defines a set of domain generating algorithms; identify a regular expression for each family of domains in each leaf; determine, for each leaf, a plurality of lobes, at least one lobe of the plurality of lobes having regular expressions and a complement of the regular expressions for families of domains compatible with at least one domain in the leaf; and create a classifier for each lobe of the plurality of lobes by using domains from each family of domains in the lobe as training classes for machine learning of the classifier to classify an unclassified domain as an algorithmically-generated domain used by a malware on an infected host on a computer network.
13. The non-transitory machine-readable storage medium of claim 12, wherein the syntactical features are defined by a 4-tuple of a top level domain, a length of a first private domain, a length of a prefix and a total number of levels below the top level domain.
14. The non-transitory machine-readable storage medium of claim 12, further comprising instructions to: receive a test domain; determine a lobe of the plurality of lobes that matches the test domain; and apply a classifier for the determined lobe to the test domain.
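The claims define the syntactical features as a 4-tuple of top level domain, length of the first private domain, length of the prefix, and number of levels below the top level domain. The sketch below shows one possible reading of that tuple and how it could serve as a leaf key. It rests on several assumptions that the claims do not confirm: the top level domain is taken naively as the last label (no public suffix list is consulted), the "first private domain" is read as the label directly below it, and the "prefix" is read as everything to the left of that label. The names `leaf_key` and `sort_into_leaves` and the sample data are hypothetical.

```python
from collections import defaultdict


def leaf_key(domain):
    """Return a 4-tuple (tld, len(private label), len(prefix), levels below the TLD).

    Assumed interpretation: the TLD is the last label, the first private
    domain is the label just below it, and the prefix is whatever precedes
    that label (dots included).
    """
    labels = domain.lower().rstrip(".").split(".")
    tld = labels[-1]
    below_tld = labels[:-1]
    private = below_tld[-1] if below_tld else ""
    prefix = ".".join(below_tld[:-1])
    return (tld, len(private), len(prefix), len(below_tld))


def sort_into_leaves(samples):
    """Group (domain, family) samples so that all domains in a leaf share one key."""
    leaves = defaultdict(list)
    for domain, family in samples:
        leaves[leaf_key(domain)].append((domain, family))
    return dict(leaves)


# Hypothetical labelled samples.
samples = [
    ("abc123.net", "family_a"),
    ("qwerty.net", "family_b"),
    ("www.example.com", "benign"),
]

leaves = sort_into_leaves(samples)
for key, members in leaves.items():
    print(key, [domain for domain, _ in members])
```

Under this reading, an unclassified domain would be routed by computing its key, looking up the matching leaf, and then selecting the lobe and classifier within that leaf. A production version would likely need a public suffix list to handle multi-label suffixes such as `co.uk`.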

Assignees

Trend Micro Inc

Inventors

Classifications

G06N7/01
Probabilistic graphical models, e.g. probabilistic networks · CPC title
H04L2101/604
Address structures or formats · CPC title
H04L63/1408
by monitoring network traffic (monitoring network traffic per se H04L43/00) · CPC title
H04L63/1416
Event detection, e.g. attack signature detection · CPC title
G06N20/00 · Primary
Machine learning · CPC title

Patent family

Related publications grouped by family.

View patent family 58188737

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11288594B2 cover?: In one example in accordance with the present disclosure, a method for domain classification includes sorting a set of sample domains into leaves based on syntactical features of the domains. Each sample domain belongs to a family of domains. The method also includes identifying, for each leaf, a regular expression for each family with at least one domain in the leaf. The method also includes d…
Who is the assignee on this patent?: Trend Micro Inc

Related patents

Identification of a dns packet as malicious based on a value

Method of and system for crawling a web resource

Training machine learning models for open-domain question answering system