Domain malware family classification

US2023114721A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2023114721-A1
Application number: US-202117500018-A
Country: US
Kind code: A1
Filing date: Oct 13, 2021
Priority date: Oct 13, 2021
Publication date: Apr 13, 2023
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for classifying domains to malware families includes identifying a corpus of malicious domains, identifying one or more suspicious domains, extracting a timeframe corresponding to the one or more suspicious domains, calculating a rank coefficient between the one or more suspicious domains and a current seed domain of the corpus of malicious domains, determining whether the rank correlation coefficient exceeds a rank threshold for the one or more suspicious domains, comparing a number of suspicious domains whose correlation coefficients exceed the rank threshold to a relation threshold, and responsive to determining the number of suspicious domains whose correlation coefficients exceed the rank threshold exceeds the relation threshold, applying a tag to the suspicious domains indicating that the one or more suspicious domains correspond to a same malware family as the current seed domain.
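The steps in the abstract can be sketched in a few lines of Python. This is an illustrative reading only, not the applicant's implementation. It assumes that each domain is represented by a series of per-interval query counts over the extracted timeframe, that the rank coefficient is Spearman's rho, and that only the suspicious domains that pass the rank threshold receive the tag. The function names, the threshold values, and the sample data are all hypothetical.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant series counts as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def tag_related(
    seed_series: Sequence[float],
    suspicious: Dict[str, Sequence[float]],
    family: str,
    rank_threshold: float = 0.8,   # hypothetical value
    relation_threshold: int = 1,   # hypothetical value
) -> Dict[str, str]:
    """Tag suspicious domains with the seed's family if enough of them correlate."""
    matched = [
        name
        for name, series in suspicious.items()
        if spearman(series, seed_series) > rank_threshold
    ]
    counter = len(matched)  # times the coefficient exceeded the rank threshold
    if counter > relation_threshold:
        return {name: family for name in matched}
    return {}


# Hypothetical query counts per interval for one seed and three suspects.
seed = [1, 5, 2, 8, 3]
suspects = {
    "a.example": [2, 10, 4, 16, 6],
    "b.example": [10, 50, 20, 90, 30],
    "c.example": [9, 1, 7, 2, 8],
}
tags = tag_related(seed, suspects, family="family-x")
```

With this sample data, `a.example` and `b.example` rise and fall in the same order as the seed, so two domains pass the rank threshold and both are tagged. Raising `relation_threshold` to 2 would leave every domain untagged.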

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented for classifying domains to malware families, the method comprising: identifying a corpus of malicious domains; identifying one or more suspicious domains; extracting a timeframe corresponding to the one or more suspicious domains; calculating a rank coefficient between the one or more suspicious domains and a current seed domain of the corpus of malicious domains; determining whether the rank correlation coefficient exceeds a rank threshold for the one or more suspicious domains; comparing a number of suspicious domains whose correlation coefficients exceed the rank threshold to a relation threshold; and responsive to determining the number of suspicious domains whose correlation coefficients exceed the rank threshold exceeds the relation threshold, applying a tag to the suspicious domains indicating that the one or more suspicious domains correspond to a same malware family as the current seed domain.

2. The computer implemented method of claim 1, further comprising incrementing a counter corresponding to a number of times the rank correlation coefficient for a domain exceeds a rank threshold.

3. The computer implemented method of claim 2, wherein comparing a number of suspicious domains whose correlation coefficient exceeds the rank threshold to a relation threshold includes comparing a current count corresponding to the counter to the relation threshold.

4. The computer implemented method of claim 1, further comprising constructing one or more feature vectors corresponding to the one or more suspicious domains.

5. The computer implemented method of claim 4, further comprising clustering the feature vectors.

6. The computer implemented method of claim 5, further comprising determining a distance from a suspicious domain's feature vector to one or more cluster centers corresponding to the clustered feature vectors.

7. The computer implemented method of claim 6, further comprising determining a cluster center to which the one or more feature vectors are closest.

8. A computer program product for classifying domains to malware families, the computer program product comprising: one or more computer readable storage media and program instructions stored on the one or more computer readable storage media, the program instructions comprising instructions to: identify a corpus of malicious domains; identify one or more suspicious domains; extract a timeframe corresponding to the one or more suspicious domains; calculate a rank coefficient between the one or more suspicious domains and a current seed domain of the corpus of malicious domains; determine whether the rank correlation coefficient exceeds a rank threshold for the one or more suspicious domains; compare a number of suspicious domains whose correlation coefficients exceed the rank threshold to a relation threshold; and responsive to determining the number of suspicious domains whose correlation coefficients exceed the rank threshold exceeds the relation threshold, apply a tag to the suspicious domains indicating that the one or more suspicious domains correspond to a same malware family as the current seed domain.

9. The computer program product of claim 8, further comprising instructions to increment a counter corresponding to a number of times the rank correlation coefficient for a domain exceeds a rank threshold.

10. The computer program product of claim 9, wherein comparing a number of suspicious domains whose correlation coefficient exceeds the rank threshold to a relation threshold includes comparing a current count corresponding to the counter to the relation threshold.

11. The computer program product of claim 8, further comprising instructions to construct one or more feature vectors corresponding to the one or more suspicious domains.

12. The computer program product of claim 11, further comprising instructions to cluster the one or more feature vectors.

13. The computer program product of claim 12, further comprising instructions to determine a distance from a suspicious domain's feature vector to one or more cluster centers corresponding to the clustered feature vectors.

14. The computer program product of claim 13, further comprising instructions to determine a cluster center to which the one or more feature vectors are closest.

15. A computer system for, the computer system comprising: one or more computer processors; one or more computer-readable storage media; program instructions stored on the computer-readable storage media for execution by at least one of the one or more processors, the program instructions comprising instructions to: identify a corpus of malicious domains; identify one or more suspicious domains; extract a timeframe corresponding to the one or more suspicious domains; calculate a rank coefficient between the one or more suspicious domains and a current seed domain of the corpus of malicious domains; determine whether the rank correlation coefficient exceeds a rank threshold for the one or more suspicious domains; compare a number of suspicious domains whose correlation coefficients exceed the rank threshold to a relation threshold; and responsive to determining the number of suspicious domains whose correlation coefficients exceed the rank threshold exceeds the relation threshold, apply a tag to the suspicious domains indicating that the one or more suspicious domains correspond to a same malware family as the current seed domain.

16. The computer system of claim 15, further comprising instructions to increment a counter corresponding to a number of times the rank correlation coefficient for a domain exceeds a rank threshold.

17. The computer system of claim 15, further comprising instructions to construct one or more feature vectors corresponding to the one or more suspicious domains.

18. The computer system of claim 17, further comprising instructions to cluster the one or more feature vectors.

19. The computer system of claim 18, further comprising instructions to determine a distance from a suspicious domain's feature vector to one or more cluster centers corresponding to the clustered feature vectors.

20. The computer system of claim 19, further comprising instructions to determine a cluster center to which the one or more feature vectors are closest.
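The dependent claims on feature vectors, clustering, and distance to cluster centers can be illustrated with a small sketch. The claims shown here do not say which features or which clustering algorithm are used, so everything below is an assumption: the three lexical features in `feature_vector` are hypothetical, plain k-means with Euclidean distance stands in for the clustering step, and the initial centers are simply the first `k` vectors so that the run is deterministic.

```python
import math
from typing import List, Sequence, Tuple


def feature_vector(domain: str) -> List[float]:
    """Hypothetical features of the first label: length, digit ratio, distinct-character ratio."""
    label = domain.split(".")[0]
    n = max(len(label), 1)
    digits = sum(ch.isdigit() for ch in label)
    return [float(len(label)), digits / n, len(set(label)) / n]


def nearest_center(
    vector: Sequence[float], centers: Sequence[Sequence[float]]
) -> Tuple[int, float]:
    """Return the index of the closest center and the Euclidean distance to it."""
    distances = [math.dist(vector, center) for center in centers]
    idx = min(range(len(centers)), key=distances.__getitem__)
    return idx, distances[idx]


def kmeans(
    vectors: Sequence[Sequence[float]], k: int, iterations: int = 20
) -> List[List[float]]:
    """Plain k-means; seeds the centers with the first k vectors (an assumption)."""
    centers = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for v in vectors:
            idx, _ = nearest_center(v, centers)
            groups[idx].append(v)
        # Each new center is the column-wise mean of its group; an empty
        # group keeps its previous center.
        new_centers = [
            [sum(col) / len(group) for col in zip(*group)] if group else centers[i]
            for i, group in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Toy 2-D vectors forming two obvious groups.
points = [[0, 0], [10, 10], [0, 1], [10, 11]]
centers = kmeans(points, k=2)
closest, distance = nearest_center([1, 1], centers)
```

In practice the features would likely need scaling before clustering, since a raw length value would otherwise dominate the two ratio features in the distance calculation.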

Assignees

Inventors

Classifications

  • Traffic logging, e.g. anomaly detection · CPC title

  • Clustering techniques · CPC title

  • for managing network security; network security policies in general (filtering policies H04L63/0227) · CPC title

  • the attack involving the propagation of malware through the network, e.g. viruses, trojans or worms · CPC title

  • Event detection, e.g. attack signature detection · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023114721A1 cover?
A method for classifying domains to malware families includes identifying a corpus of malicious domains, identifying one or more suspicious domains, extracting a timeframe corresponding to the one or more suspicious domains, calculating a rank coefficient between the one or more suspicious domains and a current seed domain of the corpus of malicious domains, determining whether the rank correla…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification H04L63/1425. Mapped technology areas include Electricity.
When was this patent published?
Publication date Apr 13, 2023 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).