Systems and methods for determining and detecting malware families
US-2024281531-A1 · Aug 22, 2024 · US
US12505207B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12505207-B2 |
| Application number | US-202318399250-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 28, 2023 |
| Priority date | Dec 28, 2023 |
| Publication date | Dec 23, 2025 |
| Grant date | Dec 23, 2025 |
Official abstract text for this publication.
Systems and methods implement artificial intelligence to automatically generate malware detection rules. In a first phase, an AI model is trained on large amounts of data, so the AI model can learn to distinguish between benign applications and different malware families. In a second phase, the AI model is queried with new malware samples, and systems and methods propose new malware detection rules for those samples.
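The rule-proposal step can be illustrated with a small sketch based on the wording of claim 3: find the features present in every sample of a family that do not all appear together in any other application. This is an illustrative simplification, not the patented method. The function name `propose_rule`, the feature strings, and the choice to treat features as an unordered set checked sample by sample are all assumptions; the disclosure relies on a trained transformer model to select the relevant behavior.

```python
def propose_rule(family_samples, other_samples):
    """Return the features shared by every family sample, or None.

    Hypothetical helper: each sample is a list of behavior feature names.
    A candidate rule is rejected if any sample from another family
    contains all of its features (it would match a non-member).
    """
    if not family_samples:
        return None
    # Features present in all samples of the family.
    common = frozenset.intersection(*map(frozenset, family_samples))
    if not common:
        return None
    # Reject the rule if some other application exhibits every feature.
    if any(common <= frozenset(sample) for sample in other_samples):
        return None
    return common


# Made-up behavior features, for illustration only.
family = [
    ["file_encrypt", "shadow_copy_delete", "ransom_note_write"],
    ["file_encrypt", "shadow_copy_delete", "network_connect"],
]
others = [
    ["file_encrypt", "network_connect"],
    ["shadow_copy_delete", "registry_set"],
]
rule = propose_rule(family, others)
overlapping = propose_rule(family, others + [["file_encrypt", "shadow_copy_delete"]])
```

Here `rule` holds the two features common to both family samples, while `overlapping` is `None` because one of the other applications shows the same combination. A rule of this shape is a plain feature match, so it could be evaluated without an AI model at detection time.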
Opening claim text (preview).
The invention claimed is:

1. A computer-implemented method of protecting a system endpoint, the method comprising: logging behavior information of an application executing on an endpoint; generating a sequence of vectors for the behavior information including: creating a process behavior graph depicting all processes of the application, relationships between processes, and behavior of individual processes, and transforming the process behavior graph into a linear sequence of elements, wherein each element represents a node from the process behavior graph; pretraining an AI model on the linear sequence of elements, wherein the AI model is a foundational transformer neural network; tuning the AI model according to a malware family prediction or a benign application; querying the AI model with a new malware sample; proposing a malware detection rule based on the new malware sample, the malware detection rule including the malware family prediction; and backpropagating the malware family prediction through the AI model to adjust the AI model, including by identifying at least one feature of the queried new malware sample that most contributed to the malware family prediction.

2. The method of claim 1, wherein the transformer neural network uses an attention mechanism, and wherein backpropagating the malware family prediction includes: determining which of a plurality of input tokens are prioritized according to an attention score; determining most relevant behavior events according to the attention score; and ignoring input tokens with a low attention score.

3. The method of claim 1, further comprising: determining a common pattern across the new malware samples to find a sequence of features that are present in all of the new malware samples of a family but are not all present in any other application family.

4. The method of claim 1, wherein training the AI model on the linear sequence of elements includes pretraining tasks including at least one of: masked event prediction; masked feature prediction; next sequence prediction; binary classification; or classification into application type.

5. The method of claim 1, further comprising: detecting the new malware sample executed on the endpoint; gathering behavior information of the new malware sample; extracting relevant behavior information by the AI model; determining the malware family of the new malware sample; fetching a behavior pattern for the malware family for all samples in a malware family cluster; and determining a malware detection rule that matches the behavior pattern for all samples in the malware family cluster.

6. The method of claim 5, wherein determining the malware detection rule that matches the behavior pattern for all samples in the malware family cluster further comprises: splitting the malware family into a first subgroup and a second subgroup; determining a first malware detection rule that matches the behavior pattern for all samples in the first subgroup; and determining a second malware detection rule that matches the behavior pattern for all samples in the second subgroup.

7. The method of claim 1, further comprising providing a lightweight malware detection engine with the malware detection rule, wherein the lightweight malware detection engine does not implement an AI module.

8. The method of claim 7, further comprising providing the malware detection rule to an analyst user for approval prior to providing the lightweight malware detection engine with the malware detection rule.

9. The method of claim 1, wherein querying the AI model with the new malware sample includes extracting a plurality of relevant portions of a behavior pattern of the new malware sample, and wherein proposing the malware detection rule based on the new malware sample includes associating the relevant portions of the behavior pattern to common behavior of the malware family prediction.

10. A system for protecting a system endpoint, comprising: an endpoint database configured to store behavior information of an application executing on an endpoint; a proposal subsystem including: at least one processor operably coupled to memory; instructions stored in memory that, when executed by the at least one processor, cause the at least one processor to implement: a representation engine configured to: generate a sequence of vectors for the behavior information including: creating a process behavior graph depicting all processes of the application, relationships between processes, and behavior of individual processes, and transforming the process behavior graph into a linear sequence of elements, wherein each element represents a node from the process behavior graph; a training engine configured to: pretrain an AI model on the linear sequence of elements, wherein the AI model is a foundational transformer neural network, and tune the AI model according to a malware family prediction or a benign application, the AI model configured to be queried with a new malware sample; and a rules generation engine configured to: propose a malware detection rule based on the new malware sample, the malware detection rule including the malware family prediction, and backpropagate the malware family prediction through the AI model to adjust the AI model, including by identifying at least one feature of the queried new malware sample that most contributed to the malware family prediction.

11. The system of claim 10, wherein the transformer neural network uses an attention mechanism, and wherein the rules generation engine is further configured to backpropagate the malware family prediction including: determining which of a plurality of input tokens are prioritized according to an attention score, determining most relevant behavior events according to the attention score; and ignoring input tokens with a low attention score.

12. The system of claim 10, wherein the rules generation engine is further configured to: determine a common pattern across the new malware samples to find a sequence of features that are present in all of the new malware samples of a family but are not all present in any other application family.

13. The system of claim 10, wherein the training engine is further configured for pretraining tasks including at least one of: masked event prediction; masked feature prediction; next sequence prediction; binary classification; or classification into application type.

14. The system of claim 10, wherein the AI model configured to be queried with the new malware sample applies the AI model to extract relevant behavior information and determine the malware family of the new malware sample, wherein the rules generation engine is further configured to: fetch a behavior pattern for the malware family for all samples in a malware family cluster, and determine a malware detection rule that matches the behavior pattern for all samples in the malware family cluster.

15. The system of claim 14, wherein the rules generation engine is further configured to determine the malware detection rule that matches the behavior pattern for all samples in the malware family cluster including: splitting the malware family into a first subgroup and a second subgroup; determining a first malware detection rule that matches the behavior pattern for all samples in the first subgroup; and determining a second malware detection rule that matches the behavior pattern for all samples in the second subgroup.

16. The system of claim 10
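Claims 1 and 10 recite transforming a process behavior graph into a linear sequence in which each element represents one node. The sketch below shows one way this could look, under stated assumptions: the graph is modeled as a parent-child process tree, and nodes are emitted in depth-first pre-order. The `ProcessNode` class, the `linearize` function, the traversal order, and the sample process and event names are all hypothetical; the claims do not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessNode:
    """One process in a hypothetical process behavior graph."""
    name: str
    events: list[str] = field(default_factory=list)  # behavior of this process
    children: list["ProcessNode"] = field(default_factory=list)  # spawned processes


def linearize(root: ProcessNode) -> list[dict]:
    """Flatten the graph into a sequence with one element per node.

    Depth-first pre-order is an assumption made for this sketch; the
    claims only require that each element represent a node of the graph.
    """
    sequence = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        sequence.append(
            {"process": node.name, "depth": depth, "events": list(node.events)}
        )
        # Push children in reverse so they are visited in their original order.
        for child in reversed(node.children):
            stack.append((child, depth + 1))
    return sequence


# Made-up process tree, for illustration only.
root = ProcessNode("dropper.exe", ["file_write"], [
    ProcessNode("cmd.exe", ["process_create"], [
        ProcessNode("powershell.exe", ["network_connect"]),
    ]),
    ProcessNode("reg.exe", ["registry_set"]),
])
order = [element["process"] for element in linearize(root)]
```

Keeping the depth alongside each element preserves part of the parent-child relationship after flattening. Each element would still need to be encoded as a vector before it could be fed to a transformer, a step this sketch leaves out.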
Virus type analysis · CPC title
Test or assess software · CPC title
Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities · CPC title
involving long-term monitoring or reporting · CPC title