Heidi: ML on hypervisor dynamic analysis data for malware classification

US12561434B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12561434-B2
Application number  US-202217715572-A
Country             US
Kind code           B2
Filing date         Apr 7, 2022
Priority date       Apr 7, 2022
Publication date    Feb 24, 2026
Grant date          Feb 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present application discloses a method, system, and computer system for detecting malicious files. The method includes executing a sample in a virtual environment, and determining whether the sample is malware based at least in part on memory-use artifacts obtained in connection with execution of the sample in the virtual environment.

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: one or more processors configured to: execute a sample in a virtual environment; determine whether the sample is malicious based at least in part on memory-use artifacts obtained in connection with execution of the sample in the virtual environment, wherein: determining whether the sample is malicious is based at least in part on a machine learning classifier; the machine learning classifier uses a feature of application programming interface (API) vectors; the API vectors are determined based at least in part on determining a set of API pointers for which the sample allocates memory, and determining based on the set of API pointers a set of contiguous API vectors; and the feature of API vectors is based at least in part on a contiguous list of pointers in memory; and a memory coupled to the one or more processors and configured to provide one or more processors with instructions.

2. The system of claim 1, wherein the machine learning classifier uses a combined feature vector to detect malware.

3. The system of claim 2, wherein the combined feature vector is based at least in part on one or more of (i) a feature vector of application programming interface (API) pointers, (ii) a feature vector of Operating System (OS) structure modifications, (iii) a feature vector of page permissions modifications, and (iv) a feature of API vectors.

4. The system of claim 3, wherein the combined feature vector is a concatenation of (i) the feature vector of application programming interface (API) pointers, (ii) the feature vector of Operating System (OS) structure modifications, (iii) the feature vector of page permissions modifications, and (iv) the feature of API vectors.

5. The system of claim 1, wherein the machine learning classifier uses a feature vector of application programming interface (API) pointers to detect malware.

6. The system of claim 1, wherein the machine learning classifier uses a feature vector of Operating System (OS) structure modifications to detect malware.

7. The system of claim 1, wherein the machine learning classifier uses a feature vector of page permissions modifications to detect malware.

8. The system of claim 1, wherein the machine learning model is based on an XGBoost machine learning classifier model.

9. The system of claim 1, wherein the one or more processors are further configured to monitor a behavior of the sample during execution of the sample in the virtual environment.

10. The system of claim 9, wherein to monitor the behavior of the sample includes monitoring modification made in the virtual environment during execution.

11. The system of claim 9, wherein to monitor the behavior of the sample comprises a dynamic analysis of an execution of the sample.

12. The system of claim 1, wherein the one or more processors are further configured to receive the sample.

13. The system of claim 1, wherein the one or more processors are further configured to: send, to a security entity, an indication that the sample is malicious.

14. The system of claim 1, wherein the one or more processors are further configured to: enforce one or more security policies based on a determination of whether the sample is malicious.

15. The system of claim 1, wherein the sample performs one or more anti-emulation or dynamic analysis evasion techniques.

16. The system of claim 1, wherein the one or more processors are further configured to: in response to determining the sample is malicious, update a blacklist of SQL or command injection strings that are deemed to be malicious, the blacklist of SQL or command injection strings being updated to include an identifier corresponding to the sample.

17. The system of claim 1, wherein the feature of API vectors comprises information for a plurality of API vectors.

18. The system of claim 1, wherein the feature of API vectors comprises information pertaining to a plurality of API vectors within a predefined proximity of each other in the memory.

19. The system of claim 1, wherein the feature of API vectors is determined based at least in part on a set of tuples for the APIs for which memory is allocated during execution of the sample, and each tuple comprises information pertaining to an API pointer and information pertaining to a memory location for the API pointer.

20. A method, comprising: executing a sample in a virtual environment; and determining whether the sample is malicious based at least in part on memory-use artifacts obtained in connection with execution of the sample in the virtual environment, wherein: determining whether the sample is malicious is based at least in part on a machine learning classifier; the machine learning classifier uses a feature of application programming interface (API) vectors; the API vectors are determined based at least in part on determining a set of API pointers for which the sample allocates memory, and determining based on the set of API pointers a set of contiguous API vectors; and the feature of API vectors is based at least in part on a contiguous list of pointers in memory.

21. A computer program product embodied in a non-transitory computer readable medium and comprising computer instructions for: executing a sample in a virtual environment; and determining whether the sample is malicious based at least in part on memory-use artifacts obtained in connection with execution of the sample in the virtual environment, wherein: determining whether the sample is malicious is based at least in part on a machine learning classifier; the machine learning classifier uses a feature of application programming interface (API) vectors; the API vectors are determined based at least in part on determining a set of API pointers for which the sample allocates memory, and determining based on the set of API pointers a set of contiguous API vectors; and the feature of API vectors is based at least in part on a contiguous list of pointers in memory.

22. A system, comprising: one or more processors configured to: execute a sample in a virtual environment; determine whether the sample is malicious based at least in part on memory-use artifacts obtained in connection with execution of the sample in the virtual environment, wherein: determining whether the sample is malicious is based at least in part on a machine learning classifier; the machine learning classifier uses a feature of application programming interface (API) vectors; the feature of API vectors is based at least in part on a contiguous list of pointers in memory; and the feature of API vectors comprises information pertaining to a plurality of API vectors within a predefined proximity of each other in the memory; and a memory coupled to the one or more processors and configured to provide one or more processors with instructions.

23. A method, comprising: executing a sample in a virtual environment; and determining whether the sample is malicious based at least in part on memory-use artifacts obtained in connection with execution of the sample in the virtual environment, wherein: determining whether the sample is malicious is based at least in part on a machine learning classifier; the machine learning classifier uses a feature of application programming interface (API) vectors; the feature of API vectors is based at least in part on a contiguous lis
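
Claims 1, 18, and 19 describe deriving "contiguous API vectors" from tuples that pair an API pointer with its memory location. The claim text shown here does not say how the grouping is done, so the following is only a minimal sketch under stated assumptions: pointers are 8 bytes wide (64-bit guest), a vector is a run of at least two pointers whose addresses are no more than `max_gap` bytes apart (standing in for the "predefined proximity" of claim 18), and the names `group_api_vectors`, `max_gap`, `min_len`, and the sample addresses are all hypothetical.

```python
from typing import List, Optional, Tuple

POINTER_SIZE = 8  # assumption: 64-bit guest, so adjacent pointer slots are 8 bytes apart

# (memory location where the pointer is stored, name of the API it points to)
ApiPointer = Tuple[int, str]


def group_api_vectors(pointers: List[ApiPointer],
                      max_gap: int = POINTER_SIZE,
                      min_len: int = 2) -> List[List[str]]:
    """Group observed API pointers into runs that sit close together in memory.

    Hypothetical helper: a run is closed whenever the next pointer is more
    than `max_gap` bytes past the previous one; runs shorter than `min_len`
    are dropped as isolated pointers.
    """
    vectors: List[List[str]] = []
    current: List[str] = []
    prev_addr: Optional[int] = None
    for addr, api in sorted(pointers):  # walk pointers in address order
        if prev_addr is not None and addr - prev_addr > max_gap:
            if len(current) >= min_len:
                vectors.append(current)
            current = []
        current.append(api)
        prev_addr = addr
    if len(current) >= min_len:
        vectors.append(current)
    return vectors


# Made-up tuples standing in for artifacts collected during sample execution.
observed = [
    (0x1000, "VirtualAlloc"),
    (0x1008, "WriteProcessMemory"),
    (0x1010, "CreateRemoteThread"),
    (0x5000, "Sleep"),  # isolated pointer, not part of any vector
    (0x9000, "LoadLibraryA"),
    (0x9008, "GetProcAddress"),
]
api_vectors = group_api_vectors(observed)
```

With the sample data, two vectors come out: one of three injection-related APIs and one of two loader APIs. Raising `max_gap` loosens the proximity requirement, which is one plausible way to read claim 18.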

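Claim 4 recites a combined feature vector formed by concatenating four per-artifact features. As a rough illustration only, the sketch below encodes each artifact type as occurrence counts over a fixed vocabulary and joins them in a fixed order. The count encoding, the vocabularies, and the names `count_vector` and `combine` are assumptions; the claim text shown here does not specify an encoding. Claim 8 names XGBoost as the classifier; that step is left out so the example stays dependency-free.

```python
from typing import Dict, List, Sequence


def count_vector(events: Sequence[str], vocabulary: Sequence[str]) -> List[float]:
    """Count how often each vocabulary entry appears among the observed events."""
    counts: Dict[str, int] = {}
    for event in events:
        counts[event] = counts.get(event, 0) + 1
    return [float(counts.get(name, 0)) for name in vocabulary]


def combine(*parts: Sequence[float]) -> List[float]:
    """Concatenate the per-artifact vectors in the order they are passed."""
    combined: List[float] = []
    for part in parts:
        combined.extend(part)
    return combined


# Hypothetical vocabularies; a real system would fix these at training time.
API_VOCAB = ["VirtualAlloc", "WriteProcessMemory", "CreateRemoteThread"]
OS_STRUCT_VOCAB = ["PEB", "TEB"]
PAGE_PERM_VOCAB = ["RW->RX", "RW->RWX"]
API_VECTOR_VOCAB = ["VirtualAlloc|WriteProcessMemory|CreateRemoteThread"]

# Made-up observations for one sample.
api_pointer_fv = count_vector(
    ["VirtualAlloc", "VirtualAlloc", "CreateRemoteThread"], API_VOCAB)
os_struct_fv = count_vector(["PEB"], OS_STRUCT_VOCAB)
page_perm_fv = count_vector(["RW->RX", "RW->RX", "RW->RWX"], PAGE_PERM_VOCAB)
api_vector_fv = count_vector(
    ["VirtualAlloc|WriteProcessMemory|CreateRemoteThread"], API_VECTOR_VOCAB)

# Order follows claim 4: (i) API pointers, (ii) OS structure modifications,
# (iii) page permissions modifications, (iv) API vectors.
combined_fv = combine(api_pointer_fv, os_struct_fv, page_perm_fv, api_vector_fv)
```

Keeping the concatenation order fixed matters because a tree-based classifier treats each position as a distinct feature; the resulting list could then be passed as one row to whatever classifier is trained on these features.
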
Assignees

Palo Alto Networks Inc

Inventors

Classifications

  • by executing in a restricted environment, e.g. sandbox or secure virtual machine

  • Machine learning

  • Test or assess software

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

  • Ensemble learning

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12561434B2 cover?
The present application discloses a method, system, and computer system for detecting malicious files. The method includes executing a sample in a virtual environment, and determining whether the sample is malware based at least in part on memory-use artifacts obtained in connection with execution of the sample in the virtual environment.
Who is the assignee on this patent?
Palo Alto Networks Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/566. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 24, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).