Detecting Microsoft Windows installer malware using text classification models

US12197574B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12197574-B2
Application number  US-202117550420-A
Country             US
Kind code           B2
Filing date         Dec 14, 2021
Priority date       Dec 14, 2021
Publication date    Jan 14, 2025
Grant date          Jan 14, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present application discloses a method, system, and computer system for detecting malicious files. The method includes receiving a sample, extracting an embedded script from the sample, applying a malicious script detector in connection with determining whether the sample is malicious, and in response to determining that the sample is malicious sending, to a security entity, an indication that the sample is malicious.
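
The four steps named in the abstract (receive, extract, detect, notify) can be read as a simple pipeline. The sketch below is only an illustration of that control flow, not the patented implementation: `analyze_sample`, `Indication`, and the three callables passed in are hypothetical names, and the extraction and detection logic is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Indication:
    """Hypothetical message sent to a security entity (e.g. a firewall)."""
    sample_id: str
    verdict: str


def analyze_sample(
    sample_id: str,
    sample: bytes,
    extract_script: Callable[[bytes], Optional[str]],  # assumed extractor
    detector: Callable[[str], bool],                   # assumed script detector
    notify: Callable[[Indication], None],              # assumed notifier
) -> bool:
    """Return True if the sample was deemed malicious and reported."""
    # Step 1: the sample has been received (passed in as bytes).
    # Step 2: extract the embedded script; None means nothing was found.
    script = extract_script(sample)
    if script is None:
        return False
    # Step 3: apply the malicious script detector.
    if detector(script):
        # Step 4: send an indication to the security entity.
        notify(Indication(sample_id, "malicious"))
        return True
    return False
```

Passing the extractor, detector, and notifier as parameters keeps the sketch independent of any particular unpacker, model, or firewall interface, none of which are specified on this page.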

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: one or more processors configured to: receive a sample, wherein the sample is a Microsoft Windows Portable Executable (PE) file; extract an embedded script from the sample, wherein the embedded script is an installer script, wherein the installer script is extracted in a sandboxed environment; determine one or more features based at least in part on the installer script; apply a malicious script detector in connection with determining whether the sample is malicious, wherein the malicious script detector determines a sample classification based at least in part on querying a machine learning model based at least in part on the one or more features; and in response to determining that the sample is malicious, send, to a security entity, an indication that the sample is malicious; and a memory coupled to the one or more processors and configured to provide the one or more processors with instructions.

2. The system of claim 1, wherein extracting the embedded script comprises: extracting the installer script from the sample; and decompiling the installer script to obtain code corresponding to the installer script.

3. The system of claim 1, wherein applying malicious script detector in connection with determining whether the embedded script is malicious comprises: applying the machine learning model to determine whether the embedded script is malicious.

4. The system of claim 3, wherein the machine learning model is trained based at least in part on one or more code samples that have been deemed to be malicious.

5. The system of claim 2, wherein the applying the machine learning model to determine whether the embedded script is malicious comprises: analyzing code corresponding to the embedded script to determine whether the code comprises one or more elements that are indicative of malicious code.

6. The system of claim 5, wherein the machine learning model is trained to learn the one or more elements that are indicative of malicious code.

7. The system of claim 5, wherein analyzing code corresponding to the embedded script to determine whether the code comprises one or more elements that are indicative of malicious code comprises: applying a text classification machine learning model prediction in connection with detecting, in the code, the one or more elements that are indicative of malicious code.

8. The system of claim 1, wherein the machine learning model is trained based at least in part on one or more attributes associated with code samples previously deemed to be malicious.

9. The system of claim 1, wherein: applying the malicious script detector in connection with determining whether the sample is malicious comprises: determining a likelihood that code corresponding to the embedded script is malicious; and the sample is deemed to be malicious in response to a determination that the likelihood that the code corresponding to the embedded script is malicious is greater than a likelihood threshold value.

10. The system of claim 9, wherein the likelihood that the code corresponding to the embedded script is malicious is determined based at least in part on a degree of similarity between the code and one or more other malicious code samples.

11. The system of claim 1, wherein the security entity enforces one or more security policies with respect to the sample based at least in part on the indication that the sample is malicious.

12. The system of claim 11, wherein the one or more security policies are configured based at least in part on a customer setting.

13. The system of claim 11, wherein the security entity blocks traffic comprising the sample in response to receiving the indication that the sample is malicious.

14. The system of claim 1, wherein the security entity is a firewall.

15. The system of claim 1, wherein a signature or file hash corresponding to the sample is sent to the security entity in connection with sending the indication that the sample is malicious.

16. The system of claim 1, wherein a result of the malicious script detector is used as a factor with one or more other factors to determine whether the sample is malicious.

17. The system of claim 1, wherein one or more factors used in connection with malicious script detector determining that the code corresponding to the embedded script is malicious comprises: a call to an executable file or to a cryptocurrency wallet.

18. A method, comprising: receiving, by one or more processors, a sample, wherein the sample is a Microsoft Windows Portable Executable (PE) file; extracting an embedded script from the sample, wherein the embedded script is an installer script, wherein the installer script is extracted in a sandboxed environment; determining one or more features based at least in part on the installer script; applying a malicious script detector in connection with determining whether the sample is malicious, wherein the malicious script detector determines a sample classification based at least in part on querying a machine learning model based at least in part on the one or more features; and in response to determining that the sample is malicious, sending, to a security entity, an indication that the sample is malicious.

19. A computer program product embodied in a non-transitory computer readable medium and comprising computer instructions for: receiving, by one or more processors, a sample, wherein the sample is a Microsoft Windows Portable Executable (PE) file; extracting an embedded script from the sample, wherein the embedded script is an installer script, wherein the installer script is extracted in a sandboxed environment; determining one or more features based at least in part on the installer script; applying a malicious script detector in connection with determining whether the sample is malicious, wherein the malicious script detector determines a sample classification based at least in part on querying a machine learning model based at least in part on the one or more features; and in response to determining that the sample is malicious, sending, to a security entity, an indication that the sample is malicious.

20. The system of claim 1, wherein applying malicious script detector in connection with determining whether the embedded script is malicious comprises: using the machine learning model to perform text classification on the installer script; and determining whether the sample is malicious based at least in part on the text classification.

21. The system of claim 1, wherein the one or more processors are further configured to determine that the sample is malicious based at least in part on one or more of (i) an executable called in the extracted installer script, (ii) a cryptocurrency wallet called in the extracted installer script, and (iii) an alphanumeric string comprised in the extracted installer script.

22. The system of claim 1, wherein the one or more processors are further configured to determine that the sample is malicious based at least in part on the installer script and a PE file structure for the sample.
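
Claims 9, 17, and 21 mention a likelihood compared against a threshold and three kinds of script features (called executables, cryptocurrency wallets, alphanumeric strings). The following is a minimal sketch of how such features and a threshold could fit together, under loud assumptions: the regular expressions, the hand-set `WEIGHTS` and `BIAS`, and the logistic scoring are all invented for illustration and stand in for the trained machine learning model, which this page does not describe.

```python
import math
import re

# Hypothetical patterns; the patent does not specify how features are detected.
EXE_RE = re.compile(r"\b[\w.-]+\.exe\b", re.IGNORECASE)      # executable names
WALLET_RE = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")  # Base58-like strings
LONG_ALNUM_RE = re.compile(r"\b[A-Za-z0-9]{40,}\b")           # long alphanumeric runs

# Made-up weights standing in for a trained model.
WEIGHTS = {"exe_calls": 0.8, "wallet_like": 2.5, "long_alnum": 1.0}
BIAS = -2.0


def extract_features(script: str) -> dict:
    """Count occurrences of each hypothetical indicator in the script text."""
    return {
        "exe_calls": len(EXE_RE.findall(script)),
        "wallet_like": len(WALLET_RE.findall(script)),
        "long_alnum": len(LONG_ALNUM_RE.findall(script)),
    }


def likelihood(features: dict) -> float:
    """Map feature counts to a score in (0, 1) with a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def is_malicious(script: str, threshold: float = 0.5) -> bool:
    """Deem the script malicious when the likelihood exceeds the threshold."""
    return likelihood(extract_features(script)) > threshold
```

In a real text classification system the weights would be learned from labeled installer scripts rather than set by hand, and the threshold would be tuned against an acceptable false-positive rate.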

Assignees

Palo Alto Networks Inc

Inventors

Classifications

  • by virus signature recognition (CPC title)

  • G06F21/54 (Primary): by adding security routines or objects to programs (CPC title)

  • involving event detection and direct action (CPC title)

  • G06F21/566 (Primary): Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities (CPC title)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12197574B2 cover?
The present application discloses a method, system, and computer system for detecting malicious files. The method includes receiving a sample, extracting an embedded script from the sample, applying a malicious script detector in connection with determining whether the sample is malicious, and in response to determining that the sample is malicious sending, to a security entity, an indication t…
Who is the assignee on this patent?
Palo Alto Networks Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/54. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 14, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).