Systems and methods for selecting client backup files for maliciousness analysis

US2024220619A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2024220619-A1
Application number  US-202218148193-A
Country             US
Kind code           A1
Filing date         Dec 29, 2022
Priority date       Dec 29, 2022
Publication date    Jul 4, 2024
Grant date          (none listed)

Abstract

Official abstract text for this publication.

Disclosed herein are systems and methods for selecting files for malware analysis. In one aspect, a method may include identifying, in a cloud network, a backup of a client machine; extracting, from the backup, at least one file of a given file type; determining whether to include the at least one file in a sandbox of the cloud network by performing a static analysis of the at least one file; selecting the at least one file for inclusion in the sandbox based on the static analysis; monitoring, for a period of time, a behavior of the at least one file in the sandbox by performing a dynamic analysis of the at least one file; and in response to determining that the at least one file is malicious based on the dynamic analysis, performing a remediation action on the at least one file.
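
The flow described in the abstract can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the implementation disclosed in the patent: `BackupFile`, `static_score`, `run_in_sandbox`, and `remediate` are hypothetical names, and the default threshold of 0.5 is an arbitrary placeholder.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class BackupFile:
    """Hypothetical stand-in for a file extracted from a client backup."""
    path: str
    data: bytes


def triage_backup(
    files: Iterable[BackupFile],
    static_score: Callable[[BackupFile], float],
    run_in_sandbox: Callable[[BackupFile], bool],
    remediate: Callable[[BackupFile], None],
    threshold: float = 0.5,  # assumed value; the abstract gives no number
) -> List[str]:
    """Return the paths of files that were remediated."""
    remediated = []
    for f in files:
        # Static analysis: only files scoring above the threshold enter the sandbox.
        if static_score(f) <= threshold:
            continue
        # Dynamic analysis: the caller-supplied sandbox runner returns True if malicious.
        if run_in_sandbox(f):
            remediate(f)
            remediated.append(f.path)
    return remediated
```

The static step acts as a cheap filter so that the more expensive sandbox run is spent only on files that look suspicious.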

First claim

Opening claim text (preview).

1. A method for selecting files for malware analysis, the method comprising: identifying, in a cloud network, a backup of a client machine; extracting, from the backup, at least one file of a given file type; determining whether to include the at least one file in a sandbox of the cloud network by performing a static analysis of the at least one file, wherein the static analysis comprises determining a likelihood of the at least one file being malicious and comparing the likelihood to a threshold likelihood; in response to determining that the likelihood exceeds the threshold likelihood, selecting the at least one file for inclusion in the sandbox, wherein the sandbox is a software environment that isolates the at least one file from other files in the backup; monitoring, for a period of time, a behavior of the at least one file in the sandbox by performing a dynamic analysis of the at least one file, wherein the dynamic analysis comprises classifying a given file as malicious or non-malicious; and in response to determining that the at least one file is malicious based on the dynamic analysis, performing a remediation action on the at least one file.

2. The method of claim 1, wherein the remediation action comprises one or more of: removing the at least one file from the backup; quarantining the at least one file in the backup; removing the at least one file from the client machine; and quarantining the at least one file in the client machine and triggering a threat investigation process.

3. The method of claim 1, wherein the given file type is one of a script and an executable, wherein the threshold likelihood is different for each file type, and wherein the given file type is determined using one or more of magic numbers, file name extensions, rules, heuristics, inferences of a trained machine learning algorithm.

4. The method of claim 1, wherein determining the likelihood of the at least one file being malicious comprises: determining whether one or more rules of a plurality of rules indicate that the at least one file is malicious; and calculating the likelihood based on a weight of the one or more rules.

5. The method of claim 4, wherein the plurality of rules query features present in the at least one file and classify whether the features are associated with potentially-malicious files, wherein the features include whether the at least one file was downloaded from a blacklisted website, whether the at least one file has been modified a threshold number of times in a small period of time, whether the at least one file has changed a system setting, whether a hash value of the at least one file matches a known malware hash.

6. The method of claim 1, wherein determining the likelihood of the at least one file being malicious comprises: executing a first machine learning algorithm configured to output a classification of whether the at least one file is to be included in the sandbox and a confidence score of the classification, wherein the confidence score is the likelihood, wherein the first machine learning algorithm is trained to make the classification based on features including opcodes, byte sequences, PE header, and file size of input files.

7. The method of claim 1, wherein classifying the given file as malicious or non-malicious comprises executing a malware scanner that compares the at least one file to virus definitions.

8. The method of claim 1, wherein classifying the given file as malicious or non-malicious comprises executing a second machine learning algorithm configured to classify the at least one file as malicious or non-malicious, wherein the second machine learning algorithm is trained based on a training dataset comprising features of labelled files in a plurality of sandboxes.

9. The method of claim 1, wherein in response to determining that the at least one file is not classified as malicious over the period of time based on the dynamic analysis, removing the at least one file from the sandbox.

10. The method of claim 1, wherein in response to determining that the at least one file is not classified as malicious over the period of time based on the dynamic analysis, dissolving the sandbox.

11. The method of claim 1, further comprising generating a different sandbox for each backup in a backup archive, wherein the different sandbox is dissolved after a predetermined period of time.

12. The method of claim 1, wherein the threshold likelihood is adjusted based on an amount of resources in the cloud network.

13. The method of claim 1, wherein the sandbox includes files originating from backups of a plurality of client machines.

14. A system for selecting files for malware analysis, comprising: a memory; and a hardware processor communicatively coupled with the memory and configured to: identify, in a cloud network, a backup of a client machine; extract, from the backup, at least one file of a given file type; determine whether to include the at least one file in a sandbox of the cloud network by performing a static analysis of the at least one file, wherein the static analysis comprises determining a likelihood of the at least one file being malicious and comparing the likelihood to a threshold likelihood; in response to determining that the likelihood exceeds the threshold likelihood, select the at least one file for inclusion in the sandbox, wherein the sandbox is a software environment that isolates the at least one file from other files in the backup; monitor, for a period of time, a behavior of the at least one file in the sandbox by performing a dynamic analysis of the at least one file, wherein the dynamic analysis comprises classifying a given file as malicious or non-malicious; and in response to determining that the at least one file is malicious based on the dynamic analysis, perform a remediation action on the at least one file.

15. The system of claim 14, wherein the remediation action comprises one or more of: removing the at least one file from the backup; quarantining the at least one file in the backup; removing the at least one file from the client machine; and quarantining the at least one file in the client machine and triggering a threat investigation.

16. The system of claim 14, wherein the given file type is one of a script and an executable, wherein the threshold likelihood is different for each file type, and wherein the given file type is determined using one or more of magic numbers, file name extensions, rules, heuristics, inferences of a trained machine learning algorithm.

17. The system of claim 14, wherein the hardware processor is configured to determine the likelihood of the at least one file being malicious by: determining whether one or more rules of a plurality of rules indicate that the at least one file is malicious; and calculating the likelihood based on a weight of the one or more rules.

18. The system of claim 17, wherein the plurality of rules query features present in the at least one file and classify whether the features are associated with potentially-malicious files, wherein the features include whether the at least one file was downloaded from a blacklisted website, whether the at least one file has been modified a threshold number of times in a small period of time, whether the at least one file has changed a system setting, whether a hash value of the at least one file matches a known malware hash.

19. The system of claim 14, wherein the hardware processor is configured to d
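
The claims on file-type detection, weighted rules, and per-type thresholds (claims 3 to 5 and their system counterparts) can be illustrated with a short sketch. Everything specific here is an assumption for illustration: the rule names, weights, threshold values, the metadata keys, and the normalization (matched weight divided by total weight) are hypothetical, since the claims do not fix any of them. The magic-number prefixes are the commonly known `MZ` (Windows/DOS executable), `\x7fELF` (ELF executable), and `#!` (script shebang).

```python
from typing import Callable, Dict, List, Optional, Tuple

# Well-known leading bytes; a real detector would cover many more formats.
MAGIC = [(b"MZ", "executable"), (b"\x7fELF", "executable"), (b"#!", "script")]

# Hypothetical thresholds: the claims only say they differ per file type.
THRESHOLDS = {"executable": 0.4, "script": 0.6}

Rule = Tuple[str, float, Callable[[Dict], bool]]

# Hypothetical rules and weights, loosely following the features listed in claim 5.
RULES: List[Rule] = [
    ("known_malware_hash", 0.5, lambda m: m.get("hash") in m.get("bad_hashes", set())),
    ("blacklisted_source", 0.3, lambda m: m.get("from_blacklisted_site", False)),
    ("rapid_modification", 0.2, lambda m: m.get("recent_modifications", 0) >= 5),
]


def file_type(data: bytes) -> Optional[str]:
    """Classify a file by its leading magic bytes, or return None if unknown."""
    for prefix, kind in MAGIC:
        if data.startswith(prefix):
            return kind
    return None


def likelihood(meta: Dict) -> float:
    """Weighted share of rules that fire (an assumed scoring scheme)."""
    total = sum(w for _, w, _ in RULES)
    matched = sum(w for _, w, pred in RULES if pred(meta))
    return matched / total


def select_for_sandbox(data: bytes, meta: Dict) -> bool:
    """Select the file only if its score exceeds the threshold for its type."""
    kind = file_type(data)
    if kind is None:
        return False
    return likelihood(meta) > THRESHOLDS[kind]
```

With these assumed numbers, a file whose hash matches a known malware hash scores 0.5, which clears the executable threshold but not the stricter script threshold. Claim 12 suggests the thresholds could also be raised or lowered as cloud resources change; that adjustment is not modeled here.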

Assignees

Inventors

Classifications

  • by runtime analysis (performance monitoring G06F11/3466)

  • Analysis of software for verifying properties of programs (testing of software G06F11/3668)

  • Threshold

  • Management of the data involved in backup or backup restore

  • Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024220619A1 cover?
Disclosed herein are systems and methods for selecting files for malware analysis. In one aspect, a method may include identifying, in a cloud network, a backup of a client machine; extracting, from the backup, at least one file of a given file type; determining whether to include the at least one file in a sandbox of the cloud network by performing a static analysis of the at least one file; s…
Who is the assignee on this patent?
Acronis Int Gmbh
What technology area does this patent fall under?
Primary CPC classification G06F21/565. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 4, 2024 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).