Feature vector aggregation for malware detection

US2019171816A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2019171816-A1
Application number	US-201715832832-A
Country	US
Kind code	A1
Filing date	Dec 6, 2017
Priority date	Dec 6, 2017
Publication date	Jun 6, 2019
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, apparatus and product performing feature vector aggregation for malware detection. Two sets of measurements produced by a two dynamic analyses of an examined program are obtained, wherein the two dynamic analyses are performed with respect to the examined program executing two different execution paths. An aggregated feature vector representing the examined program is generated. The aggregated feature vector comprises a set of aggregated features, wherein a value of each aggregated feature is based on an aggregation of corresponding measurements in the first set of measurements and in the second set of measurements. A predictive model is applied on the aggregated feature vector to classify the examined program as malicious or benign.
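The flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patent's implementation: the feature names (`uris`, `anti_debug_calls`), the aggregation rules (union for sets, sum for numbers), and the `toy_model` stand-in for the predictive model are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Set, Union

Measurement = Union[int, float, Set[str]]
Measurements = Dict[str, Measurement]


def aggregate(runs: List[Measurements]) -> Measurements:
    """Combine per-execution-path measurements into one aggregated vector.

    Assumed rule (not taken from the patent text): set-valued measurements
    are unioned and numeric measurements are summed across paths.
    """
    aggregated: Measurements = {}
    for run in runs:
        for name, value in run.items():
            if name not in aggregated:
                # Copy sets so the caller's measurements are never mutated.
                aggregated[name] = set(value) if isinstance(value, set) else value
            elif isinstance(value, set):
                aggregated[name] = aggregated[name] | value
            else:
                aggregated[name] = aggregated[name] + value
    return aggregated


def classify(vector: Measurements, model: Callable[[Measurements], bool]) -> str:
    """Apply a predictive model to the aggregated vector."""
    return "malicious" if model(vector) else "benign"


def toy_model(vector: Measurements) -> bool:
    # Hypothetical hand-written rule standing in for a trained model.
    return len(vector.get("uris", set())) >= 2 or vector.get("anti_debug_calls", 0) > 3


# Two dynamic analyses of the same program along different execution paths.
path_a = {"uris": {"http://a.example/x"}, "anti_debug_calls": 1}
path_b = {"uris": {"http://b.example/y"}, "anti_debug_calls": 0}

vector = aggregate([path_a, path_b])
label = classify(vector, toy_model)
```

In this toy setup neither path alone trips the rule, while the aggregated vector does, which is the motivation the abstract gives for aggregating across execution paths.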

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: obtaining a first set of measurements produced by a first dynamic analysis of an examined program, wherein the first dynamic analysis is performed with respect to the examined program executing a first execution path; obtaining a second set of measurements produced by a second dynamic analysis of the examined program, wherein the second dynamic analysis is performed with respect to the examined program executing a second execution path, wherein the first and second execution paths are different; generating an aggregated feature vector representing the examined program, wherein the aggregated feature vector comprises a set of aggregated features, wherein a value of each aggregated feature is based on an aggregation of corresponding measurements in the first set of measurements and in the second set of measurements; and applying a predictive model on the aggregated feature vector to classify the examined program as malicious or benign.
2. The method of claim 1 further comprises performing the first dynamic analysis and the second dynamic analysis.
3. The method of claim 1 further comprises training the predictive model, wherein said training comprises: performing multiple dynamic analysis on each labeled program in a training set, wherein the training set comprises labeled programs, wherein each labeled program is labeled as malicious or benign, wherein labeled programs having a malicious label exhibit malicious functionality in a subset of execution paths thereof; generating for each labeled program the aggregated feature vector, whereby obtaining labeled aggregated feature vectors; and training the predictive model using the labeled aggregated feature vectors.
4. The method of claim 1, wherein the set of aggregated features comprises an aggregated Uniform Resource Identifier (URI) feature, wherein the aggregated URI feature comprises a list of all URIs contacted by the examined program in any of the execution paths.
5. The method of claim 1, wherein the set of aggregated features comprises an aggregated created file feature, wherein the aggregated created file feature comprises a list of all files created by the examined program in any of the execution paths.
6. The method of claim 1, wherein the set of aggregated features comprises an aggregated functionality count feature, wherein the aggregated functionality count feature comprises a count of a number utilizations of a functionality identified in all of the execution paths.
7. The method of claim 6, wherein the functionality is an anti-debug pattern.
8. The method of claim 6, wherein the functionality is an invocation of an Application Programming Interface (API) function of an Operating System (OS) or of an hardware instruction.
9. The method of claim 1, wherein the set of aggregated features comprises an aggregated memory entropy feature, wherein the aggregated memory entropy feature indicates an aggregated entropy of dynamically created memory during execution in all of the execution paths.
10. The method of claim 1, wherein the set of aggregated features comprises a maximal string deviation feature, wherein the maximal string deviation feature comprises a maximal deviation of strings that are dynamically created during execution in all of the execution paths.
11. A computer program product comprising a computer readable storage medium retaining program instructions, which program instructions when read by a processor, cause the processor to perform a method comprising: obtaining a first set of measurements produced by a first dynamic analysis of an examined program, wherein the first dynamic analysis is performed with respect to the examined program executing a first execution path; obtaining a second set of measurements produced by a second dynamic analysis of the examined program, wherein the second dynamic analysis is performed with respect to the examined program executing a second execution path, wherein the first and second execution paths are different; generating an aggregated feature vector representing the examined program, wherein the aggregated feature vector comprises a set of aggregated features, wherein a value of each aggregated feature is based on an aggregation of corresponding measurements in the first set of measurements and in the second set of measurements; and applying a predictive model on the aggregated feature vector to classify the examined program as malicious or benign.
12. The computer program product of claim 11, wherein the method further comprises performing the first dynamic analysis and the second dynamic analysis.
13. The computer program product of claim 11, wherein the method further comprises training the predictive model, wherein said training comprises: performing multiple dynamic analysis on each labeled program in a training set, wherein the training set comprises labeled programs, wherein each labeled program is labeled as malicious or benign, wherein labeled programs having a malicious label exhibit malicious functionality in a subset of execution paths thereof; generating for each labeled program the aggregated feature vector, whereby obtaining labeled aggregated feature vectors; and training the predictive model using the labeled aggregated feature vectors.
14. The computer program product of claim 11, wherein the set of aggregated features comprises an aggregated Uniform Resource Identifier (URI) feature, wherein the aggregated URI feature comprises a list of all URIs contacted by the examined program in any of the execution paths.
15. The computer program product of claim 11, wherein the set of aggregated features comprises an aggregated created file feature, wherein the aggregated created file feature comprises a list of all files created by the examined program in any of the execution paths.
16. The computer program product of claim 11, wherein the set of aggregated features comprises an aggregated functionality count feature, wherein the aggregated functionality count feature comprises a count of a number utilizations of a functionality identified in all of the execution paths.
17. The computer program product of claim 16, wherein the functionality is selected from the group consisting of an anti-debug pattern, an invocation of an Application Programming Interface (API) function of an Operating System (OS) or of an hardware instruction.
18. The computer program product of claim 11, wherein the set of aggregated features comprises an aggregated memory entropy feature, wherein the aggregated memory entropy feature indicates an aggregated entropy of dynamically created memory during execution in all of the execution paths.
19. The computer program product of claim 11, wherein the set of aggregated features comprises a maximal string deviation feature, wherein the maximal string deviation feature comprises a maximal deviation of strings that are dynamically created during execution in all of the execution paths.
20. A computerized apparatus having a processor, the processor being adapted to perform the steps of: obtaining a first set of measurements produced by a first dynamic analysis of an examined program, wherein the first dynamic analysis is performed with respect to the examined program executing a first execution path; obtaining a second set of measurements produced by a second dynamic analysis of the examined program, wherein the second dynamic analysis is performed with respect t
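Several dependent claims name concrete aggregated features: the URI list (claim 4), the created-file list (claim 5), the functionality count (claim 6), and the memory entropy (claim 9). A minimal sketch of how such features might be computed is shown here. The `PathReport` structure, the field names, and the choice of Shannon entropy over the concatenated memory of all paths are assumptions for illustration; the claim text shown on this page does not define how the entropy is aggregated.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PathReport:
    """Hypothetical per-execution-path output of a dynamic analysis."""
    uris: List[str]
    created_files: List[str]
    api_calls: Dict[str, int]   # functionality name -> times used on this path
    dynamic_memory: bytes       # bytes of dynamically created memory


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte; 0.0 for empty input."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def aggregate_reports(reports: List[PathReport]) -> Dict[str, object]:
    """Build aggregated features across all execution paths."""
    # Claims 4 and 5: everything contacted or created on any path.
    uris = sorted({u for r in reports for u in r.uris})
    files = sorted({f for r in reports for f in r.created_files})

    # Claim 6: total uses of each functionality over all paths.
    calls: Counter = Counter()
    for r in reports:
        calls.update(r.api_calls)

    # Claim 9 (assumed interpretation): entropy of all paths' memory combined.
    memory = b"".join(r.dynamic_memory for r in reports)

    return {
        "uris": uris,
        "created_files": files,
        "api_call_counts": dict(calls),
        "memory_entropy": shannon_entropy(memory),
    }


r1 = PathReport(["http://a.example/"], ["tmp1.bin"],
                {"IsDebuggerPresent": 2}, b"abab")
r2 = PathReport(["http://a.example/", "http://b.example/"], ["tmp2.bin"],
                {"IsDebuggerPresent": 1, "CreateFileW": 1}, b"abab")
features = aggregate_reports([r1, r2])
```

Other readings of "aggregated entropy" are possible, for example the maximum or mean of per-path entropies; swapping that in would only change the last step of `aggregate_reports`.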

Assignees

IBM

Inventors

Classifications

G06F21/566 · Primary
Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities · CPC title
G06F2221/033
Test or assess software · CPC title
G06N20/00
Machine learning · CPC title
G06N5/04
Inference or reasoning models · CPC title
G06N99/005
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 66658072

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019171816A1 cover?

A method, apparatus and product performing feature vector aggregation for malware detection. Two sets of measurements produced by a two dynamic analyses of an examined program are obtained, wherein the two dynamic analyses are performed with respect to the examined program executing two different execution paths. An aggregated feature vector representing the examined program is generated. The a…

Who is the assignee on this patent?

IBM

What technology area does this patent fall under?

Primary CPC classification G06F21/566. Mapped technology areas include Physics.

When was this patent published?

Publication date Jun 6, 2019 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).