What technology area does this patent fall under?

Primary CPC classification H04L63/1416. Mapped technology areas include Electricity.

When was this patent published?

Publication date Apr 28, 2020 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Container file analysis using machine learning model

US10637874B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10637874-B2
Application number	US-201615345444-A
Country	US
Kind code	B2
Filing date	Nov 7, 2016
Priority date	Sep 1, 2016
Publication date	Apr 28, 2020
Grant date	Apr 28, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one respect, there is provided a system for training a machine learning model to detect malicious container files. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: processing a container file with a trained machine learning model, wherein the trained machine learning is trained to determine a classification for the container file indicative of whether the container file includes at least one file rendering the container file malicious; and providing, as an output by the trained machine learning model, an indication of whether the container file includes the at least one file rendering the container file malicious. Related methods and articles of manufacture, including computer program products, are also disclosed.
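As a rough illustration of the per-file inputs such a classifier might consume, the sketch below reads a ZIP archive and records a few of the features named in the first claim on this page (file name, path, size, embedded URLs). The choice of ZIP as the container format, the URL regular expression, and the names `extract_file_features` and `build_demo_container` are assumptions made for this example, not details taken from the patent.

```python
import io
import re
import zipfile

# Assumption: a simple pattern for embedded URLs; the patent does not specify one.
URL_PATTERN = re.compile(rb"https?://[^\s\"'<>]+")


def extract_file_features(container_bytes):
    """Return one feature dict per file stored in a ZIP container (hypothetical helper)."""
    features = []
    with zipfile.ZipFile(io.BytesIO(container_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directories carry no file content
            data = archive.read(info.filename)
            # Split "dir/sub/name.ext" into its path and file name parts.
            path, _, name = info.filename.rpartition("/")
            features.append({
                "name": name,
                "path": path,
                "size": info.file_size,
                "url_count": len(URL_PATTERN.findall(data)),
            })
    return features


def build_demo_container():
    """Build a tiny in-memory ZIP so the example runs without external files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docs/readme.txt", "see http://example.com/a for details")
        archive.writestr("run.js", "var x = 1;")
    return buffer.getvalue()


demo_features = extract_file_features(build_demo_container())
for row in demo_features:
    print(row)
```

Creator and owner metadata are left out here because plain ZIP entries do not reliably carry them; a real extractor would depend on the container format in use.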

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: at least one processor; and at least one memory including program code which when executed by the at least one processor provides operations comprising: extracting features from each of a plurality of files in a container file; generating, for each file, a feature vector comprising the corresponding extracted features; processing, using the feature vectors, the container file with a trained machine learning model, wherein the trained machine learning model is trained to determine a classification for the container file indicative of whether the container file includes at least one file rendering the container file malicious; and providing, as an output by the trained machine learning model, an indication of whether the container file includes the at least one file rendering the container file malicious; wherein the trained machine learning model is a convolutional neural network that comprises: at least one convolutional layer (i) concurrently processing the plurality of feature vectors in groups of two or more overlapping feature vectors where each group may include at least one feature vector that is included in one or more other groups and (ii) generate a feature map for each group by at least applying at least one kernel to each group; and a pooling layer configured to apply a maximum pooling function to the feature maps, and wherein applying the maximum pooling function identifies a plurality of maximum features from the plurality of feature maps and the classification is based on such maximum features; wherein: features from each file within a container file used to train the machine learning model are concatenated to form an extended feature space for use during the training; the extended feature space prevents misclassification by the trained machine learning model for different container files storing identical or similar sets of files in a different order; and the features are selected from a group consisting of: file name, file path or location, size, creator, owner, or embedded Universal Resource Locator (URL).

2. The system of claim 1, wherein the at least one file rendering the container file malicious comprises a malicious file.

3. The system of claim 2, wherein the malicious file comprises unwanted data, an unwanted portion of a script, and/or an unwanted portion of program code.

4. The system of claim 1, wherein the at least one file rendering the container file malicious comprises a benign file rendering the container file malicious when combined with another benign file from the container file.

5. The system of claim 1, wherein applying the at least one kernel includes computing a dot product between features included in each kernel and features included in a first overlapping group of feature vectors to generate a first entry in the corresponding feature map, and computing another dot product between features included in each kernel and features included in a second overlapping group of feature vectors to generate a second entry in such corresponding feature map.

6. A method for implementation by one or more data processors forming part of at least one computing device, the method comprising: extracting features from each of a plurality of files in a container file; generating, for each file, a feature vector comprising the corresponding extracted features; processing, using the feature vectors, the container file with a trained machine learning model, wherein the trained machine learning model is configured to determine a classification for the container file indicative of whether the container file includes a plurality of files and at least one file rendering the container file malicious, the processing comprising concatenating features extracted from each of the plurality of files in the container file into a feature space for input into the trained machine learning model; and providing, as an output, an indication of whether the container file includes the at least one file rendering the container file malicious; wherein the trained machine learning model is a convolutional neural network that comprises: at least one convolutional layer (i) concurrently processing the plurality of feature vectors in groups of two or more overlapping feature vectors where each group may include at least one feature vector that is included in one or more other groups and (ii) generate a feature map for each group by at least applying at least one kernel to each group; and a pooling layer configured to apply a maximum pooling function to the feature maps, and wherein applying the maximum pooling function identifies a plurality of maximum features from the plurality of feature maps and the classification is based on such maximum features; wherein: features from each file within a container file used to train the machine learning model are concatenated to form an extended feature space for use during the training; the extended feature space prevents misclassification by the trained machine learning model for different container files storing identical or similar sets of files in a different order; and the features are selected from a group consisting of: file name, file path or location, size, creator, owner, or embedded Universal Resource Locator (URL).

7. The method of claim 6, wherein the at least one file rendering the container file malicious comprises a malicious file.

8. The method of claim 6, wherein the at least one file rendering the container file malicious comprises a benign file rendering the container file malicious when combined with another benign file from the container file.

9. The method of claim 6, wherein applying the at least one kernel includes computing a dot product between features included in each kernel and features included in a first overlapping group of feature vectors to generate a first entry in the corresponding feature map, and computing another dot product between features included in each kernel and features included in a second overlapping group of feature vectors to generate a second entry in such corresponding feature map.

10. The method of claim 6, wherein the malicious file comprises unwanted data, an unwanted portion of a script, and/or an unwanted portion of program code.

11. A non-transitory computer-readable storage medium including program code which when executed by at least one processor causes operations comprising: extracting features from each of a plurality of files in a container file; generating, for each file, a feature vector comprising the corresponding extracted features; processing, using the feature vectors, the container file with a trained machine learning model, wherein the trained machine learning model is configured to determine a classification for the container file indicative of whether the container file includes a plurality of files and at least one file rendering the container file malicious, the processing comprising concatenating features extracted from each of the plurality of files in the container file into a feature space for input into the trained machine learning model; and providing, as an output, an indication of whether the container file includes the at least one file rendering the container file malicious; wherein the trained machine learning model is a convolutional neural network that comprises: at least one convolutional layer (i) concurrently processing the plurality of feature vectors in groups of two or more overlapping feature vectors where each group may include at least one feature vector that is included in one or more other groups and (ii) generate a feature map for each group by at least applying at least one kernel
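The convolution and pooling steps recited in the claims can be pictured with a small numeric sketch. It assumes each file's features have already been encoded as a numeric vector, uses a window of two vectors with a stride of one so that adjacent groups share a vector, and applies two made-up kernels. The function names, the kernel values, and the encoding are illustrative assumptions only; the patent does not prescribe them, and this is not a trained model.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def convolve_groups(feature_vectors, kernel, group_size=2):
    """Apply one kernel to each overlapping group of feature vectors.

    With stride 1, consecutive groups share group_size - 1 vectors.
    Each group yields one entry of the resulting feature map.
    """
    flat_kernel = [w for row in kernel for w in row]
    feature_map = []
    for start in range(len(feature_vectors) - group_size + 1):
        group = feature_vectors[start:start + group_size]
        flat_group = [v for vec in group for v in vec]
        feature_map.append(dot(flat_kernel, flat_group))
    return feature_map


def max_pool(feature_maps):
    """Keep the maximum entry of each feature map."""
    return [max(feature_map) for feature_map in feature_maps]


# Three files, each already encoded as a 2-dimensional feature vector (assumed).
vectors = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]

# Two hypothetical kernels, each spanning a group of two vectors.
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]

feature_maps = [convolve_groups(vectors, k) for k in kernels]
pooled = max_pool(feature_maps)
print(feature_maps)  # [[3.0, 1.0], [0.0, 5.0]]
print(pooled)        # [3.0, 5.0]
```

In a full network the pooled maxima would feed a classification layer; that stage is omitted here because the page gives no detail about it.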

Assignees

Cylance Inc

Inventors

Classifications

G06F21/562
Static detection · CPC title
G06N20/10
using kernel methods, e.g. support vector machines [SVM] · CPC title
G06N3/08
Learning methods · CPC title
H04L63/1416 · Primary
Event detection, e.g. attack signature detection · CPC title
G06N3/0454
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 61243849

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10637874B2 cover?: In one respect, there is provided a system for training a machine learning model to detect malicious container files. The system may include at least one processor and at least one memory. The memory may include program code which when executed by the at least one processor provides operations including: processing a container file with a trained machine learning model, wherein the trained mach…
Who is the assignee on this patent?: Cylance Inc

Related patents

Determining duplicate objects for malware analysis using environmental/context information

Retinal image quality assessment, error identification and automatic quality correction

Low- and high-fidelity classifiers applied to road-scene images

Confidence level threshold selection assistance for a data loss prevention system using machine learning

Malware detection

Automatic threat detection of executable files based on static data analysis

System and method for automated machine-learning, zero-day malware detection
