Machine learning-based malware detection system and method

US11822657B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11822657-B2
Application number	US-202217724744-A
Country	US
Kind code	B2
Filing date	Apr 20, 2022
Priority date	Apr 7, 2017
Publication date	Nov 21, 2023
Grant date	Nov 21, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Disclosed is a computer implemented method for malware detection that analyses a file on a per packet basis. The method receives a packet of one or more packets associated a file, and converting a binary content associated with the packet into a digital representation and tokenizing plain text content associated with the packet. The method extracts one or more n-gram features, an entropy feature, and a domain feature from the converted content of the packet and applies a trained machine learning model to the one or more features extracted from the packet. The output of the machine learning method is a probability of maliciousness associated with the received packet. If the probability of maliciousness is above a threshold value, the method determines that the file associated with the received packet is malicious.
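The abstract names n-gram and entropy features computed over a packet's converted content. The following is a minimal sketch of what such features could look like, not the patent's implementation: the function names, the choice of byte-level 2-grams, and the use of Shannon entropy over byte frequencies are all assumptions made for illustration. The domain feature and plain-text tokenization are omitted.

```python
import math
from collections import Counter


def to_hex(packet: bytes) -> str:
    """Hexadecimal representation of a packet's binary content (one possible 'digital representation')."""
    return packet.hex()


def byte_ngrams(packet: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams, keyed by their hex spelling.

    The n-gram size and the hex keys are illustrative assumptions.
    """
    return Counter(packet[i:i + n].hex() for i in range(len(packet) - n + 1))


def shannon_entropy(packet: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (range 0.0 to 8.0)."""
    if not packet:
        return 0.0
    total = len(packet)
    counts = Counter(packet)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    sample = b"MZMZ"  # toy payload, not real packet data
    print(to_hex(sample))
    print(byte_ngrams(sample))
    print(shannon_entropy(sample))
```

High entropy is commonly associated with compressed or encrypted content, which is one reason an entropy value can be a useful input alongside n-gram counts; how the patented model weighs these features is not stated on this page.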

First claim

Opening claim text (preview).

What is claimed is:

1. A computer program product embodied in a non-transitory computer readable storage medium and comprising computer instructions that when executed by a processor cause the processor to perform steps of: receiving a first packet of one or more packets associated with a file; determining a file type of the file from the first packet; converting contents of the first packet into a corresponding digital representation for feature extraction; extracting one or more features from the corresponding digital representation; and applying a trained machine learning model to the one or more features to determine probability of maliciousness.

2. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the steps further include labeling the file based on the probability of maliciousness.

3. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the steps further include responsive to the first packet having the probability of maliciousness as benign, receiving a next packet of the one or more packets; and performing the converting, extracting, and applying on the next packet.

4. The computer program product embodied in a non-transitory computer readable storage medium of claim 3, wherein the steps further include continuing with additional packets of the one or more packets until a determination of whether the file is malicious or benign.

5. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the file type is one of a portable executable (PE) file, a portable document format (PDF) file, a Dynamic Loaded Library (DLL), a JavaScript (JS) file, a Hypertext Markup Language (HTML) file, and a Microsoft Office File.

6. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the digital representation is any of a decimal representation, a binary representation, a hexadecimal representation, a tokenized script, and a tokenized domain.

7. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the trained machine learning model comprises one or more decision trees.

8. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the one or more features include n-gram features.

9. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the one or more features include an entropy feature.

10. The computer program product embodied in a non-transitory computer readable storage medium of claim 1, wherein the one or more features include a domain feature.

11. A method comprising steps of: receiving a first packet of one or more packets associated with a file; determining a file type of the file from the first packet; converting contents of the first packet into a corresponding digital representation for feature extraction; extracting one or more features from the corresponding digital representation; and applying a trained machine learning model to the one or more features to determine probability of maliciousness.

12. The method of claim 11, wherein the steps further include labeling the file based on the probability of maliciousness.

13. The method of claim 11, wherein the steps further include responsive to the first packet having the probability of maliciousness as benign, receiving a next packet of the one or more packets; and performing the converting, extracting, and applying on the next packet.

14. The method of claim 13, wherein the steps further include continuing with additional packets of the one or more packets until a determination of whether the file is malicious or benign.

15. The method of claim 11, wherein the file type is one of a portable executable (PE) file, a portable document format (PDF) file, a Dynamic Loaded Library (DLL), a JavaScript (JS) file, a Hypertext Markup Language (HTML) file, and a Microsoft Office File.

16. The method of claim 11, wherein the digital representation is any of a decimal representation, a binary representation, a hexadecimal representation, a tokenized script, and a tokenized domain.

17. The method of claim 11, wherein the trained machine learning model comprises one or more decision trees.

18. The method of claim 11, wherein the one or more features include n-gram features.

19. The method of claim 11, wherein the one or more features include an entropy feature.

20. The method of claim 11, wherein the one or more features include a domain feature.
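The packet-by-packet control flow in claims 1, 3, and 4 (score a packet, move to the next one while the result is benign, stop once a determination is reached) can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed implementation: `scan_file`, `toy_score`, and the 0.5 threshold are hypothetical, and `toy_score` is a placeholder rule standing in for a trained model. The threshold comparison follows the abstract, which does not state a value.

```python
from typing import Callable, Iterable

THRESHOLD = 0.5  # hypothetical cut-off; the source does not give a value


def scan_file(packets: Iterable[bytes],
              score_packet: Callable[[bytes], float],
              threshold: float = THRESHOLD) -> str:
    """Score packets in order and stop at the first one above the threshold.

    Returns "malicious" as soon as one packet's probability exceeds the
    threshold, otherwise "benign" once every packet has been scored.
    """
    for packet in packets:
        if score_packet(packet) > threshold:
            return "malicious"  # later packets are never examined
    return "benign"


def toy_score(packet: bytes) -> float:
    """Placeholder scorer, NOT a trained model: flags a marker byte string."""
    return 0.9 if b"EVIL" in packet else 0.1


if __name__ == "__main__":
    print(scan_file([b"header", b"xxEVILxx", b"tail"], toy_score))
    print(scan_file([b"header", b"body"], toy_score))
```

Because the loop returns on the first packet that crosses the threshold, a verdict can be reached before the whole file has arrived, which appears to be the point of analysing a file on a per packet basis.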

Assignees

Zscaler Inc

Inventors

Classifications

G06F21/566 · Primary
Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities · CPC title
G06N5/01
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
G06N20/00
Machine learning · CPC title
G06N20/20
Ensemble learning · CPC title
G06F2221/034
Test or assess a computer or a system · CPC title

Patent family

Related publications grouped by family.

View patent family 63711675

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11822657B2 cover?: Disclosed is a computer implemented method for malware detection that analyses a file on a per packet basis. The method receives a packet of one or more packets associated a file, and converting a binary content associated with the packet into a digital representation and tokenizing plain text content associated with the packet. The method extracts one or more n-gram features, an entropy featur…
Who is the assignee on this patent?: Zscaler Inc
What technology area does this patent fall under?: Primary CPC classification G06F21/566. Mapped technology areas include Physics.
When was this patent published?: Publication date Nov 21, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Time-based flexible packet scheduling

Method and apparatus for predictive classification of actionable network alerts

Modular model workflow in a distributed computation system

Systems and methods for dynamic cloud-based malware behavior analysis
