Detecting malicious code in sections of computer files

US10169581B2 · US · B2

Patent metadata
Field | Value
Publication number | US-10169581-B2
Application number | US-201615249702-A
Country | US
Kind code | B2
Filing date | Aug 29, 2016
Priority date | Aug 29, 2016
Publication date | Jan 1, 2019
Grant date | Jan 1, 2019

Abstract

Official abstract text for this publication.

A training data set for training a machine learning module is prepared by dividing normal files and malicious files into sections. Each section of a normal file is labeled as normal. Each section of a malicious file is labeled as malicious regardless of whether or not the section is malicious. The sections of the normal files and malicious files are used to train the machine learning module. The trained machine learning module is packaged as a machine learning model, which is provided to an endpoint computer. In the endpoint computer, an unknown file is divided into sections, which are input to the machine learning model to identify a malicious section of the unknown file, if any is present in the unknown file.
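The labeling rule in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the fixed `SECTION_SIZE`, the function names `split_into_sections` and `build_training_set`, and the integer labels are all assumptions made for the example, since the abstract does not say how files are divided.

```python
from typing import Iterable, List, Tuple

# Hypothetical fixed section size; the abstract does not specify how files are divided.
SECTION_SIZE = 4096

# Hypothetical integer labels for the two classes.
NORMAL, MALICIOUS = 0, 1


def split_into_sections(data: bytes, size: int = SECTION_SIZE) -> List[bytes]:
    """Divide a file's bytes into consecutive sections; the last one may be shorter."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def build_training_set(
    normal_files: Iterable[bytes],
    malicious_files: Iterable[bytes],
    size: int = SECTION_SIZE,
) -> List[Tuple[bytes, int]]:
    """Return (section, label) pairs following the labeling rule in the abstract."""
    samples: List[Tuple[bytes, int]] = []
    for data in normal_files:
        # Every section of a normal file is labeled normal.
        samples.extend((s, NORMAL) for s in split_into_sections(data, size))
    for data in malicious_files:
        # Every section of a malicious file is labeled malicious,
        # whether or not that particular section holds the malicious code.
        samples.extend((s, MALICIOUS) for s in split_into_sections(data, size))
    return samples
```

Labeling whole files rather than individual sections means no one has to locate the malicious code by hand before training; the cost is that some benign sections of malicious files carry a malicious label.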

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method of evaluating a file for malicious code, the method comprising: receiving a plurality of normal files and a plurality of malicious files; dividing each of the normal files and each of the malicious files into a plurality of file sections; labeling each file section of the normal files as a normal file section; labeling each file section of the malicious files as a malicious file section; generating a machine learning model using a machine learning training data set comprising the labeled file sections of the normal files and the malicious files; and using the machine learning model to identify which particular section of a target file contains malicious code.
2. The computer-implemented method of claim 1, wherein using the machine learning model to identify which particular section of the target file contains malicious code comprises: dividing the target file into a plurality of sections; and using the machine learning model to classify each of the sections of the target file.
3. The computer-implemented method of claim 1, wherein the machine learning model is generated by training a Support Vector Machine using the training data set.
4. The computer-implemented method of claim 1, further comprising: providing the machine learning model to an endpoint computer system over a computer network, wherein the endpoint computer system receives the target file over the computer network and classifies individual sections of the target file using the machine learning model.
5. The computer-implemented method of claim 1, wherein the normal files, the malicious files, and the target file are executable files.
6. The computer-implemented method of claim 1, wherein the normal files, the malicious files, and the target file are in Portable Executable format.
7. A system for evaluating files for malicious code, the system comprising: a backend computer system that is configured to divide each of a plurality of normal files into file sections, divide each of a plurality of malicious files into file sections, label each file section of the normal files as a normal file section, label each file section of the malicious files as a malicious file section, and generate a machine learning model using a machine learning training data set comprising labeled file sections of the normal files and the malicious files; and an endpoint computer that is configured to receive the machine learning model over a computer network, receive a target file, and use the machine learning model to identify which particular section of the target file contains malicious code.
8. The system of claim 7, wherein the endpoint computer divides the target file into a plurality of sections and inputs the sections of the target file into the machine learning model.
9. The system of claim 7, wherein the backend computer system generates the machine learning model by training a Support Vector Machine using the training data set.
10. The system of claim 7, wherein the normal files, the malicious files, and the target file are executable files.
11. The system of claim 7, wherein the normal files, the malicious files, and the target file are in Portable Executable format.
12. The system of claim 7, wherein the endpoint computer divides the target file into a plurality of sections and inputs the sections of the target file into the machine learning model.
13. A non-transitory computer-readable medium comprising instructions stored thereon, that when executed by a processor, perform the steps of: dividing each of a plurality of normal files and each of a plurality of malicious files into a plurality of file sections; labeling each file section of the normal files as a normal file section; labeling each file section of the malicious files as a malicious file section; generating a machine learning model using a machine learning training data set comprising labeled file sections of the normal files and the malicious files; and providing the machine learning model to an endpoint computer system to detect malicious files in the endpoint computer system.
14. The non-transitory computer-readable medium of claim 13, wherein the machine learning model is generated by training a Support Vector Machine using the training data set.
15. The non-transitory computer-readable medium of claim 13, wherein the normal files and the malicious files are executable files.
16. The non-transitory computer-readable medium of claim 13, wherein the normal files and the malicious files are in Portable Executable format.
17. The non-transitory computer-readable medium of claim 13, wherein the machine learning model is provided to the endpoint computer system over the Internet.
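As a rough illustration of how the claimed steps fit together (train a Support Vector Machine on labeled sections, then classify each section of a target file), the sketch below uses only the Python standard library. Everything beyond the claim language is an assumption: the byte-histogram feature in `byte_histogram`, the Pegasos-style sub-gradient training in `train_linear_svm`, the fixed section size, and the zero decision threshold are hypothetical choices for the example, not details taken from the patent.

```python
from typing import List, Sequence, Tuple


def split_into_sections(data: bytes, size: int) -> List[bytes]:
    """Divide a file into consecutive fixed-size sections (assumed splitting rule)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def byte_histogram(section: bytes) -> List[float]:
    """Hypothetical feature vector: relative frequency of each of the 256 byte values."""
    counts = [0.0] * 256
    for b in section:
        counts[b] += 1.0
    n = float(len(section)) or 1.0  # guard against an empty section
    return [c / n for c in counts]


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(
    samples: Sequence[Tuple[List[float], int]],
    lam: float = 0.01,
    epochs: int = 50,
) -> List[float]:
    """Linear SVM without a bias term, trained by Pegasos-style sub-gradient
    descent on the hinge loss. Labels: +1 = malicious, -1 = normal."""
    w = [0.0] * 256
    t = 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)      # decaying step size
            shrink = 1.0 - eta * lam   # regularization shrinkage, equals 1 - 1/t
            if y * dot(w, x) < 1.0:
                # Margin violated: shrink and step toward the correct side.
                w = [shrink * wi + eta * y * xi for wi, xi in zip(w, x)]
            else:
                w = [shrink * wi for wi in w]
    return w


def malicious_sections(w: Sequence[float], target: bytes, size: int) -> List[int]:
    """Return the indices of target-file sections the model scores as malicious."""
    return [
        i
        for i, section in enumerate(split_into_sections(target, size))
        if dot(w, byte_histogram(section)) > 0.0
    ]
```

In the arrangement the claims describe, the training function would run on the backend system and only the weight vector would be shipped to the endpoint, which would then call something like `malicious_sections` on each incoming file. A production system would more plausibly use an established SVM library and richer features.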

Assignees

  • Trend Micro Inc

Classifications

  • Physics · mapped topic

  • Static detection · CPC title

  • Test or assess a computer or a system · CPC title

  • G06F21/56 · Primary

    Computer malware detection or handling, e.g. anti-virus arrangements · CPC title

  • using kernel methods, e.g. support vector machines [SVM] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10169581B2 cover?
A training data set for training a machine learning module is prepared by dividing normal files and malicious files into sections. Each section of a normal file is labeled as normal. Each section of a malicious file is labeled as malicious regardless of whether or not the section is malicious. The sections of the normal files and malicious files are used to train the machine learning module. Th…
Who is the assignee on this patent?
Trend Micro Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/56. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 1, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).