Multi-channel change-point malware detection

US9853997B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9853997-B2
Application number  US-201514686420-A
Country             US
Kind code           B2
Filing date         Apr 14, 2015
Priority date       Apr 14, 2014
Publication date    Dec 26, 2017
Grant date          Dec 26, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A malware detection system and method detects changes in host behavior indicative of malware execution. The system uses linear discriminant analysis (LDA) for feature extraction, multi-channel change-point detection algorithms to infer malware execution, and a data fusion center (DFC) to combine local decisions into a host-wide diagnosis. The malware detection system includes sensors that monitor the status of a host computer being monitored for malware, a feature extractor that extracts data from the sensors corresponding to predetermined features, local detectors that perform malware detection on each stream of feature data from the feature extractor independently, and a data fusion center that uses the decisions from the local detectors to infer whether the host computer is infected by malware.
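
The detect-then-fuse flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The claims name Page's CUSUM test and a k out of N fusion rule, but the Gaussian mean-shift model, the fixed threshold (claim 7 recites an adaptive one), the names TwoSidedCusum, fuse_k_out_of_n and diagnose, and every parameter value are assumptions made for illustration. Each detector here reports two decisions per sample (upward and downward shift), which is only one possible reading of N=2M.

```python
from dataclasses import dataclass


@dataclass
class TwoSidedCusum:
    """Hypothetical local detector for a mean shift in one feature stream.

    Assumes Gaussian data with known pre-change mean mu0 and standard
    deviation sigma, and an assumed post-change shift of +/- delta.
    """
    mu0: float
    sigma: float
    delta: float
    threshold: float   # fixed here; the claims describe an adaptive threshold
    up: float = 0.0    # cumulative statistic for an upward shift
    down: float = 0.0  # cumulative statistic for a downward shift

    def update(self, x: float) -> tuple[bool, bool]:
        # Log-likelihood ratio of N(mu0 +/- delta, sigma) against N(mu0, sigma).
        z = (x - self.mu0) / self.sigma
        d = self.delta / self.sigma
        llr_up = d * (z - d / 2)
        llr_down = -d * (z + d / 2)
        # Page's recursion: accumulate the ratio and reset at zero.
        self.up = max(0.0, self.up + llr_up)
        self.down = max(0.0, self.down + llr_down)
        return self.up > self.threshold, self.down > self.threshold


def fuse_k_out_of_n(decisions: list[bool], k: int) -> bool:
    """Declare infection when at least k of the N reported decisions are positive."""
    return sum(decisions) >= k


def diagnose(detectors: list[TwoSidedCusum], sample: list[float], k: int) -> bool:
    """One sampling period: update every channel, then fuse the N = 2M decisions."""
    decisions: list[bool] = []
    for detector, x in zip(detectors, sample):
        decisions.extend(detector.update(x))
    return fuse_k_out_of_n(decisions, k)


# Toy run on synthetic data: three channels, two of which shift at t = 5.
detectors = [TwoSidedCusum(mu0=0.0, sigma=1.0, delta=1.0, threshold=4.0)
             for _ in range(3)]
stream = [[0.0, 0.0, 0.0]] * 5 + [[2.0, 2.0, 0.0]] * 5
alarms = [diagnose(detectors, sample, k=2) for sample in stream]
print(alarms.index(True) if True in alarms else None)  # prints 7
```

In this toy run the fused alarm fires three samples after the shift begins, because each shifted channel adds 1.5 to its upward statistic per sample and must exceed the threshold of 4. Raising k trades detection delay for fewer false alarms.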

First claim

Opening claim text (preview).

What is claimed:

1. A malware detection system comprising: sensors that monitor the status of a host computer being monitored for malware, including malware that do not propagate through a network; a feature extractor including a processor and a memory that stores instructions that when executed by the processor causes the processor to compute streams of feature data from raw sensor data from the sensors corresponding to predetermined features; local detectors that perform malware detection on each stream of feature data from the feature extractor independently; and a data fusion center that applies a k out of N fusion rule, where k is a threshold of a number of positive detections and N is a total number of decisions reported by the local detectors, where N=2M and M is the number of local detectors, to the malware detection decisions from the local detectors to infer whether the host computer is infected by malware.

2. The malware detection system of claim 1, wherein each sensor monitors a distinct operating phenomenon of the host computer and reports its raw sensor data once per sampling period.

3. The malware detection system of claim 1, wherein the raw sensor data are processed by the feature extractor to transform the raw sensor data into a set of features to use for detection of malware.

4. The malware detection system of claim 1, wherein the local detectors each sequentially monitors a single stream of feature data from the feature extractor and detects whether a change has occurred in a distribution of the feature data.

5. The malware detection system of claim 4, wherein the local detectors provide a new decision every sampling period regarding whether a change has occurred in the distribution of the feature data.

6. The malware detection system of claim 4, wherein the local detectors each perform an implementation of a change-point detection technique comprising a Page's cumulative sum (CUSUM) test.

7. The malware detection system of claim 6, wherein the CUSUM test is implemented as a repeated cumulative log-likelihood ratio test with an adaptive detection threshold.

8. The malware detection system of claim 6, wherein the feature extractor selects the predetermined features by performing a two-sample Kolmogorov-Smirnov test on the feature data to determine for each feature and each malware sample whether the feature exhibits a change in distribution after the host computer is infected and, prior to using the CUSUM test, eliminates those features whose data are not informative for malware detection using the CUSUM test.

9. The malware detection system of claim 1, wherein the predetermined features include at least one of processor hypercalls/second, interrupt hypercalls/second, large page translation lookaside buffer fills/second, percent privileged time, malicious software removal (MSR) accesses cost, central processing unit identifier (CPUID) instructions cost, outbound connections/second, miniport send cycles/second, stack retrieve indication cycles/second, and network driver interface specification (NDIS) receive indication cycles/second.

10. The malware detection system of claim 1, wherein the feature extractor transforms the predetermined features through feature scaling, in which normalization is used for the feature data and a term frequency-inverse document frequency (TF-IDF) transformer for feature data samples, a feature reduction step in which principal component analysis (PCA) is used to remove redundancy in the feature data samples, and a second feature reduction step in which linear discriminant analysis (LDA) is used to project the feature data samples onto a low-dimensional space that optimally separates the clean and infected datasets.

11. The malware detection system of claim 10, wherein the TF-IDF transformer scales the feature data samples and deemphasizes the most commonly called system functions, computes products of term frequency (TF) and inverse document frequency (IDF), and scales the term frequency proportionally to a number of calls to a system function per second.

12. The malware detection system of claim 1, wherein the data fusion center receives decisions from the local detectors regarding the existence of malware in the feature data each sampling period and combines the decisions from the local detectors into a single malware diagnosis for the host computer.

13. The malware detection system of claim 12, wherein the data fusion center tracks times at which decisions are made by the local detectors.

14. A method for detecting malware that has infected a host computer, comprising the steps of: sensors monitoring the status of the host computer being monitored for malware, including malware that do not propagate through a network; a processor computing streams of feature data from sensor data produced in said monitoring step that corresponds to predetermined features; a local detector detecting malware in each stream of feature data independently; and the processor applying a k out of N fusion rule, where k is a threshold of a number of positive detections and N is a total number of decisions reported by the local detectors, where N=2M and M is the number of local detectors, to malware detection decisions from the malware detecting step to infer whether the host computer is infected by malware.

15. The method of claim 14, wherein the malware detecting step comprises sequentially monitoring streams of computed feature data and detecting whether a change has occurred in a distribution of the feature data.

16. The method of claim 15, wherein the malware detecting step further comprises implementing a change-point detection technique comprising a Page's cumulative sum (CUSUM) test.

17. The method of claim 16, wherein the CUSUM test is implemented as a repeated cumulative log-likelihood ratio test with an adaptive detection threshold.

18. The method of claim 16, wherein the computing streams of feature data step comprises selecting the predetermined features by performing a two-sample Kolmogorov-Smirnov test on the feature data to determine for each feature sample and each malware sample whether the feature exhibits a change in distribution after the host computer is infected and, prior to using the CUSUM test, eliminating those features whose data are not informative for malware detection using the CUSUM test.

19. The method of claim 14, wherein the predetermined features include at least one of processor hypercalls/second, interrupt hypercalls/second, large page translation lookaside buffer fills/second, percent privileged time, malicious software removal (MSR) accesses cost, central processing unit identifier (CPUID) instructions cost, outbound connections/second, miniport send cycles/second, stack retrieve indication cycles/second, and network driver interface specification (NDIS) receive indication cycles/second.

20. The method of claim 14, wherein the computing streams of feature data step includes transforming the predetermined features through feature scaling, in which normalization is used for the feature data and a term frequency-inverse document frequency (TF-IDF) transformer for the feature data samples, a feature reduction step in which principal component analysis (PCA) is used to remove redundancy in the feature data samples, and a second feature reduction step in which linear discriminant analysis (LDA) is used to project the feature data samples onto a low-dimensional space that optimally separates the clean and infected datasets.

21. The me
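
Claims 8 and 18 describe screening features with a two-sample Kolmogorov-Smirnov test before any CUSUM detector runs. The following Python sketch shows one way such a screen could look. The function names, the 0.05 significance level, and the use of the large-sample critical value are assumptions for illustration, not details taken from the patent. The sketch also compares one clean trace against one infected trace per feature, whereas the claims refer to testing each feature against each malware sample.

```python
import math


def ks_statistic(a: list[float], b: list[float]) -> float:
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x so ties are counted on both sides.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def changes_after_infection(clean: list[float], infected: list[float],
                            alpha: float = 0.05) -> bool:
    """Reject 'same distribution' using the large-sample KS critical value."""
    n, m = len(clean), len(infected)
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)   # about 1.358 for alpha = 0.05
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(clean, infected) > critical


def select_features(clean: dict[str, list[float]],
                    infected: dict[str, list[float]],
                    alpha: float = 0.05) -> list[str]:
    """Keep only the features whose distribution shifts after infection."""
    return [name for name in clean
            if changes_after_infection(clean[name], infected[name], alpha)]
```

Features that fail the screen would simply not be handed to a local detector, which keeps uninformative channels from adding false positives to the fusion step.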

Assignees

Univ Drexel

Inventors

Classifications

  • Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems · CPC title

  • Guest-host, i.e. hypervisor is an application program itself, e.g. VirtualBox · CPC title

  • by monitoring network traffic (monitoring network traffic per se H04L43/00) · CPC title

  • by executing in a restricted environment, e.g. sandbox or secure virtual machine · CPC title

  • Isolation or security of virtual machine instances · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9853997B2 cover?
A malware detection system and method detects changes in host behavior indicative of malware execution. The system uses linear discriminant analysis (LDA) for feature extraction, multi-channel change-point detection algorithms to infer malware execution, and a data fusion center (DFC) to combine local decisions into a host-wide diagnosis. The malware detection system includes sensors that monit…
Who is the assignee on this patent?
Univ Drexel
What technology area does this patent fall under?
Primary CPC classification H04L63/145. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Dec 26, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).