Malicious software detection

US10650146B1 · US · B1

Patent metadata
Publication number: US-10650146-B1
Application number: US-201916372230-A
Country: US
Kind code: B1
Filing date: Apr 1, 2019
Priority date: Dec 12, 2018
Publication date: May 12, 2020
Grant date: May 12, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An amount of data change associated with a version of a content file with respect to one or more previous versions of the content file is determined. The amount of change associated with the version of the content file is determined using a tree data structure associated with the content file that is stored on a storage cluster. One or more statistics associated with a backup snapshot are provided to a server. The server is configured to determine that the amount of data change associated with the version of the content file is anomalous based in part on the one or more statistics associated with the backup snapshot. A notification that data associated with the backup snapshot is potentially infected by malicious software is received from the server. The version of the content file is indicated as being potentially infected by malicious software.
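The abstract says the amount of change is determined from a tree data structure but does not spell out the traversal. The sketch below is one plausible reading, not the patented method: it assumes a copy-on-write tree in which an unchanged subtree is shared between versions and can be recognized by its identifier. The `Node` class, `node_id`, `chunk_size`, and `changed_bytes` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical tree node: leaves carry a data chunk, interior nodes carry children."""
    node_id: int
    children: Tuple["Node", ...] = ()
    chunk_size: int = 0  # bytes stored at a leaf; unused on interior nodes


def changed_bytes(new: Node, old: Optional[Node]) -> int:
    """Count bytes reachable from `new` that are not shared with `old`.

    Assumption: a subtree that did not change between versions is the same
    node in both trees (same node_id), so it is skipped without descending.
    """
    if old is not None and new.node_id == old.node_id:
        return 0  # shared subtree, nothing changed underneath
    if not new.children:
        return new.chunk_size  # new or rewritten leaf
    old_children = old.children if old is not None else ()
    total = 0
    for i, child in enumerate(new.children):
        prev = old_children[i] if i < len(old_children) else None
        total += changed_bytes(child, prev)
    return total


# Version 1 has three 4 KiB chunks; version 2 rewrites the middle one.
a, b, c = Node(10, chunk_size=4096), Node(11, chunk_size=4096), Node(12, chunk_size=4096)
b2 = Node(13, chunk_size=4096)
v1 = Node(1, children=(a, b, c))
v2 = Node(2, children=(a, b2, c))
print(changed_bytes(v2, v1))
```

Because shared subtrees are skipped, the cost of the comparison in this sketch grows with the amount of changed data rather than with the size of the file.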

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a processor configured to: receive an incremental backup snapshot that includes data associated with a version of a content file; determine, by a processor of a storage cluster, an amount of data change associated with the version of the content file included in the incremental backup snapshot with respect to one or more previous versions of the content file, wherein the incremental backup snapshot includes data associated with a primary system, wherein the data associated with the primary system is backed up from the primary system to the storage cluster, wherein the amount of data change associated with the version of the content file is determined using a tree data structure associated with the content file that is stored on the storage cluster; provide to a server one or more statistics associated with the incremental backup snapshot, wherein the one or more statistics associated with the incremental backup snapshot include a total amount of deduplication associated with the data included in the incremental backup snapshot, wherein the server is configured to determine that the amount of data change associated with the version of the content file is anomalous based in part on the one or more statistics associated with the incremental backup snapshot, wherein the server is configured to determine that the amount of data change associated with the version of the content file is anomalous in the event the total amount of deduplication associated with the data included in the incremental backup snapshot is less than a deduplication threshold, wherein an anomalous amount of data change associated with the version of the content file indicates that the data associated with the incremental backup snapshot is potentially infected by malicious software; receive from the server a notification that data associated with the incremental backup snapshot is potentially infected by malicious software; and identify the version of the content file as being potentially infected by malicious software; and a memory coupled to the processor and configured to provide the processor with instructions.

2. The system of claim 1, wherein the content file is a virtual machine container file.

3. The system of claim 1, wherein the processor is further configured to provide to the primary system a notification that the primary system is potentially infected with malicious software, wherein the notification includes a link to a set of clean versions of the content file.

4. The system of claim 3, wherein the processor is further configured to: receive a selection of a clean version of the content file that is included in the set of clean versions of the content file; and restore the clean version of the content file to the primary system.

5. The system of claim 1, wherein the processor is further configured to provide to the primary system a notification that the primary system is potentially infected with malicious software, wherein the notification includes a link to a set of clean backup snapshots that include corresponding clean versions of the content file.

6. The system of claim 5, wherein the processor is further configured to: receive a selection of a clean backup snapshot that is included in the set of clean backup snapshots; and restore the clean backup snapshot to the primary system.

7. The system of claim 1, wherein to determine the amount of data change associated with the version of the content file stored on the storage cluster with respect to the one or more previous versions of the content file, the processor is further configured to: traverse the tree data structure corresponding to the version of the content file; traverse one or more tree data structures corresponding to the one or more previous versions of the content file; determine a first amount of data change associated with the version of the content file based on a traversal of the tree data structure corresponding to the version of the content file; and determine corresponding amounts of data change associated with the one or more previous versions of the content file based on corresponding traversals of the one or more tree data structures corresponding to the one or more previous versions of the content file.

8. The system of claim 1, wherein the amount of data change associated with the version of the content file is determined to be anomalous in the event the amount of data change associated with the content file is greater than a rate of change per backup snapshot by a threshold.

9. The system of claim 8, wherein the rate of change per backup snapshot is a historical rate of change, a seasonality rate of change, or a trending rate of change.

10. The system of claim 8, wherein the determined amount of data change associated with the version of the content file includes an amount associated with a sub-portion of the content file.

11. The system of claim 10, wherein the processor is configured to determine that the amount of data change associated with the version of the content file is anomalous in the event the sub-portion amount of the content file is greater than a rate of change for the sub-portion of the content file per backup snapshot by a threshold.

12. The system of claim 1, wherein a retention policy associated with the content file is placed on hold.

13. The system of claim 1, wherein the processor is configured to: receive a selection of a backup snapshot to restore to the primary system; and provide a notification that the backup snapshot is potentially infected by malicious software.

14. The system of claim 1, where a portion of the data change associated with the version of the content file is encrypted.

15. The system of claim 1, wherein the server is configured to determine that the amount of data change associated with the version of the content file is anomalous based on a plurality of statistics associated with the backup snapshot that includes the version of the content file.

16. The system of claim 15, wherein the plurality of statistics at least include a number of files associated with the backup snapshot that were added, deleted, or modified, and an entropy value associated with changed data included in the backup snapshot.

17. A method, comprising: receiving an incremental backup snapshot that includes data associated with a version of a content file; determining, by a processor of a storage cluster, an amount of data change associated with the version of the content file included in the incremental backup snapshot with respect to one or more previous versions of the content file, wherein the incremental backup snapshot includes data associated with a primary system, wherein the data associated with the primary system is backed up from the primary system to the storage cluster, wherein the amount of data change associated with the version of the content file is determined using a tree data structure associated with the content file that is stored on the storage cluster; providing to a server one or more statistics associated with the incremental backup snapshot wherein the one or more statistics associated with the incremental backup snapshot include a total amount of deduplication associated with the data included in the incremental backup snapshot, wherein the server is configured to determine that the amount of data change associated with the version of the content file is anomalous based in part on the one or more statistics associated with the incremental backup snapshot, wherein the server is configured to determine that
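The claims name three signals a server might weigh: a total amount of deduplication below a threshold (claim 1), a data change that exceeds the per-snapshot rate of change by a threshold (claim 8), and an entropy value for the changed data (claim 16). The sketch below shows how such checks could be combined. It is illustrative only: the `SnapshotStats` fields, the default thresholds, the use of a historical mean as the rate of change, and the idea of thresholding entropy are assumptions, not details taken from the patent.

```python
import math
from collections import Counter
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class SnapshotStats:
    """Hypothetical per-snapshot statistics sent to the server."""
    changed_bytes: int     # amount of data change in this snapshot
    dedup_ratio: float     # fraction of incoming data that deduplicated (0.0 to 1.0)
    changed_sample: bytes  # sample of the changed data, used for entropy


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for constant data, 8.0 for uniform bytes)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())


def anomaly_reasons(
    stats: SnapshotStats,
    history: List[int],
    dedup_threshold: float = 0.2,    # assumed value
    change_margin: int = 500,        # assumed value, in bytes
    entropy_threshold: float = 7.5,  # assumed value, bits per byte
) -> List[str]:
    """Return the checks that flag this snapshot; an empty list means none fired."""
    reasons = []
    # Claim 1: deduplication below a threshold (encrypted data rarely deduplicates).
    if stats.dedup_ratio < dedup_threshold:
        reasons.append("low_dedup")
    # Claim 8: change exceeds the rate of change per snapshot by a threshold.
    # The historical mean stands in for the rate of change here (assumption).
    if history and stats.changed_bytes - mean(history) > change_margin:
        reasons.append("high_change")
    # Claim 16 lists entropy as a statistic; thresholding it is an assumption.
    if shannon_entropy(stats.changed_sample) > entropy_threshold:
        reasons.append("high_entropy")
    return reasons


history = [90, 110, 100]
normal = SnapshotStats(changed_bytes=100, dedup_ratio=0.8, changed_sample=b"a" * 1024)
suspect = SnapshotStats(changed_bytes=5000, dedup_ratio=0.05,
                        changed_sample=bytes(range(256)) * 4)
print(anomaly_reasons(normal, history))
print(anomaly_reasons(suspect, history))
```

Returning the list of triggered checks rather than a single boolean keeps the sketch easy to inspect; a real system would have to decide how to weigh the signals against each other.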

Assignees

Cohesity Inc

Inventors

Classifications

  • G06F21/565 · Primary

    by checking file integrity · CPC title

  • Ensuring data consistency and integrity · CPC title

  • Document management systems · CPC title

  • G06F21/562 · Primary

    Static detection · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10650146B1 cover?
An amount of data change associated with a version of a content file with respect to one or more previous versions of the content file is determined. The amount of change associated with the version of the content file is determined using a tree data structure associated with the content file that is stored on a storage cluster. One or more statistics associated with a backup snapshot are provide…
Who is the assignee on this patent?
Cohesity Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/565. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 12, 2020 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).