Token-based encryption determination process

US9727491B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9727491-B2
Application number: US-201615197473-A
Country: US
Kind code: B2
Filing date: Jun 29, 2016
Priority date: Sep 17, 2014
Publication date: Aug 8, 2017
Grant date: Aug 8, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Data storage systems are disclosed for automatically generating encryption rules based on a set of training files that are known to include sensitive information. The system may use a number of heuristic algorithms to generate one or more encryption rules for determining whether a file includes sensitive information. Further, the system may apply the heuristic algorithms to the content of the files, as determined by using natural language processing algorithms, to generate the encryption rules. Moreover, systems are disclosed that are capable of automatically determining whether to encrypt a file based on the generated encryption rules. The content of the file may be determined using natural language processing algorithms and then the encryption rules may be applied to the content of the file to determine whether to encrypt the file.
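The abstract does not say which heuristics are used to derive rules from the training files. As a rough illustration only, the sketch below assumes one very simple heuristic: a term becomes part of a rule when it appears in at least a given fraction of the training files. The function name `generate_rule_terms`, the `min_fraction` parameter, the whitespace tokenizer, and the stopword list are all hypothetical and are not taken from the patent.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set

# Hypothetical stopword list; the patent does not specify one.
DEFAULT_STOPWORDS: FrozenSet[str] = frozenset({"the", "a", "an", "of", "and", "for"})


def generate_rule_terms(
    training_files: Iterable[str],
    min_fraction: float = 0.5,
    stopwords: FrozenSet[str] = DEFAULT_STOPWORDS,
) -> Set[str]:
    """Return terms that occur in at least `min_fraction` of the training files.

    This stands in for the unspecified "heuristic algorithms" in the abstract;
    a real system would likely use richer natural language processing.
    """
    files = list(training_files)
    doc_freq: Counter = Counter()
    for content in files:
        # Count each term once per file (document frequency, not raw frequency).
        doc_freq.update(set(content.lower().split()) - stopwords)
    needed = min_fraction * len(files)
    return {term for term, count in doc_freq.items() if count >= needed}


training = [
    "patient diagnosis confidential",
    "confidential patient record",
    "the confidential memo",
]
rule_terms = generate_rule_terms(training, min_fraction=0.6)
```

With the three sample files above, only terms found in at least two files survive, so the resulting rule would flag files by the terms "patient" and "confidential".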

First claim

Opening claim text (preview).

What is claimed is:

1. A data storage system comprising: a computing system comprising one or more hardware processors programmed to: detect a file interaction event with respect to a file on a storage device; responsive to detecting the file interaction event with respect to the file, access an encryption rule, the encryption rule including a set of rules for determining whether to encrypt files based on a set of context conditions, the set of context conditions including a geographic context; determine a set of data tokens for the file, each of the data tokens comprising a portion of content of the file; apply the encryption rule to the set of data tokens to determine whether the file includes content designated for protection, wherein application of the encryption rule includes: determining whether one or more data tokens from the set of data tokens satisfy the encryption rule; and ceasing said determining whether the one or more data tokens from the set of data tokens satisfy the encryption rule upon identification of a threshold number of data tokens satisfying the encryption rule regardless of whether each data token from the set of data tokens has been processed to determine whether it satisfies the encryption rule; responsive to determining that the file includes content designated for protection: determine a geographic location of the storage device; determine whether the geographic location of the storage device satisfies the geographic context for encrypting the file; and responsive to the geographic location of the storage device satisfying the geographic context, encrypting the file; and responsive to an indication that the file does not include content designated for protection: include the file with a set of training files used to generate one or more encryption rules; and modify the encryption rule based at least in part on the set of training files and the file.

2. The data storage system of claim 1, wherein the one or more hardware processors are further programmed to determine the set of data tokens for the file by executing one or more natural language processing algorithms with respect to the file.

3. The data storage system of claim 1, wherein the one or more hardware processors are further programmed to identify the encryption rule from a set of encryption rules using a pattern recognition process with respect to the set of data tokens.

4. The data storage system of claim 1, wherein determining whether the geographic location of the storage device satisfies the geographic context comprises determining whether the storage device is located within a particular geographic area.

5. The data storage system of claim 1, wherein determining whether the geographic location of the storage device satisfies the geographic context comprises determining whether the storage device is external to a particular geographic area.

6. The data storage system of claim 1, wherein determining whether the geographic location of the storage device satisfies the geographic context comprises determining whether the storage device is located within a geographic area associated with a particular entity.

7. The data storage system of claim 1, wherein the computing system further comprises the storage device.

8. The data storage system of claim 1, wherein the one or more hardware processors are further programmed to: request confirmation from a user that the file includes content designated for protection; and responsive to receiving from the user the indication that the file does not include content designated for protection, include the file with the set of training files used to generate the one or more encryption rules.

9. The data storage system of claim 1, wherein detecting the file interaction event further comprises detecting a file context event.

10. The data storage system of claim 9, wherein the file context event comprises a change in the geographic location of the storage device.

11. A method of performing context-based encryption, the method comprising: detecting, by an encryption system comprising one or more hardware processors, a file interaction event with respect to a file; accessing, by the encryption system, an encryption rule the encryption rule including a set of rules for determining whether to encrypt files based at least in part on a set of context conditions, the set of context conditions including a geographic context; determining, by the encryption system, a set of data tokens for the file, each of the data tokens comprising a portion of content of the file; applying, by the encryption system, the encryption rule to the set of data tokens to determine whether the file includes content designated for protection, wherein applying the encryption rule includes: determining whether one or more data tokens from the set of data tokens satisfy the encryption rule; and ceasing to determine whether the one or more data tokens from the set of data tokens satisfy the encryption rule upon identification of a threshold number of data tokens satisfying the encryption rule regardless of whether each data token from the set of data tokens has been processed to determine whether it satisfies the encryption rule; responsive to determining that the file includes content designated for protection: determining a geographic location of the file; determining whether the geographic location of the file satisfies the geographic context for encrypting the file; and responsive to the geographic location of the file satisfying the geographic context, encrypting, by the encryption system, the file; and responsive to an indication that the file does not include content designated for protection: including the file with a set of training files used to generate one or more encryption rules; and modifying the encryption rule based at least in part on the set of training files and the file.

12. The method of claim 11, wherein determining the set of data tokens for the file comprises performing one or more natural language processing algorithms with respect to the file.

13. The method of claim 11, further comprising using a pattern recognition process with respect to the set of data tokens to identify the encryption rule from a set of encryption rules.

14. The method of claim 11, wherein determining whether the geographic location of the file satisfies the geographic context comprises determining whether the geographic location of the file with respect to a particular geographic area.

15. The method of claim 11, further comprising: requesting confirmation from a user that the file includes content designated for protection; and responsive to receiving from the user the indication that the file does not include content designated for protection, adding the file to the set of training files used to generate the one or more encryption rules.

16. The method of claim 11, wherein detecting the file interaction event further comprises detecting a file context event.

17. The method of claim 16, wherein the file context event comprises a change in the geographic location of the storage device.

18. The method of claim 11, wherein determining the geographic location of the file comprises determining a geographic location of a storage device that stores the file.
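The decision flow recited in claim 1 (tokenize the file, stop scanning once a threshold number of tokens satisfy the rule, then check the geographic context before encrypting) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the `EncryptionRule` fields, the whitespace tokenizer, the set-membership test for "satisfying" a rule, and the region labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class EncryptionRule:
    # Hypothetical rule shape; the claims do not define a concrete format.
    sensitive_terms: FrozenSet[str]  # tokens that "satisfy" the rule
    threshold: int                   # matches needed before scanning stops
    regions: FrozenSet[str]          # geographic context in which to encrypt


def tokenize(content: str) -> List[str]:
    """Assumed stand-in for the natural language processing step."""
    return content.lower().split()


def has_protected_content(
    tokens: Iterable[str], rule: EncryptionRule
) -> Tuple[bool, int]:
    """Return (protected?, tokens examined), stopping early at the threshold."""
    matches = 0
    examined = 0
    for token in tokens:
        examined += 1
        if token in rule.sensitive_terms:
            matches += 1
            if matches >= rule.threshold:
                # Cease checking even though later tokens are unprocessed.
                return True, examined
    return False, examined


def should_encrypt(content: str, storage_region: str, rule: EncryptionRule) -> bool:
    """Encrypt only if content is protected AND the location satisfies the rule."""
    protected, _ = has_protected_content(tokenize(content), rule)
    return protected and storage_region in rule.regions


rule = EncryptionRule(frozenset({"ssn", "salary"}), threshold=2, regions=frozenset({"EU"}))
text = "ssn salary report for the quarter ssn"
protected, examined = has_protected_content(tokenize(text), rule)
```

In this example the scan stops after the second token because the threshold of two matches has been reached, leaving the remaining five tokens unexamined. The feedback step in the claim (adding a file to the training set when it is indicated not to contain protected content, then modifying the rule) is not shown here.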

Assignees

Commvault Systems Inc

Inventors

Classifications

  • in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46)

  • Providing cryptographic facilities or services

  • wherein the data content is protected, e.g. by encrypting or encapsulating the payload

  • Security improvement

  • Encrypted data

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9727491B2 cover?
Data storage systems are disclosed for automatically generating encryption rules based on a set of training files that are known to include sensitive information. The system may use a number of heuristic algorithms to generate one or more encryption rules for determining whether a file includes sensitive information. Further, the system may apply the heuristic algorithms to the content of the f…
Who is the assignee on this patent?
Commvault Systems Inc
What technology area does this patent fall under?
Primary CPC classification G06F12/1408. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 8, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).