Adaptable multi-layered storage for deduplicating electronic messages

US11914554B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11914554-B2
Application number  US-202318103373-A
Country             US
Kind code           B2
Filing date         Jan 30, 2023
Priority date       Jun 28, 2019
Publication date    Feb 27, 2024
Grant date          Feb 27, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for improving data back-up, recovery, and search across different cloud-based applications, services, and platforms are described. A data management and storage system may direct compute and storage resources within a customer's cloud-based data storage account to back-up and restore data while the customer retains full control of their data. The data management and storage system may direct the compute and storage resources within the customer's cloud-based data storage account to generate and store secondary layers that are used for generating search indexes, to generate and store shared space layers and user specific layers to facilitate the deduplication of email attachments and text blocks, to perform a controlled restoration of email snapshots such that sensitive information (e.g., restricted keywords) located within stored snapshots remains protected, and to detect and preserve emails that were received or transmitted and then deleted between two consecutive snapshots.
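The deduplication idea in the abstract (one stored copy in a shared space layer, with each message's user specific layer holding only a pointer to it) can be illustrated with a minimal sketch. This is not the patented implementation: the class names, the `store_message` helper, and the use of a SHA-256 content hash as the pointer are all assumptions made for illustration.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class SharedSpaceLayer:
    """Holds a single copy of each deduplicated portion (hypothetical structure)."""

    blobs: dict[str, bytes] = field(default_factory=dict)

    def put(self, portion: bytes) -> str:
        # Assumption: the pointer is the SHA-256 digest of the content.
        pointer = hashlib.sha256(portion).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self.blobs.setdefault(pointer, portion)
        return pointer

    def get(self, pointer: str) -> bytes:
        return self.blobs[pointer]


@dataclass
class UserSpecificLayer:
    """Per-message record that stores pointers instead of the content itself."""

    message_id: str
    pointers: list[str] = field(default_factory=list)


def store_message(
    shared: SharedSpaceLayer, message_id: str, portions: list[bytes]
) -> UserSpecificLayer:
    """Store each portion (attachment or text block) and record its pointer."""
    layer = UserSpecificLayer(message_id)
    for portion in portions:
        layer.pointers.append(shared.put(portion))
    return layer


# Two messages carrying the same attachment share one stored copy.
shared = SharedSpaceLayer()
attachment = b"identical attachment bytes"
first = store_message(shared, "msg-1", [attachment])
second = store_message(shared, "msg-2", [attachment])
```

In this sketch both user specific layers end up with the same pointer, and the shared space layer holds the attachment once, regardless of how many messages reference it.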

First claim

Opening claim text (preview).

What is claimed is:

1. A method performed by a data management system, the method comprising: acquiring a first electronic message; storing a portion of the first electronic message within a shared space layer; storing, within a first user specific layer for the first electronic message, a first pointer to the portion of the first electronic message within the shared space layer; acquiring a second electronic message, wherein a portion of the second electronic message is identical to the portion of the first electronic message; and storing, within a second user specific layer for the second electronic message, a second pointer to the portion of the first electronic message within the shared space layer based at least in part on the portion of the second electronic message being identical to the portion of the first electronic message.

2. The method of claim 1, further comprising: determining a quantity of recipients of the first electronic message; and determining an aggregate file size for storing the portion of the first electronic message based at least in part on the quantity of recipients of the first electronic message.

3. The method of claim 2, further comprising: determining to store the portion of the first electronic message within the shared space layer based at least in part on the aggregate file size for the portion of the first electronic message.

4. The method of claim 2, wherein the portion of the first electronic message comprises an attachment associated with the first electronic message, and wherein the aggregate file size for the portion of the first electronic message is based at least in part on a file size of the attachment.

5. The method of claim 1, further comprising: determining an aggregate file size for storing a text block within the first electronic message; determining that the text block should be stored within the shared space layer based at least in part on the aggregate file size for the text block; storing the text block within the shared space layer based at least in part on determining that the text block should be stored within the shared space layer; and storing, within the shared space layer, a third pointer to the text block.

6. The method of claim 1, further comprising: determining that a second text block within the second electronic message is not identical to any text block stored within the shared space layer; determining that the second text block should be stored within the shared space layer based at least in part on the second text block being not identical to any text block stored within the shared space layer; storing the second text block within the shared space layer based at least in part on determining that the second text block should be stored within the shared space layer; and storing, within the shared space layer, a fourth pointer to the second text block.

7. The method of claim 6, further comprising: determining that the second electronic message was sent to at least a threshold quantity of email addresses, wherein determining that the second text block should be stored within the shared space layer is further based at least in part on the second electronic message being sent to at least the threshold quantity of email addresses.

8. The method of claim 1, wherein the portion of the first electronic message and the portion of the second electronic message comprise an attachment or text block that is common to both the first electronic message and the second electronic message.

9. The method of claim 8, wherein the first electronic message corresponding to the first user specific layer and the second electronic message corresponding to the second user specific layer are from a same email mailbox.

10. The method of claim 1, wherein the first electronic message corresponding to the first user specific layer is from a first email mailbox and the second electronic message corresponding to the second user specific layer is from a second email mailbox.

11. A data management system, comprising: memory; and one or more processors coupled with the memory, the one or more processors configured to cause the data management system to: acquire a first electronic message; store a portion of the first electronic message within a shared space layer; store, within a first user specific layer for the first electronic message, a first pointer to the portion of the first electronic message within the shared space layer; acquire a second electronic message, wherein a portion of the second electronic message is identical to the portion of the first electronic message; and store, within a second user specific layer for the second electronic message, a second pointer to the portion of the first electronic message within the shared space layer based at least in part on the portion of the second electronic message being identical to the portion of the first electronic message.

12. The data management system of claim 11, wherein the one or more processors are further configured to cause the data management system to: determine a quantity of recipients of the first electronic message; and determine an aggregate file size for storing the portion of the first electronic message based at least in part on the quantity of recipients of the first electronic message.

13. The data management system of claim 12, wherein the one or more processors are further configured to cause the data management system to: determine to store the portion of the first electronic message within the shared space layer based at least in part on the aggregate file size for the portion of the first electronic message.

14. The data management system of claim 12, wherein the portion of the first electronic message comprises an attachment associated with the first electronic message, and wherein the aggregate file size for the portion of the first electronic message is based at least in part on a file size of the attachment.

15. The data management system of claim 11, wherein the one or more processors are further configured to cause the data management system to: determine an aggregate file size for storing a text block within the first electronic message; determine that the text block should be stored within the shared space layer based at least in part on the aggregate file size for the text block; store the text block within the shared space layer based at least in part on determining that the text block should be stored within the shared space layer; and store, within the shared space layer, a third pointer to the text block.

16. The data management system of claim 11, wherein the one or more processors are further configured to cause the data management system to: determine that a second text block within the second electronic message is not identical to any text block stored within the shared space layer; determine that the second text block should be stored within the shared space layer based at least in part on the second text block being not identical to any text block stored within the shared space layer; store the second text block within the shared space layer based at least in part on determining that the second text block should be stored within the shared space layer; and store, within the shared space layer, a fourth pointer to the second text block.

17. The data management system of claim 16, wherein the one or more processors are further configured to cause the data management system to: determine that the second electronic message was sent to at least a threshold quantity of email addresses,
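The dependent claims tie the decision to use the shared space layer to an aggregate file size (based on recipient count and attachment size) and, for text blocks, to a threshold quantity of email addresses. A rough sketch of such a policy follows. The claims do not give a formula or any threshold values, so the multiplication, the default thresholds, and the names `Portion`, `aggregate_file_size`, and `should_store_in_shared_layer` are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Portion:
    """An attachment or text block of a message (hypothetical type)."""

    size_bytes: int       # size of one copy of the portion
    recipient_count: int  # number of recipients of the message


def aggregate_file_size(portion: Portion) -> int:
    # Assumption: storing one copy per recipient costs size * recipients.
    # The claims only say the value is "based at least in part on" both.
    return portion.size_bytes * portion.recipient_count


def should_store_in_shared_layer(
    portion: Portion,
    min_aggregate_bytes: int = 1_000_000,  # illustrative threshold, not from the patent
    min_recipients: int = 2,               # illustrative threshold, not from the patent
) -> bool:
    """Decide whether deduplicating into the shared space layer is worthwhile."""
    return (
        portion.recipient_count >= min_recipients
        and aggregate_file_size(portion) >= min_aggregate_bytes
    )


# A 500 kB attachment sent to four recipients qualifies under these assumed thresholds.
widely_sent = Portion(size_bytes=500_000, recipient_count=4)
decision = should_store_in_shared_layer(widely_sent)
```

Under these assumed defaults, a large attachment sent to a single recipient, or a tiny text block sent to a few recipients, would stay out of the shared space layer.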

Assignees

Rubrik Inc

Classifications

  • De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title

  • Hash-based (content-based indexing of textual data G06F16/31) · CPC title

  • Annexed information, e.g. attachments · CPC title

  • Mailbox-related aspects, e.g. synchronisation of mailboxes · CPC title

  • for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11914554B2 cover?
Methods and systems for improving data back-up, recovery, and search across different cloud-based applications, services, and platforms are described. A data management and storage system may direct compute and storage resources within a customer's cloud-based data storage account to back-up and restore data while the customer retains full control of their data. The data management and storage …
Who is the assignee on this patent?
Rubrik Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/1748. Mapped technology areas include Physics.
When was this patent published?
Published Feb 27, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).