Adaptable multi-layered storage for deduplicating electronic messages

US11157451B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11157451-B2
Application number  US-201916456983-A
Country             US
Kind code           B2
Filing date         Jun 28, 2019
Priority date       Jun 28, 2019
Publication date    Oct 26, 2021
Grant date          Oct 26, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for improving data back-up, recovery, and search across different cloud-based applications, services, and platforms are described. A data management and storage system may direct compute and storage resources within a customer's cloud-based data storage account to back-up and restore data while the customer retains full control of their data. The data management and storage system may direct the compute and storage resources within the customer's cloud-based data storage account to generate and store secondary layers that are used for generating search indexes, to generate and store shared space layers and user specific layers to facilitate the deduplication of email attachments and text blocks, to perform a controlled restoration of email snapshots such that sensitive information (e.g., restricted keywords) located within stored snapshots remains protected, and to detect and preserve emails that were received or transmitted and then deleted between two consecutive snapshots.

First claim

Opening claim text (preview).

What is claimed is:

1. A method for operating a data management system, comprising: acquiring an electronic message; identifying an attachment from the electronic message; determining a number of recipients of the electronic message; determining an aggregate file size for the attachment based on the number of recipients of the electronic message; detecting that the attachment should be stored within a shared space layer based on the aggregate file size for the attachment; storing the attachment within the shared space layer; generating and storing a user specific layer for the electronic message with a pointer to the attachment within the shared space layer; acquiring a second electronic message; detecting that a second attachment from the second electronic message is identical to the attachment stored within the shared space layer; and generating and storing a second user specific layer for the second electronic message with a third pointer to the attachment within the shared space layer.

2. The method of claim 1, wherein: the electronic message is acquired from a first email mailbox; and the second electronic message is acquired from the first email mailbox.

3. The method of claim 1, wherein: the electronic message is associated with a first email mailbox; and the second electronic message is associated with a second email mailbox different from the first email mailbox.

4. The method of claim 1, further comprising: identifying a text block within the electronic message; determining an aggregate data size for the text block based on the number of recipients of the electronic message; detecting that the text block should be stored within the shared space layer based on the aggregate data size for the text block; storing the text block within the shared space layer; and updating the user specific layer for the electronic message with a second pointer to the text block within the shared space layer.

5. The method of claim 1, further comprising: detecting that a second text block within the second electronic message is not identical to any text block stored within the shared space layer; storing the second text block within the shared space layer; and updating the second user specific layer for the second electronic message with a fourth pointer to the second text block within the shared space layer.

6. The method of claim 5, further comprising: detecting that the second electronic message was sent to more than a threshold a number of email addresses prior to storing the second text block within the shared space layer; and storing the second text block within the shared space layer in response to detecting that the second electronic message was sent to more than the threshold number of email addresses.

7. The method of claim 1, wherein: the attachment comprises an image file.

8. The method of claim 1, wherein: the determining the aggregate file size for the attachment includes multiplying a file size of the attachment by the number of recipients of the electronic message.

9. The method of claim 1, wherein: the determining the number of recipients of the electronic message includes identifying a number of unique email addresses for the electronic message.

10. The method of claim 1, wherein: the detecting that the second attachment from the second electronic message is identical to the attachment stored within the shared space layer includes comparing one or more hash values generated from the second attachment with one or more other hash values generated from the attachment stored within the shared space layer.

11. A data management system, comprising: a memory configured to store an electronic message; and one or more processors in communication with the memory configured to acquire the electronic message and identify an attachment from the electronic message, the one or more processors configured to determine a number of recipients of the electronic message and determine an aggregate file size for the attachment based on the number of recipients of the electronic message, the one or more processors configured to detect that the attachment should be stored within a shared space layer based on the aggregate file size for the attachment and store the attachment within the shared space layer, the one or more processors configured to generate a user specific layer for the electronic message with a pointer to the attachment within the shared space layer, the one or more processors configured to acquire a second electronic message and detect that a second attachment from the second electronic message is identical to the attachment stored within the shared space layer, the one or more processors configured to generate a second user specific layer for the second electronic message with a third pointer to the attachment within the shared space layer, the second user specific layer is stored using a second type of data storage and the shared space layer is stored using a first type of data storage different from the second type of data storage.

12. The data management system of claim 11, wherein: the electronic message is acquired from a first email mailbox; and the second electronic message is acquired from the first email mailbox.

13. The data management system of claim 11, wherein: the electronic message is acquired from a first email mailbox; and the second electronic message is acquired from a second email mailbox different from the first email mailbox.

14. The data management system of claim 11, wherein: the one or more processors configured to identify a text block within the electronic message and determine an aggregate data size for the text block based on the number of recipients of the electronic message, the one or more processors configured to detect that the text block should be stored within the shared space layer based on the aggregate data size for the text block and store the text block within the shared space layer, the one or more processors configured to update the user specific layer for the electronic message with a second pointer to the text block within the shared space layer.

15. The data management system of claim 11, wherein: the one or more processors configured to detect that a second text block within the second electronic message is not identical to any text block stored within the shared space layer and store the second text block within the shared space layer, the one or more processors configured to update the second user specific layer for the second electronic message with a fourth pointer to the second text block within the shared space layer.

16. The data management system of claim 15, wherein: the one or more processors configured to detect that the second electronic message was sent to more than a threshold a number of email addresses prior to the second text block being stored within the shared space layer, the one or more processors configured to store the second text block within the shared space layer in response to detection that the second electronic message was sent to more than the threshold number of email addresses.

17. The data management system of claim 11, wherein: the attachment comprises an image file; the first type of data storage comprises blob storage; and the second type of data storage comprises block storage.

18. The data management system of claim 11, wherein: the one or more processors configured to determine the aggregate file size for the attachment via multiplication of a file size of the attachment with the number of r
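
The flow recited in claims 1 and 8 to 10 can be illustrated with a minimal Python sketch. It is not the patented implementation. The class names, the function names, the SHA-256 choice, the lower-casing of addresses, and the 1,000,000-byte cutoff are all assumptions made for illustration; the claims only state that the decision is "based on" the aggregate file size and that identity is checked by comparing hash values.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical cutoff: the claims do not give a numeric threshold.
SHARED_THRESHOLD_BYTES = 1_000_000


def digest(data: bytes) -> str:
    """Hash used to compare attachments (claim 10); SHA-256 is an assumption."""
    return hashlib.sha256(data).hexdigest()


def count_recipients(addresses: list[str]) -> int:
    """Number of unique email addresses (claim 9); case-folding is an assumption."""
    return len({a.strip().lower() for a in addresses})


def aggregate_size(file_size: int, recipients: int) -> int:
    """File size multiplied by the number of recipients (claim 8)."""
    return file_size * recipients


@dataclass
class SharedSpaceLayer:
    """Holds one copy of each shared attachment, keyed by its hash."""
    blobs: dict[str, bytes] = field(default_factory=dict)

    def put(self, data: bytes) -> str:
        key = digest(data)
        self.blobs.setdefault(key, data)  # store only the first copy
        return key


@dataclass
class UserSpecificLayer:
    """Per-message layer: pointers into the shared layer plus inline data."""
    message_id: str
    pointers: list[str] = field(default_factory=list)
    inline: list[bytes] = field(default_factory=list)


def store_message(message_id: str, recipients: list[str],
                  attachments: list[bytes],
                  shared: SharedSpaceLayer) -> UserSpecificLayer:
    layer = UserSpecificLayer(message_id)
    n = count_recipients(recipients)
    for data in attachments:
        already_shared = digest(data) in shared.blobs
        large_enough = aggregate_size(len(data), n) >= SHARED_THRESHOLD_BYTES
        if already_shared or large_enough:
            # Keep a pointer; the bytes live once in the shared space layer.
            layer.pointers.append(shared.put(data))
        else:
            # Small, unseen attachment: keep it with the message itself.
            layer.inline.append(data)
    return layer
```

Under these assumptions, a 300,000-byte attachment sent to four unique addresses has an aggregate size of 1,200,000 bytes and is placed in the shared layer. A later message carrying the same bytes gets only a pointer, even if it has a single recipient.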

Assignees

Rubrik Inc

Inventors

Classifications

  • Establishing a time schedule for servicing the requests · CPC title

  • Mailbox-related aspects, e.g. synchronisation of mailboxes · CPC title

  • for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS] · CPC title

  • Annexed information, e.g. attachments · CPC title

  • De-duplication implemented within the file system, e.g. based on file segments (de-duplication techniques in storage systems for the management of data blocks G06F3/0641) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11157451B2 cover?
It describes methods and systems for improving data back-up, recovery, and search across cloud-based applications, services, and platforms. The claims focus on deduplicating email attachments and text blocks: based on its aggregate size across the message's recipients, an attachment is stored once in a shared space layer, and each message's user specific layer holds a pointer to it.
Who is the assignee on this patent?
Rubrik Inc
What technology area does this patent fall under?
Primary CPC classification H04L67/1097. Mapped technology areas include Electricity.
When was this patent published?
Publication date Oct 26, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).