Space-efficient mail storing and archiving based on communication structure

US9602452B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9602452-B2
Application number: US-201414485328-A
Country: US
Kind code: B2
Filing date: Sep 12, 2014
Priority date: Feb 4, 2005
Publication date: Mar 21, 2017
Grant date: Mar 21, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention relates to electronic mail. In particular, it relates to a method and system for processing electronic mail, wherein mails are stored in a space efficient way by removing redundancy from the content. Prior art is known for doing a limited version of this on a mail client. In order to provide a method and system which is adequate for server operation it is proposed to perform the steps of: splitting the content of an incoming e-mail into elementary mail segments by parsing and optionally normalizing the e-mail body based on a regular grammar with transduction rules; computing a unique ID for each elementary mail segment; storing the normalized or original form of an elementary mail segment together with a link to its respective parent elementary mail segment in a table in a way retrievable by said unique ID; and reconstructing an original e-mail from a concatenation of a respective sequence of said elementary mail segments wherein the unique ID for each elementary mail segment is used as a key for accessing said table and retrieving the respective elementary mail segment.
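The four steps in the abstract (split, compute an ID, store with a parent link, reconstruct) can be illustrated with a minimal sketch. It is not the patented implementation: a single regular expression stands in for the regular grammar with transduction rules, the `-----Original Message-----` separator, the SHA-256 hash, and all function names are assumptions made for illustration, and normalization is reduced to trimming whitespace.

```python
import hashlib
import re

# Hypothetical separator: a quoted-reply marker on a line of its own.
SEPARATOR = re.compile(r"^-----Original Message-----$", re.MULTILINE)
JOINER = "\n-----Original Message-----\n"


def split_segments(body):
    """Split a message body into elementary segments, newest first."""
    return [part.strip() for part in SEPARATOR.split(body) if part.strip()]


def segment_id(text, parent_id):
    """Assumed ID scheme: hash of the parent's ID plus the segment text."""
    digest = hashlib.sha256()
    digest.update((parent_id or "").encode("utf-8"))
    digest.update(b"\x00")
    digest.update(text.encode("utf-8"))
    return digest.hexdigest()


def store_message(body, table):
    """Store each segment once with a link to its parent; return the newest ID."""
    parent_id = None
    # Walk from the oldest quoted segment up to the newest reply.
    for text in reversed(split_segments(body)):
        sid = segment_id(text, parent_id)
        table.setdefault(sid, (text, parent_id))  # already-known segments are not stored again
        parent_id = sid
    return parent_id


def reconstruct(head_id, table):
    """Rebuild a body by following parent links from the newest segment."""
    parts = []
    sid = head_id
    while sid is not None:
        text, sid = table[sid]
        parts.append(text)
    return JOINER.join(parts)
```

Storing a message and then a reply that quotes it adds only one new table entry for the reply, because the quoted segments hash to IDs that are already present. In this sketch, reconstruction is exact only for bodies that are already in the trimmed form.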

First claim

Opening claim text (preview).

What is claimed is:

1. A method for processing e-mail messages in an electronic mail communication system comprising at least one mail server and a plurality of mail clients comprising: receiving an incoming message e-mail message by the at least one mail server, wherein the incoming e-mail message includes at least a message body; splitting up the message body of the incoming e-mail message before forwarding, by identifying portions within the message body that represent elements of an e-mail thread using a combination of support from a messaging system, textual patterns and heuristics, wherein the splitting up yields a plurality of elementary mail segments; computing a unique ID for each of the plurality of elementary mail segments, wherein the unique ID is used as an index for accessing a redundancy-reduced mail store table, and wherein the unique ID for each of the plurality of elementary mail segments is computed from discrete components of each of the plurality of elementary mail segments; and storing each of the plurality of elementary mail segments in the redundancy-reduced mail store table, together with a link to a respective particular parent elementary mail segment for each of the plurality of elementary mail segments, defining one or more ordered sequences of interrelated elementary mail segments, wherein the incoming e-mail message in its original form is not stored, and wherein each of the plurality of elementary mail segments are organized by a plurality of common e-mail threads and stored as nodes in the redundancy-reduced mail store table, each node consisting of textual content of a respective elementary mail segment, a pointer to another node and a list of attachment IDs, and wherein each of the stored plurality of elementary mail segments are unique.

2. The method of claim 1, further comprising: reconstructing the incoming e-mail message from a concatenation of the plurality of elementary mail segments utilizing the unique ID to traverse nodes associated with one of the plurality of common e-mail threads by the at least one mail server; and forwarding a reconstructed concatenation to at least one of the plurality of mail clients.

3. The method of claim 1, wherein the splitting up based on a set of rules is created based on an existing set of representative e-mails that identify text patterns based on an assumption that representations of an e-mail header in the message body can be identified by a text pattern.

4. The method of claim 1, wherein the unique ID is computed based on a combination of components selected from the group consisting of: e-mail header information, date information, and a message body information.

5. The method of claim 1, wherein the unique ID is computed based on rich text components selected from the group consisting of: color coding, fonts, and font styles.

6. The method of claim 1, wherein attachments are not considered in the computing the unique ID.

7. The method of claim 1, wherein attachments of the incoming e-mail message further comprise an attachment ID corresponding with an actual attachment content.

8. A computing device program for processing e-mail messages in an electronic mail communication system comprising at least one mail server and a plurality of mail clients comprising: a non-transitory computer readable medium having computer executable instructions stored thereon for execution by the computer, the computer executable instructions comprising: first programmatic instructions for receiving an incoming message e-mail message by the at least one mail server, wherein the incoming e-mail message includes at least a message body; a second programmatic for splitting up the message body of the incoming e-mail message before forwarding, by identifying portions within the message body that represent elements of an e-mail thread using a combination of support from a messaging system, textual patterns and heuristics, wherein the splitting up yields a plurality of elementary mail segments; third programmatic instructions for computing a unique ID for each of the plurality of elementary mail segments, wherein the unique ID is used as an index for accessing a redundancy-reduced mail store table, and wherein the unique ID for each of the plurality of elementary mail segments is computed from discrete components of each of the plurality of elementary mail segments; and fourth programmatic instructions for storing each of the plurality of elementary mail segments in the redundancy-reduced mail store table, together with a link to a respective particular parent elementary mail segment for each of the plurality of elementary mail segments, defining one or more ordered sequences of interrelated elementary mail segments, wherein the incoming e-mail message in its original form is not stored, and wherein each of the plurality of elementary mail segments are organized by a plurality of common e-mail threads and stored as nodes in the redundancy-reduced mail store table, each node consisting of textual content of a respective elementary mail segment, a pointer to another node and a list of attachment IDs, and wherein each of the stored plurality of elementary mail segments are unique.

9. The computing device program product of claim 8, further comprising: fifth programmatic instructions for reconstructing the incoming e-mail message from a concatenation of the plurality of elementary mail segments utilizing the unique ID to traverse nodes associated with one of the plurality of common e-mail threads by the at least one mail server; and sixth programmatic instructions for forwarding a reconstructed concatenation to at least one of the plurality of mail clients.

10. The computing device program product of claim 8, wherein the second programmatic instructions for splitting up based on a set of rules is created based on an existing set of representative e-mails that identify text patterns based on an assumption that representations of an e-mail header in the message body can be identified by a text pattern.

11. The computing device program product of claim 8, wherein the unique ID is computed based on a combination of components selected from the group consisting of: e-mail header information, date information, and a message body information.

12. The computing device program product of claim 8, wherein the unique ID is computed based on rich text components selected from the group consisting of: color coding, fonts, and font styles.

13. The computing device program product of claim 8, wherein attachments are not considered in the computing the unique ID.

14. The computing device program product of claim 8, wherein attachments of the incoming e-mail message further comprise an attachment ID corresponding with an actual attachment content.

15. A system of computer hardware for processing e-mail messages in an electronic mail communication system comprising: at least one mail server for, receiving an incoming message e-mail message by the at least one mail server, wherein the incoming e-mail message includes at least a message body; splitting up the message body of the incoming e-mail message before forwarding, by identifying portions within the message body that represent elements of an e-mail thread using a combination of support from a messaging system, textual patterns and heuristics, wherein the splitting up yields a plurality of elementary mail segments; computing a unique ID for each of the plurality of elementary mail segments, wherein the unique ID is used as an index for accessing a redundancy-reduced mail store table, and wherein the unique ID for each of
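The node layout recited in the independent claims (textual content, a pointer to another node, and a list of attachment IDs), together with the dependent claims on ID computation, might be modeled roughly as below. This is a sketch under stated assumptions, not the claimed implementation: the `Node` class, the choice of sender, date, and body as the "discrete components", the unit-separator join, and the use of SHA-256 for both segment and attachment IDs are all hypothetical. Attachments are left out of the segment ID, as in claims 6 and 13.

```python
import hashlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    text: str                  # textual content of the elementary mail segment
    parent: Optional[str]      # ID of the parent node, None for a thread root
    attachment_ids: List[str] = field(default_factory=list)


def compute_id(sender, date, text):
    """Hash discrete components of a segment; attachments are deliberately excluded."""
    joined = "\x1f".join([sender, date, text])
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def attachment_id(content):
    """Assumed scheme: an attachment ID derived from the attachment bytes."""
    return hashlib.sha256(content).hexdigest()


def put_segment(store, sender, date, text, parent=None, attachments=()):
    """Insert a segment as a node unless an identical one is already stored."""
    node_id = compute_id(sender, date, text)
    node = store.setdefault(node_id, Node(text, parent))
    for content in attachments:
        aid = attachment_id(content)
        if aid not in node.attachment_ids:
            node.attachment_ids.append(aid)
    return node_id
```

Because the ID depends only on the segment's own components, submitting the same segment twice resolves to the same node, which is one way to keep every stored segment unique.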

Assignees

  • IBM

Inventors

Classifications

  • G06Q10/107 · Primary

    Computer-aided management of electronic mailing [e-mailing] · CPC title

  • H04L51/066 · Primary

    Format adaptation, e.g. format conversion or compression · CPC title

  • Content adaptation, e.g. replacement of unsuitable content · CPC title

  • Electricity · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9602452B2 cover?
The present invention relates to electronic mail. In particular, it relates to a method and system for processing electronic mail, wherein mails are stored in a space efficient way by removing redundancy from the content. Prior art is known for doing a limited version of this on a mail client. In order to provide a method and system which is adequate for server operation it is proposed to…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06Q10/107. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 21, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).