Apparatus and methods for classifying senders of unsolicited bulk emails

US9710759B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9710759-B2
Application number: US-68624010-A
Country: US
Kind code: B2
Filing date: Jan 12, 2010
Priority date: Jan 12, 2010
Publication date: Jul 18, 2017
Grant date: Jul 18, 2017

Abstract

Official abstract text for this publication.

In accordance with one aspect, methods and apparatus facilitate the filtering of unsolicited bulk electronic mail (email) sent from spammers. A plurality of recipient patterns for a plurality of emails from known spammers is logged. A plurality of recipient patterns for a plurality of emails from known non-spammers is also logged. A probabilistic model for predicting whether an unknown sender identity is a spammer is generated or modified based on the logged recipient patterns for the emails from known spammers and known non-spammers.
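
The abstract does not say which probabilistic model is used. As a minimal sketch only, the following assumes a Bernoulli naive Bayes classifier with Laplace smoothing, trained on logged recipient patterns from known spammers and known non-spammers. The feature names (`recipients_exchanged_messages`, `many_recipients`) and the function names `train` and `spam_probability` are hypothetical and do not come from the patent.

```python
import math

def train(patterns, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model (an assumed model choice).

    patterns: list of dicts mapping a feature name to True/False
    labels:   list of bools, True for a known spammer
    alpha:    Laplace smoothing constant
    """
    features = sorted({f for p in patterns for f in p})
    counts = {True: dict.fromkeys(features, 0.0),
              False: dict.fromkeys(features, 0.0)}
    totals = {True: 0, False: 0}
    for pattern, label in zip(patterns, labels):
        totals[label] += 1
        for f in features:
            if pattern.get(f, False):
                counts[label][f] += 1
    n = len(labels)
    model = {"features": features, "prior": {}, "prob": {True: {}, False: {}}}
    for label in (True, False):
        model["prior"][label] = (totals[label] + alpha) / (n + 2 * alpha)
        for f in features:
            model["prob"][label][f] = (
                (counts[label][f] + alpha) / (totals[label] + 2 * alpha)
            )
    return model

def spam_probability(model, pattern):
    """Return the posterior probability that the sender is a spammer."""
    log_score = {}
    for label in (True, False):
        score = math.log(model["prior"][label])
        for f in model["features"]:
            p = model["prob"][label][f]
            score += math.log(p if pattern.get(f, False) else 1.0 - p)
        log_score[label] = score
    # Normalize in log space to avoid underflow.
    top = max(log_score.values())
    weights = {label: math.exp(s - top) for label, s in log_score.items()}
    return weights[True] / (weights[True] + weights[False])

# Hypothetical logged recipient patterns (illustrative data only).
spammer_patterns = [
    {"recipients_exchanged_messages": False, "many_recipients": True}
] * 3
non_spammer_patterns = [
    {"recipients_exchanged_messages": True, "many_recipients": False}
] * 3

model = train(spammer_patterns + non_spammer_patterns,
              [True] * 3 + [False] * 3)
unknown_sender = {"recipients_exchanged_messages": False, "many_recipients": True}
print(spam_probability(model, unknown_sender))
```

Any classifier that outputs a probability could stand in for naive Bayes here; it is used only because it is short enough to show in full.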

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method of filtering unsolicited bulk electronic mail (email), comprising: receiving a plurality of emails associated with a particular sender identifier (ID), the plurality of emails associated with the particular sender ID including emails sent by the particular sender ID; determining whether the emails sent by the particular sender ID comprise unsolicited bulk email based, at least in part, upon one or more sender characteristics, which are associated with the particular sender ID, using a probabilistic model, wherein the sender characteristics of the particular sender ID includes a particular pattern for messages associated with the particular sender ID, wherein the particular pattern includes identification of recipients to which the particular sender ID sends emails, wherein the particular pattern indicates whether any of the recipients of the emails sent by the particular sender ID have mutually exchanged messages with one another, wherein the particular pattern indicates whether a first one of the recipients previously sent a message to a second one of the recipients; and inhibiting the emails sent by the particular sender ID from reaching recipients of such emails if the emails sent by the particular sender ID are determined to be unsolicited bulk emails.

2. The method of claim 1, wherein the probabilistic model is generated from a training process that is based on a training set of sender characteristics that have been associated with indicators for defining whether specific sender IDs are associated with unsolicited bulk emails.

3. The method of claim 1, wherein the particular pattern further indicates a geographic distance between a sender associated with the sender ID and the recipients.

4. The method of claim 1, wherein the mutually exchanged messages comprise instant messages.

5. A computer implemented method of facilitating the filtering of unsolicited bulk electronic mail (email), comprising: logging a plurality of recipient patterns for known spammers based, at least in part, on a plurality of emails associated with the known spammers, the plurality of emails associated with the known spammers including emails sent by the known spammers; generating or modifying a probabilistic model for predicting whether an unknown sender identity is a spammer based, at least in part, on the logged recipient patterns for the known spammers, wherein the logged recipient patterns for each of the known spammers includes identification of recipients to which the known identified spammer sends emails; wherein the logged recipient patterns for each of the known spammers indicate whether any of the recipients of the emails sent by the corresponding one of the known spammers have mutually exchanged messages with one another; and determining whether a particular sender identity is a spammer based, at least in part, upon applying the probabilistic model to logged recipient patterns for the particular sender identity, wherein the logged recipient patterns for the particular sender identity indicate whether any of the recipients of emails sent by the particular sender identity have mutually exchanged messages with one another; wherein one of the logged recipient patterns for one of the known spammers indicates whether a first one of the recipients of the emails sent by the one of the known spammers previously sent a message to a second one of the recipients of the emails sent by the one of the known spammers, and wherein one of the logged recipient patterns for the particular sender identity indicates whether a first one of the recipients of the emails sent by the particular sender identity previously sent a message to a second one of the recipients of the emails sent by the particular sender identity.

6. The method of claim 5, wherein the unknown sender identity is a sender Internet Protocol (IP) address.

7. The method of claim 5, wherein the known spammers have been identified, at least in part, by a plurality of recipients of the emails who identify such received emails as spam.

8. The method of claim 5, wherein each combination of recipients is associated with a score, and wherein the model is configured to determine a total score for each recipient pattern and predict whether each sender is a spammer based, at least in part, on such total score for the recipient pattern.

9. The method of claim 5, wherein the logged recipient patterns further comprise at least one of a maximum frequency of emails sent by the particular sender ID or a minimum frequency of emails sent by the particular sender ID.

10. The method of claim 5, further comprising: logging a plurality of recipient patterns for known non-spammers based, at least in part, on a plurality of emails associated with the known non-spammers, the plurality of emails associated with the known non-spammers including emails sent by the known non-spammers; wherein generating or modifying the probabilistic model for predicting whether an unknown sender identity is a spammer is performed further based, at least in part, on the logged recipient patterns for the known non-spammers.

11. The method of claim 5, further comprising: logging recipient patterns for the particular sender identity based, at least in part, on the emails sent by the particular sender identity.

12. An apparatus comprising at least a processor and a memory, wherein the processor and/or memory are configured to perform the following operations: logging a plurality of recipient patterns for known spammers based, at least in part, on a plurality of emails associated with the known spammers, the plurality of emails including emails sent by the known spammers; generating or modifying a probabilistic model for predicting whether an unknown sender identity is a spammer based, at least in part, on the logged recipient patterns for the known spammers, wherein the logged recipient patterns for each of the known spammers includes identification of recipients to which the known identified spammer sends emails, wherein the logged recipient patterns for each one of the known spammers indicates whether any of the recipients of the emails sent by the one of the known spammers have mutually exchanged messages with one another; and determining a likelihood that a particular sender identity is a spammer based, at least in part, upon applying the probabilistic model to logged recipient patterns for the particular sender identity, wherein the logged recipient patterns for the particular sender identity indicate whether any of the recipients of emails sent by the particular sender identity have mutually exchanged messages with one another; wherein one of the logged recipient patterns for one of the known spammers indicates whether a first one of the recipients of the emails sent by the one of the known spammers previously sent a message to a second one of the recipients of the emails sent by the one of the known spammers, and wherein one of the logged recipient patterns for the particular sender identity indicates whether a first one of the recipients of the emails sent by the particular sender identity previously sent a message to a second one of the recipients of the emails sent by the particular sender identity.

13. The apparatus of claim 12, wherein the known spammers have been identified, at least in part, by a plurality of recipients of the emails who identify such received emails as spam.

14. The apparatus of claim 12, wherein the processor and/or memory are further configured for using the model to predict a likelihood of an unknown sender being a spammer based on the unknown sender's r
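
The recipient pattern recited in the claims (who a sender mails, and whether those recipients have previously messaged one another) can be illustrated with a short sketch. The claims do not define a data layout, so the following makes several assumptions: prior traffic is represented as a set of `(from_id, to_id)` pairs, "mutually exchanged" is read as messages having flowed in both directions between a pair, and the function name `recipient_pattern` and its output keys are hypothetical.

```python
from itertools import combinations

def recipient_pattern(recipients, prior_messages):
    """Summarize how one sender's recipients relate to each other.

    recipients:     iterable of recipient IDs this sender has mailed
    prior_messages: set of (from_id, to_id) pairs observed earlier
                    (assumed representation of the message log)
    """
    unique = sorted(set(recipients))
    pairs = list(combinations(unique, 2))
    # A pair has prior contact if either recipient messaged the other.
    contacted = sum(
        1 for a, b in pairs
        if (a, b) in prior_messages or (b, a) in prior_messages
    )
    # A pair has a mutual exchange if messages went in both directions.
    mutual = sum(
        1 for a, b in pairs
        if (a, b) in prior_messages and (b, a) in prior_messages
    )
    return {
        "num_recipients": len(unique),
        "any_prior_contact": contacted > 0,
        "any_mutual_exchange": mutual > 0,
        "contact_fraction": contacted / len(pairs) if pairs else 0.0,
    }

# Illustrative data only: alice and bob have written to each other,
# and carol once wrote to alice.
prior = {("alice", "bob"), ("bob", "alice"), ("carol", "alice")}

print(recipient_pattern(["alice", "bob", "carol"], prior))
print(recipient_pattern(["x", "y", "z"], prior))
```

In this sketch a sender whose recipients have never contacted one another yields a `contact_fraction` of zero. That is one plausible signal a model could weigh, not a rule stated in the claims.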

Classifications

  • Computer-aided management of electronic mailing [e-mailing] · CPC title

  • G06N99/005 · Primary

    Physics · mapped topic

  • G06N20/10 · Primary

    using kernel methods, e.g. support vector machines [SVM] · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9710759B2 cover?
In accordance with one aspect, methods and apparatus facilitate the filtering of unsolicited bulk electronic mail (email) sent from spammers. A plurality of recipient patterns for a plurality of emails from known spammers is logged. A plurality of recipient patterns for a plurality of emails from known non-spammers is also logged. A probabilistic model for predicting whether an unknown sender i…
Who is the assignee on this patent?
Dasgupta Anirban, Weinberger Kilian Quirin, Koren Yehuda, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06N99/005. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 18, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).