Correlating web and email attributes to detect spam

US8938508B1 · US · B1

Patent metadata
Field               Value
Publication number  US-8938508-B1
Application number  US-84155910-A
Country             US
Kind code           B1
Filing date         Jul 22, 2010
Priority date       Jul 22, 2010
Publication date    Jan 20, 2015
Grant date          Jan 20, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer correlates web and email attributes to detect spam. A security module on a client collects attributes of a web site to which an email address was submitted and attributes of an email message sent to the email address that was previously submitted. The security module analyzes the attributes of the web site and the email message to determine whether the email message was sent to the email address responsive to the submission of the email address to the web site. Based on the analysis, the security module determines whether the email message is spam. A machine learning module on a security server establishes training data describing the attributes of the web site to which email addresses were submitted and attributes of legitimate emails received in response to the address submissions. The machine learning module generates an attributes classifier for the security module for spam detection.
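The correlation step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Attributes` record, the `WEIGHTS` values, and the `threshold` are all hypothetical, and the weights are fixed by hand here, whereas the abstract says a machine learning module generates the classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attributes:
    """Hypothetical attribute record for a web server or a mail server."""
    domain: str
    ip_prefix: str
    country: str
    registrar: str


# Illustrative weights only (assumption); a trained classifier would learn these.
WEIGHTS = {"domain": 0.4, "ip_prefix": 0.2, "country": 0.1, "registrar": 0.3}


def correlation(site: Attributes, mail: Attributes) -> float:
    """Sum the weights of the attributes on which site and message agree."""
    return sum(
        weight
        for name, weight in WEIGHTS.items()
        if getattr(site, name) == getattr(mail, name)
    )


def is_probably_spam(site: Attributes, mail: Attributes, threshold: float = 0.5) -> bool:
    """A stronger correlation lowers the likelihood of spam."""
    return correlation(site, mail) < threshold
```

A message whose sending infrastructure shares the domain, country, and registrar of the site where the address was submitted scores high and is treated as an expected reply; a message that shares none of them scores zero and is flagged.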

First claim

Opening claim text (preview).

What is claimed is:

1. A method of detecting spam email messages comprising: using a computer to perform steps comprising: collecting attributes of a web site to which an email address was submitted; collecting attributes of an email message sent to the email address; identifying a degree of correlation between at least one of the collected attributes of the web site and at least one of the collected attributes of the email message, the identifying comprising using a classifier to analyze the at least one collected attribute of the web site and the at least one collected attribute of the email message, wherein the analysis is based at least in part on a plurality of weights describing different values that represent the relative importances of the collected attributes of the web site and email message, wherein the classifier is generated by training on training data describing attributes of training web sites to which email addresses were submitted and legitimate emails received responsive to the submissions of the email addresses to the training web sites, generating the classifier comprising: generating feature vectors from the training data, the feature vectors having features describing the attributes of the training web sites and having features describing the attributes of the legitimate emails received responsive to the submissions of the email addresses to the training web sites; and training the classifier using the feature vectors, the training causing the classifier to learn weights describing relative importances of the features in recognizing when email messages are received in response to email addresses submitted to web sites; and determining whether the email message is spam responsive at least in part to the degree of correlation, a stronger correlation indicating a decreased likelihood that the email message is spam.

2. The method of claim 1, wherein collecting attributes of the web site to which an email address was submitted comprises: collecting one or more primary attributes describing the web site; and collecting one or more secondary attributes derived from the primary attributes.

3. The method of claim 2, wherein the primary attributes describing the web site comprise at least one of an Internet Protocol (IP) address and a Domain Name System (DNS) name of a web server hosting the web site.

4. The method of claim 2, wherein the secondary attributes derived from the primary attributes comprise at least one of geolocation data describing a geographic location of a web server hosting the web site, whether an IP address of the web server is known to be associated with an Internet Service Provider (ISP), information about a domain name registrar at which the DNS name for the web server is registered, and information about a registrant of the DNS name.

5. The method of claim 1, wherein collecting attributes of an email message sent to the email address comprises: collecting one or more primary attributes describing the email message; and collecting one or more secondary attributes derived from the primary attributes.

6. The method of claim 5, wherein the primary attributes describing the email message comprise at least one of a DNS name of a “from” address of the email message, an IP address of a mail server involved in sending the email message, a DNS name of the mail server involved in sending the email message, and attributes of a mail session involved in transmitting the email message.

7. The method of claim 5, wherein the secondary attributes derived from the primary attributes comprise at least one of geolocation data describing a geographic location of the mail server involved in sending the email message, whether the IP address of the mail server is known to be associated with an ISP, information about a domain name registrar at which the DNS of the web server is registered, and information about a registrant of the DNS name.

8. A non-transitory computer-readable storage medium storing executable computer program instructions for detecting spam email messages, the computer program instructions comprising instructions for: collecting attributes of a web site to which an email address was submitted; collecting attributes of an email message sent to the email address; identifying a degree of correlation between at least one of the collected attributes of the web site and at least one of the collected attributes of the email message, the identifying comprising using a classifier to analyze the at least one collected attribute of the web site and the at least one collected attribute of the email message, wherein the analysis is based at least in part on a plurality of weights describing different values that represent the relative importances of the collected attributes of the web site and email message, wherein the classifier is generated by training on training data describing attributes of training web sites to which email addresses were submitted and legitimate emails received responsive to the submissions of the email addresses to the training web sites, generating the classifier comprising: generating feature vectors from the training data, the feature vectors having features describing the attributes of the training web sites and having features describing the attributes of the legitimate emails received responsive to the submissions of the email addresses to the training web sites; and training the classifier using the feature vectors, the training causing the classifier to learn weights describing relative importances of the features in recognizing when email messages are received in response to email addresses submitted to web sites; and determining whether the email message is spam responsive at least in part to the degree of correlation, a stronger correlation indicating a decreased likelihood that the email message is spam.

9. The computer-readable storage medium of claim 8, wherein the computer program instructions for collecting attributes of the web site to which an email address was submitted comprise instructions for: collecting one or more primary attributes describing the web site; and collecting one or more secondary attributes derived from the primary attributes.

10. The computer-readable storage medium of claim 9, wherein the primary attributes describing the web site comprise at least one of an IP address and a DNS name of a web server hosting the web site.

11. The computer-readable storage medium of claim 9, wherein the secondary attributes derived from the primary attributes comprise at least one of geolocation data describing a geographic location of a web server hosting the web site, whether an IP address of the web server is known to be associated with an ISP, information about a domain name registrar at which the DNS name for the web server is registered, and information about a registrant of the DNS name.

12. The computer-readable storage medium of claim 8, wherein the computer program instructions for collecting attributes of an email message sent to the email address comprise instructions for: collecting one or more primary attributes describing the email message; and collecting one or more secondary attributes derived from the primary attributes.

13. The computer-readable storage medium of claim 12, wherein the primary attributes describing the email message comprise at least one of a DNS name of a “from” address of the email message, an IP address of a mail server involved in sending the email message, a DNS name of the mail server involved in sending the email message, and attributes of a mail session involved in transmitting the email message.

14. Th
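The claims distinguish primary attributes (such as DNS names) from secondary attributes derived from them (such as geolocation or registrar information), and describe feature vectors built from both sides. The Python sketch below shows one way such a vector might be assembled. Everything named here is an assumption for illustration: `SECONDARY_LOOKUP` is a hypothetical offline table standing in for geolocation and registrar lookups, `registered_domain` is a naive heuristic, and the three match features are not taken from the patent.

```python
from typing import Dict, List

# Hypothetical offline table standing in for geolocation / registrar services.
SECONDARY_LOOKUP: Dict[str, Dict[str, str]] = {
    "shop.example": {"country": "US", "registrar": "RegistrarA"},
    "mail.shop.example": {"country": "US", "registrar": "RegistrarA"},
    "bulk.invalid": {"country": "ZZ", "registrar": "RegistrarB"},
}

# Order of the features in the vector returned by feature_vector().
FEATURES = ["same_registered_domain", "same_country", "same_registrar"]


def registered_domain(dns_name: str) -> str:
    """Naive last-two-labels heuristic; real code would need a public suffix list."""
    return ".".join(dns_name.lower().split(".")[-2:])


def secondary(dns_name: str) -> Dict[str, str]:
    """Derive secondary attributes from a primary attribute (the DNS name)."""
    return SECONDARY_LOOKUP.get(dns_name, {})


def feature_vector(web_dns: str, mail_dns: str) -> List[int]:
    """Build 0/1 match features from a web server name and a mail server name."""
    web, mail = secondary(web_dns), secondary(mail_dns)
    return [
        int(registered_domain(web_dns) == registered_domain(mail_dns)),
        int("country" in web and web.get("country") == mail.get("country")),
        int("registrar" in web and web.get("registrar") == mail.get("registrar")),
    ]
```

Vectors like these, collected for legitimate replies to address submissions, are the kind of input from which a classifier could learn per-feature weights. Unknown hosts yield no secondary attributes, so their derived features default to 0 rather than counting as a match.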

Assignees

Symantec Corp

Inventors

Mccorkendale Bruce, Cooley Shaun

Classifications

  • H04L51/212 (Primary)

    CPC title: using filtering or selective blocking

  • G06Q10/107 (Primary)

    CPC title: Computer-aided management of electronic mailing [e-mailing]

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US8938508B1 cover?
A computer correlates web and email attributes to detect spam. A security module on a client collects attributes of a web site to which an email address was submitted and attributes of an email message sent to the email address that was previously submitted. The security module analyzes the attributes of the web site and the email message to determine whether the email message was sent to the e…
Who is the assignee on this patent?
Symantec Corp is the assignee. Mccorkendale Bruce and Cooley Shaun, listed alongside it, are the inventors.
What technology area does this patent fall under?
Primary CPC classification H04L51/212. Mapped technology areas include Electricity.
When was this patent published?
Publication date Jan 20, 2015 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).