What technology area does this patent fall under?

Primary CPC classification H04L63/1483. Mapped technology areas include Electricity.

When was this patent published?

Publication date May 25, 2017 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Identifying phishing communications using templates

US2017149824A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2017149824-A1
Application number	US-201715416632-A
Country	US
Kind code	A1
Filing date	Jan 26, 2017
Priority date	May 13, 2015
Publication date	May 25, 2017
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, apparatus, systems, and computer-readable media are provided for determining whether communications are attempts at phishing. In various implementations, a potentially-deceptive communication may be matched to one or more templates of a plurality of templates. Each template may represent content shared among a cluster of communications sent by a legitimate entity. In various implementations, it may be determined that an address associated with the communication is not affiliated with one or more legitimate entities associated with the one or more matched templates. In various implementations, the communication may be classified as a phishing attempt based on the determining.
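As a rough illustration of the flow the abstract describes, the following Python sketch matches a message against templates using overlapping word n-grams, then checks the sender address against the address pattern of each matched entity. The `Template` class, the Jaccard similarity measure, the 0.5 threshold, the `top_k` cutoff, and the "unknown" result for unmatched messages are assumptions made for illustration; the publication does not specify them.

```python
import re
from dataclasses import dataclass


def ngrams(text, n=3):
    """Return the set of overlapping word n-grams in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


@dataclass
class Template:
    entity: str            # known legitimate sender (hypothetical field)
    content: str           # text shared across the entity's cluster
    address_pattern: str   # regex for addresses the entity uses


def similarity(a, b):
    """Jaccard overlap of two n-gram sets (an assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def classify(body, sender, templates, threshold=0.5, top_k=1):
    """Return 'phishing', 'legitimate', or 'unknown' for one message."""
    grams = ngrams(body)
    # First comparison: rank templates by content similarity.
    scored = sorted(
        ((similarity(grams, ngrams(t.content)), t) for t in templates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    matches = [t for score, t in scored[:top_k] if score >= threshold]
    if not matches:
        return "unknown"  # resembles no known template
    # Second comparison: does the address fit a matched entity's pattern?
    if any(re.fullmatch(t.address_pattern, sender) for t in matches):
        return "legitimate"
    return "phishing"
```

A message that closely resembles a known entity's template but arrives from an address outside that entity's pattern is the case this sketch flags as phishing.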

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: performing, by one or more processors, a first comparison between content of a communication to content of a plurality of templates, wherein each template represents content shared among a cluster of communications sent by a known legitimate entity; identifying, by one or more of the processors based on the first comparison, one or more matching templates from the plurality of templates, wherein the one or more matching templates are associated with one or more known legitimate entities; performing, by one or more of the processors, a second comparison of an address associated with the communication with one or more respective address patterns associated with the one or more known legitimate entities; determining, by one or more of the processors based on the second comparison, that the address associated with the communication does not match the one or more respective address patterns; classifying, by one or more of the processors based on the determining, the communication as a phishing attempt; and discarding or re-routing, by one or more of the processors, the communication based on the classifying.
2. The computer-implemented method of claim 1, wherein the determining further comprises, for each of the one or more legitimate entities associated with the one or more matching templates, comparing: a combination of the address associated with the communication and a subject of the communication; to a combination of a pattern of addresses associated with the legitimate entity and a pattern found among subjects of communications sent by the legitimate entity.
3. The computer-implemented method of claim 1, wherein performing the first comparison comprises determining respective measures of similarity of the plurality of templates to the communication.
4. The computer-implemented method of claim 3, further comprising: ranking the plurality of templates based on their respective measures of similarity; and selecting, as the one or more matching templates, a predetermined number of highest ranking templates.
5. The computer-implemented method of claim 1, wherein the address is a linked-to network address contained in the communication.
6. The computer-implemented method of claim 1, wherein the address is a sender address.
7. The computer-implemented method of claim 1, wherein the address is a reply-to address.
8. The computer-implemented method of claim 1, wherein the first comparison comprises comparing one or more n-grams in the communication to one or more n-grams used to index the plurality of templates.
9. The computer-implemented method of claim 8, wherein the one or more n-grams used to index the plurality of templates are extracted from the content of the plurality of templates.
10. The computer-implemented method of claim 1, wherein the first comparison comprises comparing one or more overlapping n-grams in the communication to one or more overlapping n-grams used to index the plurality of templates.
11. A computer-implemented method, comprising: matching, by one or more processors, a communication to a first subset of a plurality of templates using a forward index, wherein each template of the plurality of templates represents content shared among a cluster of communications sent by a known legitimate entity, and wherein the forward index is indexed on metadata associated with the plurality of templates; matching, by one or more of the processors, the communication to a second subset of the plurality of templates using a reverse index, wherein the reverse index is indexed on content of the plurality of templates; determining, by one or more of the processors, that there is no intersection between the first subset and the second subset; classifying, by one or more of the processors based on the determining, the communication as a phishing attempt; and discarding or re-routing, by one or more of the processors, the communication based on the classifying.
12. The computer-implemented method of claim 11, wherein the metadata associated with each of the plurality of templates comprises a sender address associated with a respective legitimate entity.
13. The computer-implemented method of claim 11, wherein the metadata associated with each of the plurality of templates comprises a reply-to address associated with a respective legitimate entity.
14. The computer-implemented method of claim 11, wherein the metadata associated with each of the plurality of templates comprises a combination of a sender address and a subject associated with a respective legitimate entity.
15. The computer-implemented method of claim 11, wherein the metadata associated with each of the plurality of templates comprises a subject associated with a respective legitimate entity.
16. The computer-implemented method of claim 11, wherein the metadata associated with each of the plurality of templates comprises a pattern of sender addresses associated with a respective legitimate entity.
17. The computer-implemented method of claim 11, wherein the indexed content of the plurality of templates includes one or more n-grams associated with a respective legitimate entity.
18. A computer-implemented method, comprising: grouping, by one or more processors, a corpus of communications sent by a plurality of known legitimate entities into a plurality of clusters based at least in part on metadata associated with the corpus of communications, wherein each cluster includes communications sent by a known legitimate entity; generating, by one or more of the processors based on the plurality of clusters, a plurality of templates, wherein each template of the plurality of templates represents content shared among a cluster of the plurality of clusters that includes communications sent by a known legitimate entity; creating, by one or more of the processors, a forward index that is indexed on metadata associated with the plurality of templates; creating, by one or more of the processors, a reverse index, wherein the reverse index is indexed on content of the plurality of templates; and discarding or re-rerouting subsequent communications that match one or more templates in one of the forward or reverse indices, but not the other.
19. The computer-implemented method of claim 18, wherein the metadata associated with each of the plurality of templates comprises a combination of a sender address and a subject associated with a respective legitimate entity.
20. The computer-implemented method of claim 18, wherein the metadata associated with each of the plurality of templates comprises a subject associated with a respective legitimate entity.
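The two-index check in claims 11 and 18 can be sketched in Python roughly as follows. Here the forward index is keyed on sender address (one of the metadata options named in claim 12) and the reverse index on overlapping word trigrams of template content. The dictionary layout, the `min_shared` threshold, the template IDs, and the "unknown" result when neither index matches are hypothetical choices, not details taken from the publication.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of overlapping word trigrams in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}


def build_indices(templates):
    """templates: dict of template_id -> (sender_address, content)."""
    forward = defaultdict(set)   # metadata (sender) -> template ids
    reverse = defaultdict(set)   # content trigram -> template ids
    for tid, (sender, content) in templates.items():
        forward[sender.lower()].add(tid)
        for gram in trigrams(content):
            reverse[gram].add(tid)
    return forward, reverse


def classify_with_indices(sender, body, forward, reverse, min_shared=2):
    """Return 'phishing', 'legitimate', or 'unknown' for one message."""
    # First subset: templates whose metadata matches the message.
    by_metadata = set(forward.get(sender.lower(), set()))
    # Second subset: templates sharing enough content trigrams.
    hits = defaultdict(int)
    for gram in trigrams(body):
        for tid in reverse.get(gram, ()):
            hits[tid] += 1
    by_content = {tid for tid, n in hits.items() if n >= min_shared}
    if not by_metadata and not by_content:
        return "unknown"      # assumption: neither index knows this message
    if by_metadata & by_content:
        return "legitimate"   # the same template matched in both indices
    return "phishing"         # matched one index but not the other
```

Treating a message that matches neither index as "unknown" rather than as phishing is a design choice of this sketch; it keeps ordinary personal mail from being flagged merely because no template exists for it.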

Assignees

Google Inc

Inventors

Classifications

H04L63/1425
Traffic logging, e.g. anomaly detection · CPC title
H04L63/1483 (Primary)
service impersonation, e.g. phishing, pharming or web spoofing (detection of rogue wireless access points H04W12/12) · CPC title
H04L63/20
for managing network security; network security policies in general (filtering policies H04L63/0227) · CPC title
H04L63/0254
Stateful filtering · CPC title

Patent family

Related publications grouped by family.

View patent family 56097284

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017149824A1 cover?: Methods, apparatus, systems, and computer-readable media are provided for determining whether communications are attempts at phishing. In various implementations, a potentially-deceptive communication may be matched to one or more templates of a plurality of templates. Each template may represent content shared among a cluster of communications sent by a legitimate entity. In various implementa…
Who is the assignee on this patent?: Google Inc