System and method for creating custom sequence detectors
US-10291639-B1 · May 14, 2019 · US
US12021894B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12021894-B2 |
| Application number | US-201916729295-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 27, 2019 |
| Priority date | Dec 27, 2019 |
| Publication date | Jun 25, 2024 |
| Grant date | Jun 25, 2024 |
Official abstract text for this publication.
A method for phishing detection based on modeling of web page content is discussed. The method includes accessing suspect web page content of a suspect Uniform Resource Locator (URL). The method includes generating an exemplary model based on an exemplary configuration for an indicated domain associated with the suspect URL, where the exemplary model indicates structure and characteristics of an example web page of the indicated domain. The method includes generating a suspect web page model that indicates structure and characteristics of the suspect web page content. The method includes performing scoring functions for the potential phishing web page content based on the suspect web page model, where some of the scoring functions use the exemplary model to perform analysis to generate respective results. The method includes generating a web page content phishing score based on results from the scoring functions.
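To make the abstract's flow concrete, the following is a minimal sketch, not the patented implementation. It assumes a page "model" is simply a multiset of HTML tag names, that one scoring function compares the suspect model with the exemplary model, that a second looks only at the suspect model, and that the results are combined by a plain average. All names (`TagCollector`, `build_model`, `structure_similarity`, `score_page`) and the choice of tag counts and averaging are hypothetical illustrations.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records every start tag of a page in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def build_model(html):
    """Hypothetical page model: a multiset of tag names (assumption)."""
    collector = TagCollector()
    collector.feed(html)
    return Counter(collector.tags)


def structure_similarity(suspect, exemplary):
    """Multiset Jaccard overlap of tag counts, in [0, 1]."""
    union = sum((suspect | exemplary).values())
    if union == 0:
        return 0.0
    return sum((suspect & exemplary).values()) / union


def score_page(suspect_html, exemplary_html):
    """Run two illustrative scoring functions and average their results."""
    suspect = build_model(suspect_html)
    exemplary = build_model(exemplary_html)
    results = [
        # Scoring function that uses the exemplary model.
        structure_similarity(suspect, exemplary),
        # Scoring function that uses only the suspect model:
        # does the page contain a form with input fields?
        1.0 if suspect["form"] > 0 and suspect["input"] > 0 else 0.0,
    ]
    return sum(results) / len(results)
```

In this sketch a suspect page that closely mirrors the example page's structure and collects input through a form scores near 1, while an unrelated page scores near 0. A real system would weight and calibrate the individual results rather than average them.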
Opening claim text (preview).
What is claimed is:
1. A method for phishing detection based on modeling of web page content, the method comprising: accessing suspect web page content of a suspect Uniform Resource Locator (URL); splitting the suspect URL into a plurality of features that include one or more of an URL domain, a hostname, a path parameter, or a path query; assigning a partial score to each of the features of the plurality of features; determining a URL-based rules score at least in part by aggregating the partial score assigned to each of the features of the plurality of features; accessing an exemplary model based on an exemplary configuration for an indicated domain associated with the suspect URL, wherein the exemplary model indicates structure and characteristics of an example web page of the indicated domain; accessing a suspect web page model based on the suspect web page content, wherein the suspect web page model indicates structure and characteristics of the suspect web page content; performing a plurality of scoring functions for potential phishing web page content based on the suspect web page model, each of the plurality of scoring functions providing a respective result that includes a respective phishing score, wherein at least one of the plurality of scoring functions uses the exemplary model to perform an analysis to generate one of the respective results; and generating a web page content phishing score based, at least in part, on the URL-based rules score and at least one of the phishing scores generated from the plurality of scoring functions.
2. The method of claim 1, wherein said performing a plurality of scoring functions comprises: determining a similarity score between the suspect web page content and original content based on an original domain indicated by the suspect URL, wherein the similarity score indicates a degree of similarity between a suspect structure of web page objects of the suspect web page content and an original structure of web page objects of an example web page of the original domain, wherein the suspect structure and the original structure are each in a Document Object Model (DOM) format.
3. The method of claim 1, wherein said performing a plurality of scoring functions comprises: determining a deception score between the suspect web page content and original content based on an original domain indicated by the suspect URL, wherein the deception score indicates a degree of similarity between characteristics of web page objects of the suspect web page content and characteristics of web page objects of an example web page of the original domain.
4. The method of claim 3, wherein said determining the deception score comprises determining a degree of similarity between characteristics of at least one cascading style sheet (CSS) associated with the suspect web page content and characteristics of at least one CSS associated with the example web page of the original domain.
5. The method of claim 1, wherein said accessing the exemplary model comprises accessing an exemplary configuration from a plurality of exemplary configurations based on the indicated domain, and wherein the plurality of scoring functions is performed based on a predefined portion of the potential phishing web page content, and wherein the accessing the exemplary model comprises accessing the predefined portion of the example web page.
6. The method of claim 1, wherein said accessing the suspect web page model comprises: accessing web page objects, and their respective characteristics, of the suspect web page content to create a suspect structure model; and accessing at least one style sheet associated with the suspect web page content to create a suspect style sheet model; wherein the at least one of the plurality of scoring functions uses at least one of the suspect structure model or the suspect style sheet model to perform said analysis.
7. The method of claim 1, wherein said accessing the suspect web page model comprises accessing a suspect structure of one or more scripts of the suspect web page content to create a suspect script model, wherein the one or more scripts are different from a hypertext markup language (HTML); and wherein the at least one of the plurality of scoring functions uses the suspect script model to perform said analysis.
8. The method of claim 1, wherein said accessing the suspect web page model comprises accessing suspect frames of one or more scripts of one or more suspect web page objects to create a suspect frame model, wherein the suspect frame model comprises one or more links that point to an entity that uses a malware and phishing detection and mediation (MAPDAM) platform; and wherein the at least one of the plurality of scoring functions uses the suspect frame model to determine whether the suspect frame model links to an original domain indicated by the suspect URL.
9. The method of claim 1, wherein said generating the web page content phishing score is further based on a machine learning model determining a degree of similarity between known phishing features and features of the suspect web page model, wherein the machine learning model is trained based on URLs of web pages that have been determined to be phishing web pages.
10. The method of claim 1, further comprising: determining that the web page content phishing score indicates an undetermined result; accessing suspect textual content of the suspect web page content; scoring the suspect textual content based on textual language metadata of an example web page of the original domain; and revising the web page content phishing score using based on the scoring of the suspect textual content.
11. A system, comprising: a non-transitory memory storing instructions; and one or more hardware processors configured to execute the instructions to cause the system to perform operations comprising: accessing suspect web page content of a suspect Uniform Resource Locator (URL), wherein the accessing comprises extracting one or more link strings or extracting favicon text from the suspect web page content; generating a plurality of features based on the suspect URL; assigning a plurality of partial scores to the plurality of features, respectively; calculating an aggregate score based on the plurality of partial scores; accessing an exemplary model indicating an exemplary configuration of a targeted domain, wherein the exemplary model indicates structure and characteristics of an example web page of the targeted domain; accessing a suspect web page model based on the suspect web page content, wherein the suspect web page model indicates structure and characteristics of the suspect web page content; performing a plurality of scoring functions that compare the suspect web page model and the exemplary model, each of the plurality of scoring functions providing a respective result, wherein the plurality of scoring functions are performed using one or more detectors configured to detect whether a markup language, a structure, or a content of the suspect web page model exhibits characteristics of known phishing web sites; and calculating a web page content phishing score based, at least in part, on the aggregate score and at least one of the phishing scores generated from the plurality of scoring functions.
12. The system of claim 11, wherein said performing a plurality of scoring functions comprises: determining a similarity score between the suspect web page content and original content based on an original domain indicated by the suspect URL, wherein the similarity score indicates a degree of similarity between a suspect structure of web page objects of the susp
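The URL-based rules score recited in claim 1 (split the URL into features, assign each a partial score, aggregate) might look roughly like the sketch below. The specific rules, the keyword list, and the averaging step are illustrative assumptions rather than anything stated in the claims, and `split_url`, `partial_scores`, and `url_rules_score` are hypothetical names. The domain extraction is deliberately naive (last two hostname labels) and does not handle public suffixes such as `co.uk`.

```python
import re
from urllib.parse import urlsplit

# Hypothetical keyword list; a real rule set would be far larger.
SUSPICIOUS_WORDS = ("login", "verify", "secure", "account")


def split_url(url):
    """Split a URL into domain, hostname, path, and query features."""
    parts = urlsplit(url)
    hostname = parts.hostname or ""
    labels = hostname.split(".")
    # Naive registrable domain: the last two labels (assumption).
    domain = ".".join(labels[-2:]) if len(labels) >= 2 else hostname
    return {
        "domain": domain,
        "hostname": hostname,
        "path": parts.path,
        "query": parts.query,
    }


def partial_scores(features):
    """Assign each feature an illustrative partial score in [0, 1]."""
    hostname = features["hostname"]
    path = features["path"].lower()
    return {
        # Host made only of digits and dots, i.e. a raw IPv4 address.
        "domain": 1.0 if re.fullmatch(r"[\d.]+", hostname) else 0.0,
        # Deeply nested subdomains are treated as more suspicious.
        "hostname": min(hostname.count(".") / 4, 1.0),
        # Credential-related keywords in the path.
        "path": 1.0 if any(w in path for w in SUSPICIOUS_WORDS) else 0.0,
        # An embedded URL in the query string (possible redirect).
        "query": 1.0 if "http" in features["query"].lower() else 0.0,
    }


def url_rules_score(url):
    """Aggregate the partial scores by averaging (assumption)."""
    scores = partial_scores(split_url(url))
    return sum(scores.values()) / len(scores)
```

For example, `url_rules_score("http://192.168.0.1/login?next=http://example.com")` scores much higher than `url_rules_score("https://example.com/")`, because the first URL trips every illustrative rule and the second trips almost none.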
Traffic logging, e.g. anomaly detection · CPC title
Vulnerability analysis · CPC title
above the transport layer · CPC title
Event detection, e.g. attack signature detection · CPC title
service impersonation, e.g. phishing, pharming or web spoofing (detection of rogue wireless access points H04W12/12) · CPC title