Information security system and method for phishing website identification based on image hashing
US-2023033134-A1 · Feb 2, 2023 · US
US12401687B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12401687-B2 |
| Application number | US-202318140956-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 28, 2023 |
| Priority date | Apr 28, 2023 |
| Publication date | Aug 26, 2025 |
| Grant date | Aug 26, 2025 |
Official abstract text for this publication.
There is disclosed a method of mitigating phishing, including extracting text from a website under analysis; using a spell check algorithm to compare extracted words or phrases to a language dictionary of words or phrases selected from web pages known to be phishing targets, and using a spell counter to count misspell hits from the spell check algorithm; comparing the extracted words or phrases to a case-sensitive usage reference, and using a usage counter to count mismatched usage hits from the case-sensitive usage reference; combining the spell counter and the usage counter into a combined counter; and using the combined counter to identify the website under analysis as a suspected phishing website and taking a phishing mitigation action.
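The flow in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the dictionary contents, the `PhishingSignals` type, the weights, and the threshold of 2.0 are all hypothetical, and `difflib.get_close_matches` stands in for whatever spell check algorithm a real system would use. A "misspell hit" is assumed here to mean a word that is absent from the dictionary but closely resembles an entry.

```python
import difflib
import re
from dataclasses import dataclass

# Hypothetical reference data (assumption): lowercase words taken from pages
# known to be phishing targets, plus the cased forms seen on those pages.
LANGUAGE_DICTIONARY = {"paypal", "password", "account", "sign", "in", "your", "to", "email"}
USAGE_REFERENCE = {
    "paypal": {"PayPal"},
    "password": {"Password", "password"},
    "account": {"Account", "account"},
}


@dataclass
class PhishingSignals:
    spell_counter: int
    usage_counter: int
    combined_counter: float


def extract_words(text: str) -> list[str]:
    """Split extracted page text into alphabetic tokens."""
    return re.findall(r"[A-Za-z]+", text)


def is_misspell_hit(word: str) -> bool:
    """Case-insensitive check: not in the dictionary, but close to an entry."""
    lowered = word.lower()
    if lowered in LANGUAGE_DICTIONARY:
        return False
    close = difflib.get_close_matches(lowered, sorted(LANGUAGE_DICTIONARY), n=1, cutoff=0.8)
    return bool(close)


def is_usage_mismatch(word: str) -> bool:
    """Case-sensitive check: known word written with an unseen capitalization."""
    allowed = USAGE_REFERENCE.get(word.lower())
    return allowed is not None and word not in allowed


def analyze(text: str, spell_weight: float = 1.0, usage_weight: float = 1.0) -> PhishingSignals:
    """Count both kinds of hits and combine them as a weighted sum."""
    words = extract_words(text)
    spell_counter = sum(is_misspell_hit(w) for w in words)
    usage_counter = sum(is_usage_mismatch(w) for w in words)
    combined = spell_weight * spell_counter + usage_weight * usage_counter
    return PhishingSignals(spell_counter, usage_counter, combined)


def is_suspected(signals: PhishingSignals, threshold: float = 2.0) -> bool:
    """Hypothetical decision rule: flag the page once the combined counter reaches the threshold."""
    return signals.combined_counter >= threshold


signals = analyze("Sign in to your Paypal acount. Enter your pasword.")
print(signals, is_suspected(signals))
```

In this sketch the sample text yields two misspell hits ("acount", "pasword") and one usage mismatch ("Paypal" instead of "PayPal"). A mitigation action such as blocking the page or warning the user would follow a positive result.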
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method of mitigating phishing, comprising: extracting text from a website under analysis; using a spell check algorithm to compare extracted words or phrases to a language dictionary of words or phrases selected from web pages known to be phishing targets, and using a spell counter to count misspell hits from the spell check algorithm; comparing the extracted words or phrases to a case-sensitive usage reference, and using a usage counter to count mismatched usage hits from the case-sensitive usage reference; combining the spell counter and the usage counter into a combined counter; and using the combined counter to identify the website under analysis as a suspected phishing site and taking a phishing mitigation action.
2. The method of claim 1, wherein the spell check algorithm is case insensitive.
3. The method of claim 1, wherein the spell check algorithm comprises symmetric delete.
4. The method of claim 1, wherein combining the spelling counter and the usage counter comprises a weighted sum.
5. The method of claim 1, wherein combining the spelling counter and the usage counter comprises computing a normalized sum.
6. The method of claim 1, wherein identifying the website under analysis as a suspected phishing site comprises using the combined counter as an input to a phishing analysis engine.
7. The method of claim 1, wherein identifying the website under analysis as a suspected phishing site comprises using the combined counter as an input to an artificial intelligence algorithm.
8. The method of claim 1, wherein identifying the website under analysis as a suspected phishing site comprises determining that the site includes two or more misspellings.
9. The method of claim 1, wherein identifying the website under analysis as a suspected phishing site comprises determining that the website under analysis includes one or more misspellings, and two or more usage mismatches.
10. The method of claim 1, wherein the phishing mitigation action comprises decorating the website under analysis for further human or machine analysis.
11. The method of claim 1, wherein the phishing mitigation action comprises blocking the website under analysis.
12. The method of claim 1, wherein the phishing mitigation action comprises sending a warning message to a user.
13. The method of claim 1, wherein the usage counter is weighted according to a number of case-sensitive variations of a word or phrase that appear in the case-sensitive usage reference.
14. One or more tangible, nontransitory computer-readable media having stored thereon executable instructions to instruct a processor to: extract text from a website under analysis; spell check the extracted text against a language dictionary of words or phrases selected from known non-phishing websites, and accumulate misspell hits into a spelling counter; compare the spell-checked extracted text to a case-sensitive usage dictionary, and accumulate usage mismatches into a usage counter; and based on a combination of the spelling counter and usage counter, identify the website under analysis as a suspected phishing website and take a phishing mitigation action.
15. The one or more tangible, nontransitory computer-readable media of claim 14, wherein the instructions are further to perform a pre-analysis collection phase before extracting text from the website under analysis, comprising selecting a set of web pages known to be phishing targets, collecting common words and phrases from the set of web pages, and building the language dictionary and case-sensitive usage dictionary.
16. The one or more tangible, nontransitory computer-readable media of claim 15, wherein building the language dictionary comprises building a histogram of most common words and phrases in the set of web pages.
17. The one or more tangible, nontransitory computer-readable media of claim 15, wherein the set of web pages comprises web pages determined to be most popular as phishing targets.
18. The one or more tangible, nontransitory computer-readable media of claim 15, wherein the set of web pages comprises pages from a domain that collect sensitive personal or financial data.
19. A computing ecosystem comprising one or more computing apparatus, comprising: at least one processor circuit; a memory; and instructions stored within the memory to instruct the at least one processor circuit to: collect text from a user input form of a website under analysis; spell check the collected text using a case-insensitive spell check algorithm with a language dictionary of words or phrases selected from user input forms of known non-phishing websites, and accumulate misspell hits into a spelling counter; compare the spell-checked extracted text to a case-sensitive usage dictionary, and accumulate usage mismatches into a usage counter; and combine the spelling counter and the usage counter into a combined counter, and based on the combined counter, identify the website under analysis as a suspected phishing website and take a phishing mitigation action.
20. The computing ecosystem of claim 19, wherein combining the spelling counter and the usage counter comprises a weighted sum or normalized sum.
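Several dependent claims name concrete techniques: symmetric delete spell checking (claim 3), a pre-analysis collection phase that builds the dictionaries from a word histogram (claims 15 and 16), and count-based decision rules (claims 8 and 9). The Python sketch below shows one way these pieces could fit together. It is an illustration under assumptions, not the claimed implementation: all function names, the `top_n` and `min_length` parameters, and the choice to join the two decision rules with a logical OR are hypothetical. The index only covers a maximum edit distance of one and skips the final edit-distance verification a production symmetric delete implementation would perform, so two words that share a one-character delete can occasionally match even when they are two edits apart.

```python
import re
from collections import Counter, defaultdict


def deletes(word: str) -> set[str]:
    """All strings formed by deleting exactly one character from word."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}


def build_dictionaries(target_pages: list[str], top_n: int = 1000) -> tuple[set[str], dict[str, set[str]]]:
    """Pre-analysis collection: histogram the words, keep the most common ones,
    and record which cased forms of each word were actually observed."""
    histogram: Counter = Counter()
    cased_forms: dict[str, set[str]] = defaultdict(set)
    for page in target_pages:
        for word in re.findall(r"[A-Za-z]+", page):
            histogram[word.lower()] += 1
            cased_forms[word.lower()].add(word)
    language_dictionary = {w for w, _ in histogram.most_common(top_n)}
    usage_dictionary = {w: cased_forms[w] for w in language_dictionary}
    return language_dictionary, usage_dictionary


def build_delete_index(dictionary: set[str]) -> dict[str, set[str]]:
    """Map every dictionary word and each of its one-character deletes back to the word."""
    index: dict[str, set[str]] = defaultdict(set)
    for word in dictionary:
        index[word].add(word)
        for variant in deletes(word):
            index[variant].add(word)
    return index


def lookup(term: str, index: dict[str, set[str]]) -> set[str]:
    """Symmetric delete lookup: match the term and its own deletes against the index."""
    term = term.lower()
    candidates = set(index.get(term, ()))
    for variant in deletes(term):
        candidates |= index.get(variant, set())
    return candidates


def is_misspell_hit(term: str, dictionary: set[str], index: dict[str, set[str]], min_length: int = 4) -> bool:
    """A hit is a word missing from the dictionary that has a near neighbour in it.
    Very short words are skipped (assumption) because their deletes collide too easily."""
    lowered = term.lower()
    if len(lowered) < min_length or lowered in dictionary:
        return False
    return bool(lookup(lowered, index))


def is_suspected(spelling_counter: int, usage_counter: int) -> bool:
    """Thresholds from claims 8 and 9, joined with OR as an assumption of this sketch."""
    return spelling_counter >= 2 or (spelling_counter >= 1 and usage_counter >= 2)


pages = ["Sign in to your PayPal account", "PayPal account password"]
language_dictionary, usage_dictionary = build_dictionaries(pages)
index = build_delete_index(language_dictionary)
print(lookup("pasword", index), usage_dictionary["paypal"])
```

The appeal of symmetric delete is that lookups need only deletions of the input term, because insertions, substitutions, and transpositions are covered by the precomputed deletes of the dictionary words.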
Dictionaries · CPC title
Parsing · CPC title
Orthographic correction, e.g. spell checking or vowelisation · CPC title
Clustering; Classification · CPC title
Authenticating web pages, e.g. with suspicious links · CPC title