Phishing detection of uncategorized URLs using heuristics and scanning
US-2021377301-A1 · Dec 2, 2021 · US
US12452259B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12452259-B2 |
| Application number | US-202117549313-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 13, 2021 |
| Priority date | Jun 28, 2018 |
| Publication date | Oct 21, 2025 |
| Grant date | Oct 21, 2025 |
Official abstract text for this publication.
Examples of the present disclosure describe systems and methods for evaluating malicious web content for associated threats using specialized web crawling techniques. A seed resource identifier is evaluated to determine a second resource identifier associated with the seed resource identifier. A resource corresponding to the second resource identifier is scanned to identify a third resource identifier. The third resource identifier is processed with a machine learning model to classify the third resource identifier according to a classification representing a predicted level of threat. The machine learning model is trained to classify resource identifiers into a plurality of classifications. A corrective action can be executed based on the classification of the third resource identifier.
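The flow in the abstract (seed identifier, then a related identifier, then a third identifier found by scanning, then classification and a corrective action) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: the in-memory `LINKS` graph stands in for real crawling and scanning, `toy_classifier` is a keyword heuristic standing in for the trained machine learning model, the three `Threat` levels are one possible choice of categories, and taking the worst level among the third identifiers is an assumed propagation rule.

```python
from enum import IntEnum
from typing import Callable, Dict, List, Tuple


class Threat(IntEnum):
    SAFE = 0        # category for safe resource identifiers
    SUSPICIOUS = 1  # hypothetical intermediate threat level
    MALICIOUS = 2   # hypothetical highest threat level


# Hypothetical link graph standing in for real crawling and page scanning.
LINKS: Dict[str, List[str]] = {
    "http://seed.example/": ["http://related.example/login"],
    "http://related.example/login": [
        "http://cdn.example/logo.png",
        "http://evil.example/verify-account",
    ],
}


def scan(url: str) -> List[str]:
    """Return the identifiers made available by the resource at `url`."""
    return LINKS.get(url, [])


def toy_classifier(url: str) -> Threat:
    """Stand-in for the trained model: a keyword heuristic, not a real model."""
    if "verify-account" in url:
        return Threat.MALICIOUS
    if "login" in url:
        return Threat.SUSPICIOUS
    return Threat.SAFE


def corrective_action(level: Threat) -> str:
    """Map a threat level to an illustrative (assumed) corrective action."""
    return {
        Threat.SAFE: "none",
        Threat.SUSPICIOUS: "restrict-permissions",
        Threat.MALICIOUS: "block",
    }[level]


def evaluate_seed(
    seed: str, classify: Callable[[str], Threat] = toy_classifier
) -> Dict[str, Tuple[Threat, str]]:
    """Classify each related identifier from the identifiers it exposes."""
    results: Dict[str, Tuple[Threat, str]] = {}
    for related in scan(seed):
        # Assumption: the related identifier inherits the worst level
        # found among its third identifiers.
        levels = [classify(third) for third in scan(related)]
        level = max(levels, default=Threat.SAFE)
        results[related] = (level, corrective_action(level))
    return results


if __name__ == "__main__":
    for url, (level, action) in evaluate_seed("http://seed.example/").items():
        print(url, level.name, action)
```

In this sketch the related page is marked malicious only because one of the identifiers it exposes is classified as malicious, which mirrors the idea of classifying an identifier based on what it leads to rather than on its own content.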
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: receiving a seed resource identifier; determining a related resource identifier associated with the seed resource identifier; evaluating the related resource identifier to determine a classification of the related resource identifier, evaluating the related resource identifier comprising: determining a third resource identifier associated with the related resource identifier, wherein determining the third resource identifier comprises scanning a related resource corresponding to the related resource identifier to determine a resource made available via the related resource identifier in a webpage corresponding to the related resource identifier; and processing the third resource identifier with a machine learning model to classify the third resource identifier according to a classification representing a predicted level of threat, the machine learning model trained to classify resource identifiers into a plurality of classifications, the plurality of classifications comprising: a first category for safe resource identifiers; and a plurality of additional categories, the plurality of additional categories representing different levels of threat; classifying the related resource identifier based on a classification of the third resource identifier; and executing a corrective action based on the classification of the related resource identifier, wherein executing the corrective action comprises modifying at least one of a permission or a privilege level.
2. The method of claim 1, further comprising classifying the related resource identifier as malicious based on the classification of the third resource identifier.
3. The method of claim 1, further comprising: based on a determination that the third resource identifier is classified as malicious, providing the third resource identifier to a web crawler to identify further resource identifiers associated with the third resource identifier.
4. The method of claim 1, wherein evaluating the related resource identifier comprises providing the related resource identifier to a web crawler.
5. The method of claim 1, wherein the corrective action comprises quarantining a file.
6. The method of claim 1, wherein the corrective action comprises initiating anti-exploit processing.
7. The method of claim 1, wherein the corrective action comprises terminating an executing process.
8. The method of claim 1, wherein the corrective action comprises installing a security patch.
9. The computer-implemented method of claim 1, wherein determining the related resource identifier comprises investigating at least one of: a root domain and sub-domain of the seed resource identifier, internal and external links associated with the seed resource identifier, an IP address hosting the seed resource identifier, a geolocation of an IP address associated with the seed resource identifier, or other domains owned by a resource.
10. A non-transitory computer-readable media storing computer-executable instructions, the computer-executable instructions comprising instructions for: receiving a seed resource identifier; determining a related resource identifier associated with the seed resource identifier; evaluating the related resource identifier to determine a classification of the related resource identifier, evaluating the related resource identifier comprising: determining a third resource identifier associated with the related resource identifier, wherein determining the third resource identifier comprises scanning a related resource corresponding to the related resource identifier to determine a resource made available via the related resource identifier in a webpage corresponding to the related resource identifier; and processing the third resource identifier with a machine learning model to classify the third resource identifier according to a classification representing a predicted level of threat, the machine learning model trained to classify resource identifiers into a plurality of classifications, the plurality of classifications comprising: a first category for safe resource identifiers; and a plurality of additional categories, the plurality of additional categories representing different levels of threat; classifying the related resource identifier based on a classification of the third resource identifier; and executing a corrective action based on the classification of the related resource identifier, wherein executing the corrective action comprises modifying at least one of a permission or a privilege level.
11. The non-transitory computer-readable media of claim 10, further comprising classifying the related resource identifier as malicious based on the classification of the third resource identifier.
12. The non-transitory computer-readable media of claim 10, further comprising instructions for: based on a determination that the third resource identifier is classified as malicious, providing the third resource identifier to a web crawler to identify further resource identifiers associated with the third resource identifier.
13. The non-transitory computer-readable media of claim 10, wherein evaluating the related resource identifier comprises providing the related resource identifier to a web crawler.
14. The non-transitory computer-readable media of claim 10, wherein the corrective action comprises quarantining a file.
15. The non-transitory computer-readable media of claim 10, wherein the corrective action comprises initiating anti-exploit processing.
16. The non-transitory computer-readable media of claim 10, wherein the corrective action comprises terminating an executing process.
17. The non-transitory computer-readable media of claim 10, wherein the corrective action comprises installing a security patch.
18. The non-transitory computer-readable media of claim 10, wherein determining the related resource identifier comprises investigating at least one of: a root domain and sub-domain of the seed resource identifier, internal and external links associated with the seed resource identifier, an IP address hosting the seed resource identifier, a geolocation of an IP address associated with the seed resource identifier, or other domains owned by a resource.
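Claims 9 and 18 list several ways to find related identifiers, including the root domain and sub-domain of the seed and its internal and external links. As a rough illustration of just those two signals, the Python sketch below uses only the standard library. The names `LinkCollector`, `naive_root_domain`, and `split_links` are hypothetical, treating "internal" as "same root domain as the seed" is an assumption, and the last-two-labels rule for the root domain is a deliberate simplification (a real system would likely consult the Public Suffix List).

```python
from html.parser import HTMLParser
from typing import List, Tuple
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags in a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def naive_root_domain(host: str) -> str:
    """Assumption: the root domain is the last two labels of the host.

    This is wrong for suffixes such as "co.uk"; it is only a sketch.
    """
    return ".".join(host.split(".")[-2:])


def split_links(seed_url: str, html: str) -> Tuple[List[str], List[str]]:
    """Split a page's links into (internal, external) relative to the seed."""
    collector = LinkCollector(seed_url)
    collector.feed(html)
    seed_root = naive_root_domain(urlsplit(seed_url).hostname or "")
    internal: List[str] = []
    external: List[str] = []
    for link in collector.links:
        host = urlsplit(link).hostname or ""
        if naive_root_domain(host) == seed_root:
            internal.append(link)
        else:
            external.append(link)
    return internal, external


if __name__ == "__main__":
    page = (
        '<a href="/about">About</a>'
        '<a href="http://other.example.net/x">Other</a>'
        '<a href="http://sub.shop.example.com/y">Sub</a>'
    )
    print(split_links("http://shop.example.com/index.html", page))
```

Either list could then be handed to a crawler as candidate related identifiers; the other signals named in the claims (hosting IP address, IP geolocation, other domains owned by a resource) would need external lookups and are not sketched here.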
Indexing; Web crawling techniques · CPC title
Traffic logging, e.g. anomaly detection · CPC title
using information identifiers, e.g. uniform resource locators [URL] · CPC title
Event detection, e.g. attack signature detection · CPC title
service impersonation, e.g. phishing, pharming or web spoofing (detection of rogue wireless access points H04W12/12) · CPC title