Technologies for analyzing uniform resource locators

US10218716B2 · US · B2

Patent metadata
Publication number: US-10218716-B2
Application number: US-201615283389-A
Country: US
Kind code: B2
Filing date: Oct 1, 2016
Priority date: Oct 1, 2016
Publication date: Feb 26, 2019
Grant date: Feb 26, 2019

Abstract

Official abstract text for this publication.

Technologies for analyzing a Uniform Resource Locator (URL) include a multi-stage URL analysis system. The multi-stage URL analysis system analyzes the URL using a multi-stage analysis. In the first stage, the multi-stage URL analysis system analyzes the URL using an ensemble lexical analysis. In the second stage, the multi-stage URL analysis system analyzes the URL based on third-party detection results. In the third stage, the multi-stage URL analysis system analyzes the URL based on metadata related to the URL. The multi-stage URL analysis system advances the stages of analysis if a malicious classification score determined by each stage does not satisfy a confidence threshold. The URL may also be selected for additional rigorous analysis using selection criteria not used by the analysis stages.
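
The staged flow described in the abstract can be sketched as a simple cascade. This is an illustrative sketch only, not the patented implementation: the 0-to-1 score scale, the reading of "satisfies a confidence threshold" as distance from an undecided midpoint of 0.5, and the toy stage functions (lexical_stage, third_party_stage, metadata_stage, KNOWN_BAD) are all assumptions introduced here.

```python
from typing import Callable, Sequence, Tuple

# A stage maps a URL to a malicious classification score.
# Assumption: scores lie in [0, 1], where 1 means "malicious".
Stage = Callable[[str], float]


def is_confident(score: float, threshold: float) -> bool:
    """Assumed reading of the confidence test: the score is far enough
    from the undecided midpoint (0.5) in either direction."""
    return abs(score - 0.5) >= threshold


def analyze_url(url: str, stages: Sequence[Stage],
                threshold: float = 0.3) -> Tuple[float, int]:
    """Run the stages in order and stop at the first confident score.

    Returns the last score computed and the number of stages that ran.
    The final stage's score is returned even if it is not confident.
    """
    if not stages:
        raise ValueError("at least one stage is required")
    score, used = 0.5, 0
    for used, stage in enumerate(stages, start=1):
        score = stage(url)
        if is_confident(score, threshold):
            break
    return score, used


# Hypothetical toy stages, for illustration only.
KNOWN_BAD = {"http://bad.example/login"}


def lexical_stage(url: str) -> float:
    # Toy heuristic: many hyphens in the URL look suspicious.
    return 0.9 if url.count("-") >= 4 else 0.55


def third_party_stage(url: str) -> float:
    # Toy stand-in for third-party detection results.
    return 0.95 if url in KNOWN_BAD else 0.5


def metadata_stage(url: str) -> float:
    # Toy stand-in for local metadata analysis.
    return 0.6


STAGES = [lexical_stage, third_party_stage, metadata_stage]

if __name__ == "__main__":
    for candidate in ("http://a-b-c-d-e.example/",
                      "http://bad.example/login",
                      "http://plain.example/"):
        print(candidate, analyze_url(candidate, STAGES))
```

In this sketch the later stages run only when an earlier one is undecided, which mirrors the abstract's statement that the system advances when a stage's score does not satisfy the threshold.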

First claim

Opening claim text (preview).

The invention claimed is:

1. A Uniform Resource Locator (URL) analysis system to analyze a URL, the URL analysis system comprising: a hardware processor; and a memory having instructions stored therein that, when executed by the hardware processor, cause the URL analysis system to establish a URL lexical ensemble analyzer, a third-party detection analyzer, a local URL metadata analyzer, and a URL additional analysis selector, wherein: the URL lexical ensemble analyzer is to (i) analyze the URL based on an ensemble lexical analysis to determine a first malicious classification score for the URL, wherein the first malicious classification score is indicative of whether the URL is malicious and (ii) determining whether the first malicious classification score satisfies a confidence threshold; the third-party detection analyzer is to (i) analyze the URL based on third-party malicious URL detection results associated with the URL and determined by a third-party source to determine a second malicious classification score for the URL in response to a determination that the first malicious classification score does not satisfy the confidence threshold, wherein the second malicious classification score is indicative of whether the URL is malicious and (ii) determine whether the second malicious classification score satisfies the confidence threshold; the local URL metadata analyzer is to analyze metadata related to the URL to determine a third malicious classification score for the URL in response to a determination that the second malicious classification score does not satisfy the confidence threshold, wherein the third malicious classification score is indicative of whether the URL is malicious; and the URL additional analysis selector is to determine whether to select the URL for additional analysis based on selection criteria not used in (i) the analysis of the URL using the ensemble lexical analysis, (ii) the analysis of the URL based on third-party URL metadata, and (iii) the analysis of the metadata related to the URL.

2. The URL analysis system of claim 1, wherein to analyze the URL based on an ensemble lexical analysis comprises to: analyze the URL based on a natural language processing algorithm to determine a first lexical analysis score; analyze the URL based on a deep learning algorithm to determine a second lexical analysis score; analyze the URL based on a non-parametric algorithm to determine a third lexical analysis score; and aggregate the first, second, and third lexical analysis scores to determine the first malicious classification score.

3. The URL analysis system of claim 1, wherein the third-party malicious URL detection results comprises an indication of whether the URL is considered malicious by the third-party source.

4. The URL analysis system of claim 1, wherein the selection criteria comprises (i) customer feedback related to a classification of the URL, (ii) URL owner feedback related to a classification of the URL, (iii) analysis of the URL using an expanded whitelist or an expanded blacklist, (iv) a determined level of risk to a customer for false classification of the URL, (v) a variance between the first, second, or third malicious classification score of the URL and a third-party malicious classification score for the URL, and (vi) a determined age of the URL.

5. The URL analysis system of claim 1, further comprising a URL analysis manager to determine whether the URL is malicious based on at least one of the first malicious classification score, the second malicious classification score, and the third malicious classification score.

6. The URL analysis system of claim 1, further comprising a URL analysis manager to: determine whether the first malicious classification score is ambiguous by determining whether the first malicious classification score falls within a reference score range; and train the URL lexical ensemble analyzer of the URL analysis system based on active learning applied to the URL in response to a determination that the first malicious classification score is ambiguous.

7. The URL analysis system of claim 1, further comprising a URL analysis manager to: determine whether the second malicious classification score is ambiguous by determining whether the second malicious classification score falls within a reference score range; and train the third-party detection analyzer of the URL analysis system based on active learning or an online algorithm applied to the URL in response to a determination that the second malicious classification score is ambiguous.

8. The URL analysis system of claim 1, further comprising a URL analysis manager to: determine whether the third malicious classification score is ambiguous by determining whether the third malicious classification score falls within a reference score range; and train the local URL metadata analyzer of the URL analysis system based on active learning or an online algorithm applied to the URL in response to a determination that the third malicious classification score is ambiguous.

9. A method for analyzing a Uniform Resource Locator (URL), the method comprising: analyzing, by a URL analysis system, the URL using an ensemble lexical analysis to determine a first malicious classification score for the URL, wherein the first malicious classification score is indicative of whether the URL is malicious; determining, by the URL analysis system, whether the first malicious classification score satisfies a confidence threshold; analyzing, by the URL analysis system, the URL based on third-party malicious URL detection results associated with the URL and determined by a third-party source to determine a second malicious classification score for the URL in response to a determination that the first malicious classification score does not satisfy the confidence threshold, wherein the second malicious classification score is indicative of whether the URL is malicious; determining, by the URL analysis system, whether the second malicious classification score satisfies the confidence threshold; analyzing, by the URL analysis system, metadata related to the URL to determine a third malicious classification score for the URL in response to a determination that the second malicious classification score does not satisfy the confidence threshold, wherein the third malicious classification score is indicative of whether the URL is malicious; and determining whether to select the URL for additional analysis based on selection criteria not used in (i) the analysis of the URL using the ensemble lexical analysis, (ii) the analysis of the URL based on third-party URL metadata, and (iii) the analysis of the metadata related to the URL.

10. The method of claim 9, wherein analyzing the URL using an ensemble lexical analysis comprises analyzing the URL using multiple lexical analysis algorithms.

11. The method of claim 10, wherein analyzing the URL using multiple lexical analysis algorithms comprises: analyzing the URL using a natural language processing algorithm to determine a first lexical analysis score; analyzing the URL using a deep learning algorithm to determine a second lexical analysis score; analyzing the URL using a non-parametric algorithm to determine a third lexical analysis score; and aggregating the first, second, and third lexical analysis scores to determine the first malicious classification score.

12. The method of claim 9, wherein the third-party malicious URL detection results comprises an indication of whether the URL is considered malicious by the third-party source.

13. The method of claim 11, wherein determining whether to sel
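
Claims 2 and 11 describe aggregating three lexical analysis scores into the first malicious classification score, and claims 6 to 8 describe treating a score as ambiguous when it falls within a reference score range. The claims do not say how the scores are aggregated or what the range is, so the following is only a sketch under stated assumptions: a weighted mean is used as the aggregation, the default range is 0.4 to 0.6 on a 0-to-1 scale, and the function names aggregate_scores and is_ambiguous are hypothetical.

```python
from typing import Optional, Sequence


def aggregate_scores(scores: Sequence[float],
                     weights: Optional[Sequence[float]] = None) -> float:
    """Combine per-algorithm lexical scores into one score.

    Assumption: a weighted mean. The claims only say "aggregate",
    so any other combining rule would fit the text equally well.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if weights is None:
        weights = [1.0] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("scores and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(scores, weights)) / total


def is_ambiguous(score: float, low: float = 0.4, high: float = 0.6) -> bool:
    """Return True when the score falls within the reference score range.

    Assumption: the range bounds are inclusive and default to 0.4..0.6.
    """
    return low <= score <= high


if __name__ == "__main__":
    # Hypothetical scores from the NLP, deep learning, and
    # non-parametric analyzers, in that order.
    first_score = aggregate_scores([0.2, 0.4, 0.6])
    if is_ambiguous(first_score):
        # Per claim 6, an ambiguous score would trigger active-learning
        # training of the lexical ensemble analyzer; here we only report it.
        print("ambiguous score, candidate for active learning:", first_score)
```

A weighted mean keeps the aggregate on the same scale as the inputs, which lets the same reference range be applied to any stage's score.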

Assignees

Intel Corp

Classifications

  • by monitoring network traffic (monitoring network traffic per se H04L43/00) · CPC title

  • Static detection · CPC title

  • Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities · CPC title

  • Physics · mapped topic

  • Ensemble learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10218716B2 cover?
A multi-stage system for analyzing a Uniform Resource Locator (URL). The first stage uses an ensemble lexical analysis, the second uses third-party detection results, and the third uses metadata related to the URL. The system advances to the next stage only when the malicious classification score from the current stage does not satisfy a confidence threshold.
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification H04L63/1408. Mapped technology areas include Electricity.
When was this patent published?
Publication date: Feb 26, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).