Systems and methods for detecting data exfiltration

US2019130100A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2019130100-A1
Application number  US-201715802262-A
Country             US
Kind code           A1
Filing date         Nov 2, 2017
Priority date       Nov 2, 2017
Publication date    May 2, 2019
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for detecting data exfiltration using domain name system (DNS) queries include, in various embodiments, performing operations that include parsing a DNS query to determine whether that DNS query is likely to contain hidden data that is being exfiltrated from a system or network. Statistical methods can be used to analyze the DNS query to determine a likelihood whether each of a plurality of segments of the DNS query are indicative of data exfiltration methods. If one or multiple DNS queries are deemed suspicious based on the analysis, a security action on the DNS query can be performed, including sending an alert and/or blocking the DNS query from being forwarded.
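The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the function names, the choice of two-character segments, the add-alpha smoothing, and the use of a mean log-probability as the aggregate are all assumptions made for illustration. The leftmost label of the query name is treated as the lowest level subdomain string.

```python
import math
from collections import Counter


def lowest_subdomain(qname: str) -> str:
    """Return the leftmost label, assumed here to be the lowest level subdomain."""
    return qname.rstrip(".").split(".")[0].lower()


def segments(label: str, n: int = 2) -> list:
    """Split a label into overlapping runs of n adjacent symbols (assumed n=2)."""
    if len(label) < n:
        return [label] if label else []
    return [label[i:i + n] for i in range(len(label) - n + 1)]


def train(queries, n: int = 2) -> Counter:
    """Count how often each segment appears in a training set of query names."""
    counts = Counter()
    for q in queries:
        counts.update(segments(lowest_subdomain(q), n))
    return counts


def score(qname: str, counts: Counter, n: int = 2, alpha: float = 1.0) -> float:
    """Mean log-probability of the query's segments under the training counts.

    Add-alpha smoothing (an assumption) keeps unseen segments from
    producing log(0). Lower scores mean the label looks less like
    the training data.
    """
    segs = segments(lowest_subdomain(qname), n)
    if not segs:
        return 0.0
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserved for unseen segments
    log_probs = [
        math.log((counts[s] + alpha) / (total + alpha * vocab)) for s in segs
    ]
    return sum(log_probs) / len(log_probs)


legit = ["www.example.com", "mail.example.com",
         "login.example.com", "static.example.com"]
model = train(legit)
familiar = score("www.example.com", model)
odd = score("x9q7zk2v.example.com", model)
```

In this toy run `familiar` comes out higher than `odd`, because none of the two-character segments of `x9q7zk2v` appear in the training labels. A real deployment would need a far larger training set and a cutoff chosen from the distribution of training scores.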

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: a non-transitory memory; and one or more hardware processors coupled to the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to perform operations comprising: receiving a domain name system (DNS) query; parsing the DNS query to determine a plurality of segments within the DNS query; determining a likelihood that each of the plurality of segments appears in a training set of DNS queries; aggregating the likelihoods of each of the plurality of segments; comparing the aggregate of the likelihoods to a cutoff threshold; and in response to determining that the aggregate of the likelihoods is below the cutoff threshold: determining that the DNS query is suspicious; and responsive to one or more previous DNS queries also being suspicious, performing a security action.

2. The system of claim 1, wherein each of the plurality of segments comprises one or more symbols from a lowest level subdomain string in the DNS query.

3. The system of claim 1, wherein each of the plurality of segments comprises a plurality of adjacent symbols from a lowest level subdomain string in the DNS query.

4. The system of claim 1, wherein the likelihood is a probability.

5. The system of claim 1, wherein aggregating the likelihoods of each of the plurality of segments comprises determining a cross entropy between likelihoods that each of the plurality of segments is included in the DNS query and the likelihood that each of the plurality of segments appears in the training set of DNS queries.

6. The system of claim 1, wherein in response to determining that the aggregate of the likelihoods is below the cutoff threshold the operations further comprise: incrementing a counter; comparing the counter to a count threshold; and in response to the counter exceeding the count threshold, performing the security action.

7. The system of claim 6, wherein: the counter is associated with a domain; and the counter is reset at periodic intervals.

8. The system of claim 1, wherein the cutoff threshold is determined according to a confidence interval based on a distribution of segments in the training set of DNS queries.

9. The system of claim 1, wherein the security action comprises one or more of sending an alert or blocking forwarding of the DNS query.

10. The system of claim 1, wherein the system is a firewall.

11. A method of detecting exfiltration, the method comprising: receiving a domain name system (DNS) query; parsing a lowest level subdomain string from the DNS query to determine a plurality of segments; determining a probability that each of the plurality of segments occurs in a training set of legitimate DNS queries; determining a likelihood of legitimacy measure by aggregating the probabilities of each of the plurality of segments; comparing the likelihood of legitimacy measure to a cutoff threshold; and in response to determining that the likelihood of legitimacy measure is below the cutoff threshold: determining that the DNS query is suspicious; and responsive to one or more previous DNS queries also being suspicious, sending an alert that exfiltration is suspected.

12. The method of claim 11, wherein each of the plurality of segments comprises a plurality of adjacent symbols from the lowest level subdomain string.

13. The method of claim 11, wherein aggregating the probabilities of each of the plurality of segments comprises determining a cross entropy between a probability by which each of the plurality of segments occurs in the lowest level subdomain string and the probabilities that each of the plurality of segments occurs in lowest level subdomain strings of each of the DNS queries in the training set.

14. The method of claim 11, wherein in response to determining that the aggregate of the likelihoods is below the cutoff threshold: incrementing a counter specific to a domain in the DNS query that does not include a lowest level subdomain of the lowest level subdomain string; comparing the counter to a count threshold; and in response to the counter exceeding the count threshold, sending the alert that exfiltration is suspected.

15. The method of claim 11, wherein the cutoff threshold is determined according to a desired confidence interval based on a probability distribution of segments in lowest level subdomain strings of each of the DNS queries in the training set.

16. A non-transitory machine-readable medium having stored thereon machine-readable instructions executable to cause a machine to perform operations comprising: receiving a domain name system (DNS) query; parsing the DNS query to determine a plurality of candidate segments, each of the candidate segments including characters from a lowest level subdomain string of the DNS query; determining a probability that each of the plurality of candidate segments occurs in a training set of legitimate DNS queries; determining a cross entropy based on the probabilities of each of the plurality of candidate segments; comparing the cross entropy to a cutoff threshold; and in response to determining that the cross entropy is below the cutoff threshold: determining that the DNS query is suspicious; and responsive to one or more previous DNS queries also being suspicious, sending an alert that exfiltration is suspected incrementing a counter; comparing the counter to a count threshold; and in response to the counter exceeding the count threshold, sending an alert that exfiltration is suspected.

17. The machine-readable medium of claim 16, wherein each of the plurality of candidate segments includes adjacent characters from the lowest level subdomain string.

18. The machine-readable medium of claim 16, wherein in response to determining that the cross entropy is below the cutoff threshold the operations further comprise: incrementing a counter, the counter corresponding to a number of DNS queries whose aggregated probabilities are below the cutoff threshold during a known interval; comparing the counter to a count threshold; and in response to the counter exceeding the count threshold, sending an alert that exfiltration is suspected.

19. The machine-readable medium of claim 16, wherein the operations further comprise: parsing lowest level subdomain strings from each of the legitimate DNS queries in the training set of legitimate DNS queries to determine a plurality of legitimate segments; and determining a probability distribution for the plurality of legitimate segments; wherein determining the probability that each of the plurality of candidate segments occurs in the training set of legitimate DNS queries comprises determining the probability of each of the candidate segments according to the probability distribution.

20. The machine-readable medium of claim 19, wherein: the cutoff threshold is determined according to a desired confidence interval based on the probability distribution for the plurality of legitimate segments; and the desired confidence interval is selected so that a desired percentage of the legitimate DNS queries in the training set would have a corresponding aggregated probability of segments in a corresponding lowest level subdomain string at or above the cutoff threshold.
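The thresholding and counting steps recited in the dependent claims (a cutoff chosen so that a desired share of legitimate training queries score at or above it, and a per-domain counter that is reset periodically) can be sketched as follows. This is a hedged illustration only: `cutoff_from_scores`, `parent_domain`, `SuspicionCounter`, the empirical-quantile rule, and the single shared reset window are hypothetical choices, not details taken from the patent. Scores are assumed to be precomputed per query, with lower values meaning less like the training data.

```python
import math
import time
from collections import defaultdict


def cutoff_from_scores(legit_scores, keep_fraction=0.99):
    """Pick a cutoff so that about keep_fraction of training scores are >= it.

    Uses a plain empirical quantile (an assumption); the small epsilon
    guards against floating-point error in the index arithmetic.
    """
    ordered = sorted(legit_scores)
    index = math.floor((1.0 - keep_fraction) * len(ordered) + 1e-9)
    return ordered[min(index, len(ordered) - 1)]


def parent_domain(qname):
    """Return the query name without its leftmost (lowest level) label."""
    parts = qname.rstrip(".").lower().split(".", 1)
    return parts[1] if len(parts) > 1 else parts[0]


class SuspicionCounter:
    """Counts suspicious queries per parent domain within a reset window."""

    def __init__(self, count_threshold=3, window_seconds=60.0,
                 clock=time.monotonic):
        self.count_threshold = count_threshold
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be tested
        self.counts = defaultdict(int)
        self.window_start = clock()

    def observe(self, qname, score, cutoff):
        """Return True when the caller should take a security action."""
        now = self.clock()
        if now - self.window_start >= self.window_seconds:
            self.counts.clear()  # periodic reset of all counters
            self.window_start = now
        if score >= cutoff:
            return False  # not suspicious, nothing to count
        domain = parent_domain(qname)
        self.counts[domain] += 1
        return self.counts[domain] > self.count_threshold
```

Counting per parent domain rather than per full query name reflects the fact that exfiltrated data would vary the lowest level label on every query while the receiving domain stays fixed. What the caller does when `observe` returns `True` (alerting, or blocking the query from being forwarded) is left outside the sketch.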

Assignees

Paypal Inc

Inventors

Classifications

  • Physics · mapped topic

  • Tools and structures for managing or administering access control systems · CPC title

  • Protecting data integrity, e.g. using checksums, certificates or signatures · CPC title

  • G06F21/554 · Primary

    involving event detection and direct action · CPC title

  • Query processing · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019130100A1 cover?
Systems and methods for detecting data exfiltration using domain name system (DNS) queries include, in various embodiments, performing operations that include parsing a DNS query to determine whether that DNS query is likely to contain hidden data that is being exfiltrated from a system or network. Statistical methods can be used to analyze the DNS query to determine a likelihood whether each o…
Who is the assignee on this patent?
Paypal Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/554. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 2, 2019 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).