Distribution-Based Analysis Of Queries For Anomaly Detection With Adaptive Thresholding

US2019102553A1 · US · A1

Patent metadata
Publication number: US-2019102553-A1
Application number: US-201816143239-A
Country: US
Kind code: A1
Filing date: Sep 26, 2018
Priority date: Sep 30, 2017
Publication date: Apr 4, 2019
Grant date: (none listed)

Abstract

Official abstract text for this publication.

Techniques for detecting an anomaly in queries of a relational database are disclosed. The techniques include identifying a set of attribute values from a query for accessing data within a database. Based on previously received queries, at least one of a joint probability for the set of attribute values or individual probabilities for the set of attribute values is determined. When at least one of the joint probability for the set of attribute values or an individual probability for one or more attribute values in the set of attribute values does not satisfy a probability cutoff, an indication that the query is anomalous is outputted.
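The flow described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation. It assumes each attribute's distribution is a simple frequency histogram built from past queries, that unseen values are handled with add-one smoothing, and that the joint probability is approximated as the product of the individual probabilities (an independence assumption made here for brevity). The class name, the attribute names `table` and `command`, and the cutoff values are all hypothetical.

```python
from collections import Counter
from math import prod


class QueryAnomalyDetector:
    """Hypothetical sketch: one frequency histogram per query attribute."""

    def __init__(self, attributes, smoothing=1.0):
        self.attributes = list(attributes)
        self.smoothing = smoothing
        self.counts = {a: Counter() for a in self.attributes}
        self.total = 0

    def train(self, past_queries):
        """Update the histograms from previously received queries (dicts)."""
        for query in past_queries:
            for a in self.attributes:
                self.counts[a][query[a]] += 1
            self.total += 1

    def individual_probabilities(self, query):
        """Look up each attribute value in its histogram.

        One lookup per attribute, so the cost grows linearly with the
        number of attributes, not with the number of past queries.
        """
        probs = {}
        for a in self.attributes:
            hist = self.counts[a]
            # Add-one smoothing; the extra bucket stands for unseen values.
            denom = self.total + self.smoothing * (len(hist) + 1)
            probs[a] = (hist[query[a]] + self.smoothing) / denom
        return probs

    def is_anomalous(self, query, individual_cutoff, joint_cutoff):
        """Return (flag, outlier_attributes) for one query."""
        probs = self.individual_probabilities(query)
        outliers = [a for a, p in probs.items() if p < individual_cutoff]
        # ASSUMPTION: attributes are treated as independent, so the joint
        # probability is the product of the individual probabilities.
        joint = prod(probs.values())
        return bool(outliers) or joint < joint_cutoff, outliers


if __name__ == "__main__":
    detector = QueryAnomalyDetector(["table", "command"])
    history = [{"table": "orders", "command": "SELECT"}] * 9
    history.append({"table": "orders", "command": "UPDATE"})
    detector.train(history)
    print(detector.is_anomalous({"table": "users", "command": "DELETE"}, 0.1, 0.05))
    print(detector.is_anomalous({"table": "orders", "command": "SELECT"}, 0.1, 0.05))
```

The list of outlier attributes returned alongside the flag is one simple way to say which attribute values triggered the indication.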

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer readable medium comprising instructions which, when executed by one or more hardware processors, causes performance of operations comprising: identifying a set of attribute values from a query for accessing data within a database; determining, based on previously received queries, at least one of a joint probability for the set of attribute values or individual probabilities for the set of attribute values; and when at least one of the joint probability for the set of attribute values or an individual probability for one or more attribute values in the set of attribute values does not satisfy a probability cutoff, outputting an indication that the query is anomalous.

2. The medium of claim 1, wherein the operations further comprise storing a probability distribution for each attribute in a set of query attributes; and wherein determining, based on previously received queries, at least one of the joint probability for the set of attribute values or individual probabilities for the set of attribute values comprises comparing each attribute value in the set of attribute values from the query to the probability distribution for a corresponding attribute in the set of query attributes.

3. The medium of claim 2, wherein comparing each attribute value in the set of attribute values from the query is performed in a time that is linear to how many attributes exist in the set of attribute values and does not depend on variability of the previously received queries.

4. The medium of claim 2, wherein the probability distribution for each respective attribute is a histogram models that is trained from attribute values for the respective attribute extracted from the previously received queries.

5. The medium of claim 1, wherein the operations further comprise receiving user input indicating an acceptable false positive threshold for reporting anomalous queries; responsive to the user input, automatically adjusting the probability cutoff.

6. The medium of claim 1, wherein the indication includes an explanation of why the query is anomalous; wherein the explanation is generated based on a comparison between the set of attribute values from the query with probability distributions for different query attributes and highlighting a subset of the probability distributions when an attribute value is deemed anomalous.

7. The medium of claim 1, the operations further comprising assigning a risk level to the query based on how many of the attribute values in the set of attribute values are outliers and whether the joint probability for the set of attribute values satisfies the probability cutoff.

8. The medium of claim 1, wherein the set attribute values includes a first attribute value for a semantic attribute and a second attribute value for a non-semantic attribute.

9. The medium of claim 1, wherein the operations further comprise preventing execution of the query.

10. A method comprising: identifying a set of attribute values from a query for accessing data within a database; determining, based on previously received queries, at least one of a joint probability for the set of attribute values or individual probabilities for the set of attribute values; and when at least one of the joint probability for the set of attribute values or an individual probability for one or more attribute values in the set of attribute values does not satisfy a probability cutoff, outputting an indication that the query is anomalous.

11. The method of claim 10, wherein the method further comprise storing a probability distribution for each attribute in a set of query attributes; and wherein determining, based on previously received queries, at least one of the joint probability for the set of attribute values or individual probabilities for the set of attribute values comprises comparing each attribute value in the set of attribute values from the query to the probability distribution for a corresponding attribute in the set of query attributes.

12. The method of claim 11, wherein comparing each attribute value in the set of attribute values from the query is performed in a time that is linear to how many attributes exist in the set of attribute values and does not depend on variability of the previously received queries.

13. The method of claim 11, wherein the probability distribution for each respective attribute is a histogram models that is trained from attribute values for the respective attribute extracted from the previously received queries.

14. The method of claim 10, wherein the method further comprise receiving user input indicating an acceptable false positive threshold for reporting anomalous queries; responsive to the user input, automatically adjusting the probability cutoff.

15. The method of claim 10, wherein the indication includes an explanation of why the query is anomalous; wherein the explanation is generated based on a comparison between the set of attribute values from the query with probability distributions for different query attributes and highlighting a subset of the probability distributions when an attribute value is deemed anomalous.

16. The method of claim 10, the method further comprising assigning a risk level to the query based on how many of the attribute values in the set of attribute values are outliers and whether the joint probability for the set of attribute values satisfies the probability cutoff.

17. The method of claim 10, wherein the set attribute values includes a first attribute value for a semantic attribute and a second attribute value for a non-semantic attribute.

18. The method of claim 10, wherein the method further comprise preventing execution of the query.

19. A system comprising: one or more hardware processors; one or more non-transitory computer readable media storing instructions which, when executed by the one or more hardware processors, causes performance of operations comprising: identifying a set of attribute values from a query for accessing data within a database; determining, based on previously received queries, at least one of a joint probability for the set of attribute values or individual probabilities for the set of attribute values; and when at least one of the joint probability for the set of attribute values or an individual probability for one or more attribute values in the set of attribute values does not satisfy a probability cutoff, outputting an indication that the query is anomalous.

20. The system of claim 19, wherein the operations further comprise storing a probability distribution for each attribute in a set of query attributes; and wherein determining, based on previously received queries, at least one of the joint probability for the set of attribute values or individual probabilities for the set of attribute values comprises comparing each attribute value in the set of attribute values from the query to the probability distribution for a corresponding attribute in the set of query attributes.
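The dependent claims on adjusting the cutoff from an acceptable false positive threshold and on assigning a risk level can be sketched as follows. This is only one plausible reading, not the method the patent specifies. It assumes the cutoff is chosen from probabilities scored on historical queries believed to be benign, that a query is flagged when its probability is strictly below the cutoff, and that risk is reported on a four-step scale. The function names and the scale labels are hypothetical.

```python
def cutoff_for_false_positive_rate(historical_probs, max_fp_rate):
    """Pick a probability cutoff that respects an acceptable false positive rate.

    ASSUMPTION: historical_probs were scored on queries believed to be
    benign, so every one that falls below the cutoff is a false positive.
    A query is flagged when its probability is strictly below the cutoff.
    """
    if not 0.0 <= max_fp_rate <= 1.0:
        raise ValueError("max_fp_rate must be between 0 and 1")
    scores = sorted(historical_probs)
    if not scores:
        raise ValueError("need at least one historical probability")
    allowed = int(max_fp_rate * len(scores))  # benign queries we may flag
    if allowed >= len(scores):
        return float("inf")  # every query may be flagged
    # Only scores strictly below scores[allowed] are flagged, and there
    # are at most `allowed` of those.
    return scores[allowed]


def risk_level(outlier_count, joint_satisfies_cutoff):
    """Hypothetical scale combining the outlier count and the joint test."""
    if outlier_count == 0 and joint_satisfies_cutoff:
        return "none"
    if outlier_count <= 1 and joint_satisfies_cutoff:
        return "low"
    if outlier_count <= 1 or joint_satisfies_cutoff:
        return "medium"
    return "high"


if __name__ == "__main__":
    benign_scores = [0.01, 0.02, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    cutoff = cutoff_for_false_positive_rate(benign_scores, max_fp_rate=0.2)
    flagged = sum(score < cutoff for score in benign_scores)
    print(cutoff, flagged, risk_level(2, joint_satisfies_cutoff=False))
```

Choosing the cutoff as an order statistic of past scores means a looser false positive threshold raises the cutoff and a stricter one lowers it, with no manual tuning of the probability value itself.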

Assignees

Oracle Int Corp

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks

  • Protect output to user by software means

  • to a system of files or objects, e.g. local or distributed file system or database

  • involving long-term monitoring or reporting

  • Machine learning

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019102553A1 cover?
Techniques for detecting an anomaly in queries of a relational database are disclosed. The techniques include identifying a set of attribute values from a query for accessing data within a database. Based on previously received queries, at least one of a joint probability for the set of attribute values or individual probabilities for the set of attribute values is determined. When at least one…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F21/566. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 4, 2019 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).