Block-based anomaly detection in computing environments

US11797411B2 · US · B2

Patent metadata
Field | Value
Publication number | US-11797411-B2
Application number | US-202217696337-A
Country | US
Kind code | B2
Filing date | Mar 16, 2022
Priority date | Oct 3, 2019
Publication date | Oct 24, 2023
Grant date | Oct 24, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An anomaly service receives log data from nodes in a computing environment, which includes a sequence of information indicative of log messages produced by the nodes. The anomaly service identifies dominant patterns in the sequence of information that are representative of non-anomalous blocks of the log messages. Having identified the dominant patterns, the service is able to extract the non-anomalous blocks from the log data to reveal anomalous blocks that do not fit the dominant patterns. The service may then generate anomaly vectors based on the anomalous blocks, which can be distributed to the nodes to detect anomalies.
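
The service-side flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes log messages have already been reduced to short tokens, treats fixed-length windows as candidate blocks, and uses a plain frequency threshold as the dominance test. The names `find_dominant_patterns`, `extract_anomalous_blocks`, and `to_vector`, and the parameters `block_len` and `min_share`, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_dominant_patterns(sequence, block_len=3, min_share=0.2):
    """Return the fixed-length blocks whose share of all windows is >= min_share.

    Assumption: "dominant" is approximated by raw frequency over sliding windows.
    """
    windows = [tuple(sequence[i:i + block_len])
               for i in range(len(sequence) - block_len + 1)]
    counts = Counter(windows)
    total = len(windows)
    return {block for block, n in counts.items() if n / total >= min_share}


def extract_anomalous_blocks(sequence, dominant, block_len=3):
    """Mask every position covered by a dominant block; return what is left,
    grouped into contiguous runs (the candidate anomalous blocks)."""
    covered = [False] * len(sequence)
    for i in range(len(sequence) - block_len + 1):
        if tuple(sequence[i:i + block_len]) in dominant:
            for j in range(i, i + block_len):
                covered[j] = True

    blocks, current = [], []
    for item, is_covered in zip(sequence, covered):
        if is_covered:
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(item)
    if current:
        blocks.append(current)
    return blocks


def to_vector(block, vocabulary):
    """Encode a block as token counts over a fixed vocabulary (an assumed encoding)."""
    counts = Counter(block)
    return [counts[token] for token in vocabulary]


# Hypothetical token stream: a repeating A-B-C routine interrupted by X, Y.
sequence = ["A", "B", "C"] * 5 + ["X", "Y"] + ["A", "B", "C"] * 5
dominant = find_dominant_patterns(sequence)
anomalous = extract_anomalous_blocks(sequence, dominant)
vocabulary = sorted(set(sequence))
vectors = [to_vector(block, vocabulary) for block in anomalous]
print(anomalous)  # [['X', 'Y']]
print(vectors)    # [[0, 0, 0, 1, 1]]
```

In this toy stream the three rotations of the A-B-C routine account for most windows, so removing them leaves only the X, Y interruption. A real deployment would need a more careful block definition and scoring function than a fixed window and threshold.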

First claim

Opening claim text (preview).

The invention claimed is:

1. A computing apparatus comprising: one or more computer readable storage media; one or more processors operatively coupled with the one or more computer readable storage media; and program instructions stored on the one or more computer readable storage media that, when executed by the one or more processors, direct the computing apparatus to at least: receive log data from a plurality of nodes in a computing environment, wherein the log data comprises a sequence of information indicative of log messages produced by the plurality of nodes; identify, in the sequence of information, dominant patterns representative of high-probability blocks of the log messages, wherein the dominant patterns are identified based on a frequency of their associated high-probability blocks in the log data; reveal low-probability blocks that do not fit the dominant patterns by extracting the high-probability blocks from the log data based on the dominant patterns; generate low-probability vectors based at least on the low-probability blocks; and distribute the low-probability vectors to at least one node of the plurality of nodes, wherein the at least one node detects low-probability events using the low-probability vectors.

2. The computing apparatus of claim 1 wherein to identify the dominant patterns within the sequence of information, the program instructions direct the computing apparatus to identify potential patterns within the sequence of information and select the dominant patterns from the potential patterns based at least on a scoring function applied to one or more of the potential patterns.

3. The computing apparatus of claim 2 wherein the scoring function promotes a subset of the potential patterns that occur frequently within the sequence of information relative to a different subset of the potential patterns that occur less frequently within the sequence of information.

4. The computing apparatus of claim 3 wherein the scoring function determines, for each potential pattern of the potential patterns, a relative dominance of the potential pattern based on a description length of the sequence of information when encoded with a compressed representation of the potential pattern.

5. The computing apparatus of claim 1 wherein the at least one node, to detect low-probability events using the low-probability vectors: generates hash values based on log messages produced at the at least one node; generates a sequence vector based on the hash values; compares the sequence vector to the low-probability vectors; and determines whether the sequence vector matches one or more of the low-probability vectors indicating a low-probability event.

6. The computing apparatus of claim 5 wherein the at least one node, to determine whether the sequence vector matches one or more of the low-probability vectors, employs a similarity function that determines whether the sequence vector is a sufficient match to one or more of the low-probability vectors.

7. The computing apparatus of claim 1 wherein the at least one node predicts low-probability events using the low-probability vectors.

8. One or more computer-readable storage media having program instructions stored thereon, wherein the program instructions, when read and executed by a processing system, direct the processing system to at least: receive log data from a plurality of nodes in a computing environment, wherein the log data comprises a sequence of information indicative of log messages produced by the plurality of nodes; identify, in the sequence of information, dominant patterns representative of high-probability blocks of the log messages, wherein the dominant patterns are identified based on a frequency of their associated high-probability blocks in the log data; reveal low-probability blocks that do not fit the dominant patterns by extracting the high-probability blocks from the log data based on the dominant patterns; generate low-probability vectors based at least on the low-probability blocks; and distribute the low-probability vectors to at least one node of the plurality of nodes wherein the at least one node detects low-probability events using the low-probability vectors.

9. The one or more computer-readable storage media of claim 8 wherein to identify the dominant patterns within the sequence of information, the program instructions, when executed by the processing system, direct the processing system to identify potential patterns within the sequence of information and select the dominant patterns from the potential patterns based at least on a scoring function applied to one or more of the potential patterns.

10. The one or more computer-readable storage media of claim 9 wherein the scoring function promotes a subset of the potential patterns that occur frequently within the sequence of information relative to a different subset of the potential patterns that occur less frequently within the sequence of information.

11. The one or more computer-readable storage media of claim 10 wherein the scoring function determines, for each potential pattern of the potential patterns, a relative dominance of the potential pattern based on a description length of the sequence of information when encoded with a compressed representation of the potential pattern.

12. The one or more computer-readable storage media of claim 8 wherein the at least one node, to detect low-probability events using the low-probability vectors: generates hash values based on log messages produced at the at least one node; generates a sequence vector based on the hash values; compares the sequence vector to the low-probability vectors; and determines whether the sequence vector matches one or more of the low-probability vectors indicating a low-probability event.

13. The one or more computer-readable storage media of claim 12 wherein the at least one node, to determine whether the sequence vector matches one or more of the low-probability vectors, employs a similarity function that determines whether the sequence vector is a sufficient match to one or more of the low-probability vectors.

14. The one or more computer-readable storage media of claim 8 wherein the at least one node predicts low-probability events using the low-probability vectors.

15. A method comprising: receiving log data from a plurality of nodes in a computing environment, wherein the log data comprises a sequence of information indicative of log messages produced by the plurality of nodes; identifying, in the sequence of information, dominant patterns representative of high-probability blocks of the log messages, wherein the dominant patterns are identified based on a frequency of their associated high-probability blocks in the log data; revealing low-probability blocks that do not fit the dominant patterns by extracting the high-probability blocks from the log data based on the dominant patterns; generating low-probability vectors based at least on the low-probability blocks; and distributing the low-probability vectors to at least one node of the plurality of nodes, wherein the at least one node detects low-probability events using the low-probability vectors.

16. The method of claim 15 wherein identifying the dominant patterns within the sequence of information comprises identifying potential patterns within the sequence of information and selecting the dominant patterns from the potential patterns based at least on a scoring function applied to one or more of the potential patterns.

17. The method of claim 16 wherein the sco
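
The node-side steps recited in claims 5, 6, 12, and 13 (hash the local log messages, build a sequence vector, compare it against the distributed low-probability vectors with a similarity function) can be sketched in Python as follows. This is only an illustration under stated assumptions: the claims do not specify the hash, the vector encoding, or the similarity function, so SHA-256 bucketing, a bucket-count vector, cosine similarity, and the `buckets` and `threshold` values are all stand-ins, and every function name here is hypothetical.

```python
import hashlib
import math


def hash_message(message, buckets=64):
    """Map a log message to a bucket index (assumed: SHA-256, first 8 bytes)."""
    digest = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def sequence_vector(messages, buckets=64):
    """Build a vector of per-bucket counts from the hashed messages.

    Assumption: a bag-of-hashes encoding; it ignores message order.
    """
    vec = [0] * buckets
    for message in messages:
        vec[hash_message(message, buckets)] += 1
    return vec


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect(messages, low_probability_vectors, threshold=0.9):
    """Return True if the local window is a sufficient match to any distributed vector."""
    vec = sequence_vector(messages)
    return any(cosine_similarity(vec, v) >= threshold
               for v in low_probability_vectors)


# Hypothetical usage: the service distributed one vector built from a known
# low-probability block; the node checks a window of its own recent messages.
known_block = ["disk failure on sda", "remount read-only"]
distributed = [sequence_vector(known_block)]
print(detect(known_block, distributed))  # True
print(detect([], distributed))           # False
```

Using a threshold rather than exact equality reflects the "sufficient match" wording in claims 6 and 13; the value 0.9 is arbitrary and would need tuning against real log data.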

Assignees

Oracle Int Corp

Inventors

Classifications

  • the data filtering being achieved in order to maintain consistency among the monitored data, e.g. ensuring that the monitored data belong to the same timeframe, to the same system or component

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems

  • Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06)

  • Remedial or corrective actions (recovery from an exception in an instruction pipeline G06F9/3861; by retry G06F11/1402; for recovering from a failure of a protocol instance or entity H04L69/40)

  • Data logging (G06F11/14, G06F11/2205 take precedence)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11797411B2 cover?
An anomaly service receives log data from nodes in a computing environment, which includes a sequence of information indicative of log messages produced by the nodes. The anomaly service identifies dominant patterns in the sequence of information that are representative of non-anomalous blocks of the log messages. Having identified the dominant patterns, the service is able to extract the non-a…
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06F11/3075. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 24, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).