Abnormal behavior detection of enterprise entities using time-series data
US-2018176241-A1 · Jun 21, 2018 · US
US10705940B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10705940-B2 |
| Application number | US-201816145536-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 28, 2018 |
| Priority date | Sep 28, 2018 |
| Publication date | Jul 7, 2020 |
| Grant date | Jul 7, 2020 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Techniques are provided for system operational analytics using normalized likelihood scores. In one embodiment, an exemplary method comprises: obtaining data from data sources associated with a monitored system; applying at least one function to the log data to obtain a plurality of time-series counters for a plurality of distinct features within the data; processing the plurality of time-series counters using at least one machine learning model to obtain a plurality of log likelihood values representing a behavior of the monitored system over time; determining a z-score for each of the plurality of log likelihood values over a predefined short-term time window; monitoring a distribution of the z-scores for the plurality of log likelihood values over a predefined long-term time window to map the z-scores to percentile values; and mapping the percentile values to a health score for the monitored system based on predefined percentile ranges and/or a transformation function.
Opening claim text (preview).
What is claimed is: 1. A method, comprising: obtaining data from one or more data sources associated with a monitored system; applying at least one function to the data to obtain a plurality of time-series counters for a plurality of distinct features within the data; processing, using at least one processing device, the plurality of time-series counters using at least one machine learning model to obtain a plurality of log likelihood values representing a behavior of the monitored system over time; determining, using the at least one processing device, a z-score for each of the plurality of log likelihood values over a predefined short-term time window; monitoring, using the at least one processing device, a distribution of the z-scores for the plurality of log likelihood values over a predefined long-term time window to map the z-scores to percentile values; and mapping, using the at least one processing device, the percentile values to a health score for the monitored system based on one or more of predefined percentile ranges and at least one transformation function. 2. The method of claim 1 , wherein the plurality of log likelihood values represents a probability of a specific behavior to occur out of a probability distribution behavior of the monitored system. 3. The method of claim 1 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window determine a distance of a given log likelihood value from a mean of the plurality of log likelihood values divided by a standard deviation of the plurality of log likelihood values. 4. The method of claim 1 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window determine a distance of a given log likelihood value from a median of the plurality of log likelihood values. 5. The method of claim 1 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window evaluate a deviation of a given log likelihood value from a learned normal behavior of the monitored system during the predefined short-term time window. 6. The method of claim 1 , wherein the step of mapping the percentile values to the health score for the monitored system further comprises sorting the z-scores for the plurality of log likelihood values and selecting a percentage of the z-scores as corresponding to anomalous behavior based on the predefined percentile ranges. 7. The method of claim 1 , wherein one or more of the predefined short-term time window, the predefined long-term time window, and the predefined percentile ranges comprise one or more of a configured value, a policy-based value, and a default value. 8. A system, comprising: a memory; and at least one processing device, coupled to the memory, operative to implement the following steps: obtaining data from one or more data sources associated with a monitored system; applying at least one function to the data to obtain a plurality of time-series counters for a plurality of distinct features within the data; processing the plurality of time-series counters using at least one machine learning model to obtain a plurality of log likelihood values representing a behavior of the monitored system over time; determining a z-score for each of the plurality of log likelihood values over a predefined short-term time window; monitoring a distribution of the z-scores for the plurality of log likelihood values over a predefined long-term time window to map the z-scores to percentile values; and mapping the percentile values to a health score for the monitored system based on one or more of predefined percentile ranges and at least one transformation function. 9. The system of claim 8 , wherein the plurality of log likelihood values represents a probability of a specific behavior to occur out of a probability distribution behavior of the monitored system. 10. The system of claim 8 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window determine a distance of a given log likelihood value from a mean of the plurality of log likelihood values divided by a standard deviation of the plurality of log likelihood values. 11. The system of claim 8 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window determine a distance of a given log likelihood value from a median of the plurality of log likelihood values. 12. The system of claim 8 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window evaluate a deviation of a given log likelihood value from a learned normal behavior of the monitored system during the predefined short-term time window. 13. The system of claim 8 , wherein the step of mapping the percentile values to the health score for the monitored system further comprises sorting the z-scores for the plurality of log likelihood values and selecting a percentage of the z-scores as corresponding to anomalous behavior based on the predefined percentile ranges. 14. The system of claim 8 , wherein one or more of the predefined short-term time window, the predefined long-term time window, and the predefined percentile ranges comprise one or more of a configured value, a policy-based value, and a default value. 15. A computer program product, comprising a non-transitory machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by at least one processing device perform the following steps: obtaining data from one or more data sources associated with a monitored system; applying at least one function to the data to obtain a plurality of time-series counters for a plurality of distinct features within the data; processing the plurality of time-series counters using at least one machine learning model to obtain a plurality of log likelihood values representing a behavior of the monitored system over time; determining a z-score for each of the plurality of log likelihood values over a predefined short-term time window; monitoring a distribution of the z-scores for the plurality of log likelihood values over a predefined long-term time window to map the z-scores to percentile values; and mapping the percentile values to a health score for the monitored system based on one or more of predefined percentile ranges and at least one transformation function. 16. The computer program product of claim 15 , wherein the plurality of log likelihood values represents a probability of a specific behavior to occur out of a probability distribution behavior of the monitored system. 17. The computer program product of claim 15 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window determine a distance of a given log likelihood value from a mean of the plurality of log likelihood values divided by a standard deviation of the plurality of log likelihood values. 18. The computer program product of claim 15 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window determine a distance of a given log likelihood value from a median of the plurality of log likelihood values. 19. The computer program product of claim 15 , wherein the z-scores for the plurality of log likelihood values over the predefined short-term time window evaluate a deviation of a given log likelihood value from a learned normal
Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title
by assessing time · CPC title
Performance evaluation by statistical analysis · CPC title
Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06) · CPC title
Monitoring involving counting · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.