What technology area does this patent fall under?

Primary CPC classification H04L43/08. Mapped technology areas include Electricity.

When was this patent published?

Publication date Jul 4, 2017 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Predictive model for anomaly detection and feedback-based scheduling

US9699049B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9699049-B2
Application number	US-201414586381-A
Country	US
Kind code	B2
Filing date	Dec 30, 2014
Priority date	Sep 23, 2014
Publication date	Jul 4, 2017
Grant date	Jul 4, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an example embodiment, clusters of nodes in a network are monitored. Then the monitored data may be stored in an open time-series database. Data from the open time-series database is collected and labeled as training data. Then a model is built through machine learning using the training data. Additional data is retrieved from the open time-series database. The additional data is left as unlabeled. Anomalies in the unlabeled data are computed using the model, producing prediction outcomes and metrics. Finally, the prediction outcomes and metrics are used to move or reduce workloads from problematic clusters of nodes in the network.
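The flow in the abstract (train on labeled data, score unlabeled data, feed outcomes to a scheduler) can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionaries stand in for the open time-series database, the model is a simple mean and standard deviation baseline rather than anything the patent specifies, and all names (`build_model`, `predict`, `schedule`, the cluster and job labels, `z_limit`) are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical stand-in for the open time-series database:
# cluster name -> list of load samples (values invented for illustration).
TRAINING = {"cluster-a": [0.40, 0.42, 0.38, 0.41],
            "cluster-b": [0.39, 0.43, 0.40, 0.37]}
LIVE = {"cluster-a": [0.41, 0.39], "cluster-b": [0.95, 0.97]}


def build_model(training):
    """Fit a baseline (mean, std) from data labeled as training data."""
    samples = [x for series in training.values() for x in series]
    return mean(samples), pstdev(samples)


def predict(model, live, z_limit=3.0):
    """Score unlabeled data: True means the cluster looks anomalous."""
    mu, sigma = model
    return {c: abs(mean(s) - mu) / sigma > z_limit for c, s in live.items()}


def schedule(placements, outcomes):
    """Move jobs off clusters flagged as problematic, round-robin onto healthy ones."""
    healthy = sorted(c for c, bad in outcomes.items() if not bad)
    result, i = {}, 0
    for job, cluster in placements.items():
        if outcomes.get(cluster, False) and healthy:
            result[job] = healthy[i % len(healthy)]
            i += 1
        else:
            result[job] = cluster  # keep the placement if healthy or nowhere to go
    return result


outcomes = predict(build_model(TRAINING), LIVE)
plan = schedule({"job-1": "cluster-a", "job-2": "cluster-b"}, outcomes)
print(outcomes, plan)
```

In this toy run, the job on the overloaded cluster is reassigned to the healthy one; a real scheduler would also weigh capacity and the metrics that accompany each prediction.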

First claim

Opening claim text (preview).

What is claimed is:
1. A system comprising: an open time-series database; a scheduler; a monitoring agent executable by one or more processors and configured to monitor clusters of nodes in a network and store monitored data in the open time-series database; an offline training module comprising: a data collection and preprocessing module configured to collect first data from the open time-series database and to label the first data as training data; and a machine learning model building module configured to build a model through machine learning using the training data; a real-time testing module comprising: a data collection and preprocessing module configured to collect second data from the open time-series database and to leave the second data as unlabeled; and a predictive model engine configured to compute anomalies in the unlabeled data using the model built by the machine learning model and to output prediction outcomes and metrics to the scheduler; and the scheduler configured to use the prediction outcomes and metrics to move or reduce workloads from problematic clusters of nodes in the network.
2. The system of claim 1, wherein the scheduler comprises an extension to a YARN scheduler.
3. The system of claim 2, wherein the extension comprises: a scheduler feedback language parser configured to parse feedback information written in a scheduler feedback language.
4. The system of claim 2, wherein the extension comprises: a feedback agent configured to interact with the predictive model engine to receive feedback information.
5. The system of claim 2, wherein the extension comprises: a feedback policy module configured to take scheduling rules and generate an execution plan based on the scheduling rules and feedback from a feedback agent.
6. The system of claim 2, wherein the extension comprises: an action executor configured to execute a scheduling created based on feedback from a feedback agent.
7. The system of claim 1, wherein the scheduler is contained in a resource manager.
8. A method comprising: monitoring clusters of nodes in a network; storing monitored data in an open time-series database; collecting data from the open time-series database and labeling it as training data; building a model through machine learning using the training data; collecting additional data from the open time-series database; leaving the additional data as unlabeled; computing anomalies in the unlabeled data using the model, producing prediction outcomes and metrics; and using the prediction outcomes and metrics to move or reduce workloads from problematic clusters of nodes in the network.
9. The method of claim 8, wherein the computing anomalies includes building a model using a training data set using Multivariate Gaussian Distribution.
10. The method of claim 8, wherein the computing anomalies includes applying a Matthews Correlation coefficient as a threshold to reduce false positives.
11. The method of claim 8, wherein the computing anomalies includes applying a half total error rate as a threshold to reduce false positives.
12. The method of claim 8, wherein the computing anomalies includes defining a function to calculate an anomaly score of data nodes.
13. The method of claim 8, wherein the using the prediction outcomes includes: detecting that a data node is anomalous; in response to the detection that the data node is anomalous, locating one or more features contributing to the anomaly.
14. The method of claim 13, wherein the locating includes deducing one or more features contributing to the anomaly using a single-variate Gaussian Distribution Function.
15. A non-transitory machine-readable storage medium embodying instructions which, when executed by a machine, cause the machine to execute operations comprising: monitoring clusters of nodes in a network; storing monitored data in an open time-series database; collecting data from the open time-series database and labeling it as training data; building a model through machine learning using the training data; collecting additional data from the open time-series database; leaving the additional data as unlabeled; computing anomalies in the unlabeled data using the model, producing prediction outcomes and metrics; and using the prediction outcomes and metrics to move or reduce workloads from problematic clusters of nodes in the network.
16. The non-transitory machine-readable storage medium of claim 15, wherein the computing anomalies includes building a model using a training data set using Multivariate Gaussian Distribution.
17. The non-transitory machine-readable storage medium of claim 15, wherein the computing anomalies includes applying a Matthews Correlation coefficient as a threshold to reduce false positives.
18. The non-transitory machine-readable storage medium of claim 15, wherein the computing anomalies includes applying a half total error rate as a threshold to reduce false positives.
19. The non-transitory machine-readable storage medium of claim 15, wherein the computing anomalies includes defining a function to calculate an anomaly score of data nodes.
20. The non-transitory machine-readable storage medium of claim 15, wherein the using the prediction outcomes includes: detecting that a data node is anomalous; in response to the detection that the data node is anomalous, locating one or more features contributing to the anomaly.
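Claims 9, 10, 13 and 14 name three techniques: a multivariate Gaussian model, a Matthews correlation coefficient (MCC) threshold, and a single-variate Gaussian for locating the contributing feature. The sketch below shows one plausible reading of how they fit together; it is not taken from the patent description. Assumptions: two invented metrics per node (CPU and memory load), a non-singular covariance matrix, a density cutoff chosen by maximizing MCC on a small labeled validation set, and features ranked by their univariate density relative to its peak. All function names and data values are hypothetical.

```python
import math

def fit(rows):
    """Estimate the mean vector and covariance matrix from training rows."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / n
            for j in range(d)] for i in range(d)]
    return mu, cov

def solve_and_det(a, b):
    """Solve a x = b by Gaussian elimination (partial pivoting); also return det(a)."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    det = 1.0
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if p != k:
            m[k], m[p] = m[p], m[k]
            det = -det
        det *= m[k][k]
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x, det

def density(x, mu, cov):
    """Multivariate Gaussian density p(x); a low value suggests an anomaly."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    sol, det = solve_and_det(cov, diff)
    quad = sum(di * si for di, si in zip(diff, sol))
    return math.exp(-0.5 * quad) / math.sqrt((2 * math.pi) ** len(x) * det)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when a marginal count is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def best_threshold(scores, labels):
    """Pick the cutoff eps (flag when p <= eps) that maximizes MCC."""
    best_eps, best_mcc = None, -2.0
    for eps in sorted(set(scores)):
        flags = [s <= eps for s in scores]
        tp = sum(f and y for f, y in zip(flags, labels))
        fp = sum(f and not y for f, y in zip(flags, labels))
        fn = sum(y and not f for f, y in zip(flags, labels))
        tn = len(labels) - tp - fp - fn
        score = mcc(tp, tn, fp, fn)
        if score > best_mcc:
            best_eps, best_mcc = eps, score
    return best_eps

def culprit(x, mu, cov):
    """Index of the feature whose univariate Gaussian density is lowest relative to its peak."""
    rel = [math.exp(-((xi - mi) ** 2) / (2 * cov[j][j]))
           for j, (xi, mi) in enumerate(zip(x, mu))]
    return rel.index(min(rel))

# Invented (cpu, memory) samples; labels: 1 = anomalous, 0 = normal.
TRAIN = [[0.40, 0.50], [0.42, 0.55], [0.38, 0.48], [0.41, 0.49], [0.39, 0.53]]
VALID = [[0.40, 0.51], [0.41, 0.52], [0.95, 0.50], [0.39, 0.90]]
LABELS = [0, 0, 1, 1]

mu, cov = fit(TRAIN)
scores = [density(x, mu, cov) for x in VALID]
eps = best_threshold(scores, LABELS)
for x, s in zip(VALID, scores):
    if s <= eps:
        print(x, "anomalous; feature index", culprit(x, mu, cov))
```

The half total error rate named in claims 11 and 18, the average of the false positive rate and the false negative rate, could be swapped in for `mcc` as the selection criterion, minimizing it instead of maximizing.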

Assignees

Ebay Inc

Inventors

Classifications

H04L41/149
for prediction of maintenance · CPC title
H04L41/122
of virtualised topologies, e.g. software-defined networks [SDN] or network function virtualisation [NFV] · CPC title
H04L41/0896
Bandwidth or capacity management, i.e. automatically increasing or decreasing capacities (flow or congestion control using dynamic resource allocation, e.g. in-call renegotiation, H04L47/76) · CPC title
H04L43/08 · Primary
Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters · CPC title
G06N99/005
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 55526882

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9699049B2 cover?: In an example embodiment, clusters of nodes in a network are monitored. Then the monitored data may be stored in an open time-series database. Data from the open time-series database is collected and labeled as training data. Then a model is built through machine learning using the training data. Additional data is retrieved from the open time-series database. The additional data is left as …
Who is the assignee on this patent?: Ebay Inc

Related patents

Methods and apparatus for predictive capacity allocation

Workload optimization, scheduling, and placement for rack-scale architecture computing systems

Scheduling predictive models for machine learning systems

Predicting route utilization and non-redundant failures in network environments
