Computationally efficient reinforcement-learning-based application manager

US10949263B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10949263-B2
Application number  US-201916518717-A
Country             US
Kind code           B2
Filing date         Jul 22, 2019
Priority date       Aug 27, 2018
Publication date    Mar 16, 2021
Grant date          Mar 16, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The current document is directed to automated reinforcement-learning-based application managers that obtain increased computational efficiency by reusing learned models and by using human-management experience to truncate state and observation vectors. Learned models of managed environments that receive component-associated inputs can be partially or completely reused for similar environments. Human managers and administrators generally use only a subset of the available metrics in managing an application, and that subset can be used as an initial subset of metrics for learning an optimal or near-optimal control policy by an automated reinforcement-learning-based application manager.
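The vector truncation described in the abstract can be illustrated with a minimal sketch. It assumes a computational-environment vector is a flat list of metric values aligned with a list of metric names; the metric names, the values, and the helper `project_metrics` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence


def project_metrics(
    metric_names: Sequence[str],
    values: Sequence[float],
    subset: Sequence[str],
) -> List[float]:
    """Drop every value whose metric is not in `subset`, keeping the original order."""
    if len(metric_names) != len(values):
        raise ValueError("metric_names and values must have the same length")
    keep = set(subset)
    return [v for name, v in zip(metric_names, values) if name in keep]


# Hypothetical metrics; the patent does not name specific ones.
ALL_METRICS = ["cpu_util", "mem_util", "disk_iops", "net_latency_ms", "gc_pause_ms"]
# Assumed subset that a human administrator watches.
HUMAN_SUBSET = ["cpu_util", "net_latency_ms"]

full_vector = [0.82, 0.40, 1200.0, 35.0, 4.5]
reduced = project_metrics(ALL_METRICS, full_vector, HUMAN_SUBSET)
print(reduced)  # [0.82, 35.0]
```

A policy trained on the reduced vector works with two dimensions here instead of five, which is the kind of saving the abstract attributes to human-management experience.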

First claim

Opening claim text (preview).

The invention claimed is:

1. An automated reinforcement-learning-based application manager that manages a computing environment that includes one or more applications and one or more of a distributed computing system having multiple computer systems interconnected by one or more networks, a standalone computer system, and a processor-controlled user device, the reinforcement-learning based application manager comprising: one or more processors, one or more memories, and one or more communications subsystems; a set of actions that can be issued to the computing environment; a set of metrics; an iterative control process that repeatedly selects and issues a next action, according to a control policy that uses a dimensionally reduced computational-environment vector that includes metric values and that represents a current state of the computational environment obtained by projecting a computational-environment vector into a vector subspace defined by a subset of metrics used by human application managers to control distributed applications, and receives a reward and one of an observation and a metric vector from the computing environment, in response to execution of the issued next action, which the control process projects into the vector subspace and uses to attempt to learn an optimal or near-optimal control policy, over time.

2. The automated reinforcement-learning-based application manager of claim 1 wherein each action is represented by a set of values; and wherein representations of actions are each translated into one or more commands that are directed to computational-entity components of the computing environment.

3. The automated reinforcement-learning-based application manager of claim 2 wherein computational-environment vectors represent states and/or observations and include metric values.

4. The automated reinforcement-learning-based application manager of claim 3 wherein the subset of metrics used by human application managers is specified to the automated reinforcement-learning-based application manager via one or more of: a manager interface; a configuration files; and a hard coding.

5. The automated reinforcement-learning-based application manager of claim 1 projecting a computational-environment vector into a vector subspace defined by a subset of metrics used by human application managers to control distributed applications further comprises removing metric values from the computational-environment vector that do not correspond to metrics in the metric subset used by human application managers to control distributed applications.

6. The automated reinforcement-learning-based application manager of claim 1 wherein the model includes: a set of computing-environment-component inputs; a set of computing-environment-component outputs; and internal components that each includes learned information and that transform the information input to the model through the set of computing-environment-component inputs to outputs from the model through the set of computing-environment-component outputs, each internal component related one of a set of computing-environment-component-input ancestors and a subset of the set of computing-environment-component-input descendants.

7. The automated reinforcement-learning-based application manager of claim 6 wherein the learned information, learned by the different automated reinforcement-learning-based application manager, included in those internal components with identical or equivalent computing-environment-component-input ancestors or computing-environment-component-input descendants in the computing environment and the different computing environment is reused by the automated reinforcement-learning-based application manager.

8. The automated reinforcement-learning-based application manager of claim 7 wherein the computing-environment-component inputs and the computing-environment-component outputs also included stored information that may be reused for those computing-environment-component inputs and the computing-environment-component outputs associated with identical or equivalent components in the computing-environment-component-input descendants in the computing environment and the different computing environment.

9. The method of claim 7 wherein the computing-environment-component inputs and the computing-environment-component outputs also included stored information that may be reused for those computing-environment-component inputs and the computing-environment-component outputs associated with identical or equivalent components in the computing-environment-component-input descendants in the computing environment and the different computing environment.

10. The automated reinforcement-learning-based application manager of claim 6 wherein the model is a neural network; wherein the computing-environment-component inputs are input nodes; wherein the computing-environment-component outputs are output nodes; wherein the internal components are hidden nodes; and wherein the learned information is weights associated with nodes of the neural network.

11. The method of claim 6 wherein the model is a neural network; wherein the computing-environment-component inputs are input nodes; wherein the computing-environment-component outputs are output nodes; wherein the internal components are hidden nodes; and wherein the learned information is weights associated with nodes of the neural network.

12. An automated reinforcement-learning-based application manager that manages a computing environment that includes one or more applications and one or more of a distributed computing system having multiple computer systems interconnected by one or more networks, a standalone computer system, and a processor-controlled user device, the reinforcement-learning based application manager comprising: one or more processors, one or more memories, and one or more communications subsystems; a set of actions that can be issued to the computing environment; a set of metrics; a model partly or completely learned during operation of a different automated reinforcement-learning-based application manager that controls a different computing environment; and an iterative control process that repeatedly selects and issues a next action, according to a control policy that uses a computational-environment vector that includes metric values and that represents a current state of the computational environment, and receives a reward and one of an observation and a metric vector from the computing environment, in response to execution of the issued next action, which the control process uses to attempt to learn an optimal or near-optimal control policy, over time, by improving the model.

13. The automated reinforcement-learning-based application manager of claim 12 wherein each action is represented by a set of values; and wherein representations of actions are each translated into one or more commands that are directed to computational-entity components of the computing environment.

14. The automated reinforcement-learning-based application manager of claim 13 wherein computational-environment vectors represent states and/or observations and include metric values.

15. A method that improves the computational efficiency of an automated reinforcement-learning-based application manager having one or more processors, one or more memories, one or more communications subsystems, a set of actions that can be issued by an iterative control process to a computing environment, controlled by the automated reinforcement-learning-based application manager,
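The model reuse recited in claims 6, 7, and 12 can be pictured with a rough sketch. It assumes each hidden node is identified by the set of input components that feed it and the set of output components it feeds, and that learned information is a plain list of weights. The names `node_key` and `reuse_weights` and the component names are hypothetical. The sketch matches only on identical ancestors and descendants, which is stricter than the "identical or equivalent" language of claim 7.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Assumption: a hidden node is keyed by (input ancestors, output descendants).
NodeKey = Tuple[FrozenSet[str], FrozenSet[str]]
Weights = List[float]


def node_key(ancestors: Iterable[str], descendants: Iterable[str]) -> NodeKey:
    """Build an order-independent key for a hidden node."""
    return (frozenset(ancestors), frozenset(descendants))


def reuse_weights(
    source: Dict[NodeKey, Weights],
    target_nodes: List[NodeKey],
    default_weight: float = 0.0,
) -> Tuple[Dict[NodeKey, Weights], List[NodeKey]]:
    """Initialise a target model, copying weights for nodes the source model also has."""
    initial: Dict[NodeKey, Weights] = {}
    reused: List[NodeKey] = []
    for key in target_nodes:
        if key in source:
            initial[key] = list(source[key])  # copy so the source model is not mutated
            reused.append(key)
        else:
            # No match: start from a default, one weight per input ancestor.
            initial[key] = [default_weight] * len(key[0])
    return initial, reused


# Weights learned by a manager of a different (hypothetical) environment.
source = {
    node_key(["web_server", "db"], ["latency"]): [0.7, -0.2],
    node_key(["cache"], ["latency"]): [0.5],
}
# Hidden nodes of the new environment's model.
target_nodes = [
    node_key(["db", "web_server"], ["latency"]),  # same components, different order
    node_key(["queue"], ["throughput"]),          # no counterpart in the source
]

initial, reused = reuse_weights(source, target_nodes)
print(len(reused), "of", len(target_nodes), "nodes reused")  # 1 of 2 nodes reused
```

Only the unmatched node has to be learned from scratch, which is where the claimed efficiency gain would come from in this simplified picture.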

Assignees

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Activation functions · CPC title

  • based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title

  • Machine learning · CPC title

  • G06F9/5077 · Primary

    Logical partitioning of resources; Management or configuration of virtualized resources (specific details on emulation or internal functioning of virtual machines G06F9/455) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10949263B2 cover?
The current document is directed to automated reinforcement-learning-based application managers that obtain increased computational efficiency by reusing learned models and by using human-management experience to truncate state and observation vectors. Learned models of managed environments that receive component-associated inputs can be partially or completely reused for similar environments. …
Who is the assignee on this patent?
VMware, Inc.
What technology area does this patent fall under?
Primary CPC classification G06F9/5077. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 16, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).