What technology area does this patent fall under?

Primary CPC classification G06N3/006. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 13, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Modular reinforcement-learning-based application manager

US10802864B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10802864-B2
Application number	US-201916261253-A
Country	US
Kind code	B2
Filing date	Jan 29, 2019
Priority date	Aug 27, 2018
Publication date	Oct 13, 2020
Grant date	Oct 13, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The current document is directed to a modular reinforcement-learning-based application manager that can be deployed in various different computational environments without extensive modification and interface development. The currently disclosed modular reinforcement-learning-based application manager interfaces to observation and action adapters and metadata that provide a uniform and, in certain implementations, self-describing external interface to the various different computational environments which the modular reinforcement-learning-based application manager may be operated to control. In addition, certain implementations of the currently disclosed modular reinforcement-learning-based application manager interface to a user-specifiable reward-generation interface to allow the rewards that provide feedback from the computational environment to the modular reinforcement-learning-based application manager to be tailored to meet a variety of different user expectations and desired control policies.
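The adapter arrangement described in the abstract can be pictured with a minimal sketch. Every name here (`GenericAction`, `ActionAdapter`, `ObservationAdapter`, `latency_reward`, the action and field names) is hypothetical and chosen only for illustration; the patent does not publish a concrete API, and the assumption that the first numeric value selects the action is ours.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class GenericAction:
    """Environment-facing action: a name plus named arguments (hypothetical)."""
    name: str
    arguments: Dict[str, float]


class ActionAdapter:
    """Maps a manager-specific numeric action to a generic, named action."""

    def __init__(self, argument_names: Dict[str, List[str]]):
        # Dict insertion order fixes the action index the manager emits.
        self.argument_names = argument_names
        self.action_names = list(argument_names)

    def to_generic(self, manager_action: Sequence[float]) -> GenericAction:
        # Assumption: element 0 is the action index, the rest are arguments.
        name = self.action_names[int(manager_action[0])]
        arg_names = self.argument_names[name]
        values = manager_action[1:1 + len(arg_names)]
        return GenericAction(name, dict(zip(arg_names, values)))


class ObservationAdapter:
    """Flattens a generic observation (named fields) into a numeric vector."""

    def __init__(self, field_order: Sequence[str]):
        self.field_order = list(field_order)

    def to_manager(self, generic_observation: Dict[str, float]) -> List[float]:
        return [float(generic_observation[f]) for f in self.field_order]


# A user-specified reward is modelled as a plain function of observation and action.
RewardFn = Callable[[Dict[str, float], GenericAction], float]


def latency_reward(observation: Dict[str, float], action: GenericAction) -> float:
    # Illustrative policy goal: penalize latency, plus a small cost per added instance.
    cost = 0.1 * action.arguments.get("count", 0.0) if action.name == "scale_out" else 0.0
    return -observation["latency_ms"] / 100.0 - cost


action_adapter = ActionAdapter({"noop": [], "scale_out": ["count"]})
observation_adapter = ObservationAdapter(["cpu_load", "latency_ms"])

observation = {"latency_ms": 250, "cpu_load": 0.8}
generic_action = action_adapter.to_generic([1, 2.0])
manager_observation = observation_adapter.to_manager(observation)
step_reward = latency_reward(observation, generic_action)
```

In this sketch the manager only ever sees numeric vectors, while the environment only ever sees named actions and fields, so swapping the environment would mean replacing the adapter configuration rather than the manager.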

First claim

Opening claim text (preview).

The invention claimed is:

1. A modular reinforcement-learning-based application manager that manages one or more applications and a computing environment, within which the applications run, comprising one or more of a distributed computing system having multiple computer systems interconnected by one or more networks, a standalone computer system, and a processor-controlled user device, the modular reinforcement-learning based application manager comprising: a reinforcement-learning-based application manager that receives rewards and observations from the computing environment and issues actions to the computing environment in accordance with an internally maintained policy; and an interface between the computing environment and the reinforcement-learning-based application manager that transforms a manager-specific action issued by the reinforcement-learning-based application manager to a generic action and issues the generic action to the computing environment; provides, to the computing environment, generic-actions metadata for the generic actions issued to the computing environment; transforms a generic observation generated by the computing environment to an equivalent manager-specific observation and issues the manager-specific observation to the reinforcement-learning-based application manager; and provides generic-observations metadata.

2. The modular reinforcement-learning-based application manager of claim 1 wherein the manager-specific action is encoded as one or more numerical values; wherein the generic action is encoded, by the interface, in a generic-action data structure that includes data generated from the numerical values in the manager-specific action; and wherein the data in the generic-action data structure is described by the generic-actions metadata.

3. The modular reinforcement-learning-based application manager of claim 2 wherein the computing environment: uses the generic-actions metadata to transform the generic action into one or more commands that implement the generic action; and inputs the one or more commands to components and/or subsystems in the computing environment to carry out the generic action.

4. The modular reinforcement-learning-based application manager of claim 1 wherein the manager-specific observation is encoded as one or more numerical values; wherein the generic observation is encoded, by the computing environment, in a generic-observation data structure; and wherein the data in the generic-observation data structure is described by the generic-observations metadata.

5. The modular reinforcement-learning-based application manager of claim 4 wherein the computing environment: uses the generic-observations metadata to determine information sources within the computing environment, and access methods to request information from the information sources, to obtain information needed to generate the generic observation; uses the generic-observations metadata to encode the obtained information in the generic-observation data structure; and inputs the generic-observation data structure to the interface.

6. The modular reinforcement-learning-based application manager of claim 1 wherein the generic-observations metadata includes: data that indicates the number of elements in a generic-observation; a description of each element in a generic-observation; and for each data component of a generic-observation element, an indication of the data type for the data, and an indication of what the data component represents.

7. The modular reinforcement-learning-based application manager of claim 1 wherein the generic-actions metadata includes: data that indicates the number of different generic actions; and for each different generic action, an indication of the number of data components for the generic action, and for each data component, an indication of a data type, and an indication of what the data component represents.

8. The modular reinforcement-learning-based application manager of claim 1 wherein the computing environment additionally includes a reward module that generates user-specified rewards that are issued to the reinforcement-learning-based application manager.

9. The modular reinforcement-learning-based application manager of claim 8 wherein the reward module provides a user interface through which a user defines a reward function that receives observation data and action data and outputs a numeric reward.

10. The modular reinforcement-learning-based application manager of claim 1 wherein the reinforcement-learning-based application manager maintains a policy and a belief distribution, and updates the policy, during training, to generate, over time, a near-optimal or optimal policy, where a near-optimal policy is closer to an optimal policy than the policies achievable by human administrators and managers.

11. The modular reinforcement-learning-based application manager of claim 10 wherein the policy returns a next action to issue to the computing environment from the current belief distribution.

12. A method for interfacing a computing environment to a reinforcement-learning-based application manager, the method comprising: incorporating, by the computing environment comprising one or more of a distributed computing system having multiple computer systems interconnected by one or more networks, a standalone computer system, and a processor-controlled user device, an interface that interfaces the computing environment to the reinforcement-learning-based application manager; transforming, by the interface, a manager-specific actions issued by the reinforcement-learning-based application manager to corresponding generic actions and issuing the generic actions to the computing environment; providing to the computing environment, by the interface, generic-actions metadata for the generic actions issued to the computing environment; transforming, by the interface, generic observations generated by the computing environment to equivalent manager-specific observations and issuing, by the interface, the manager-specific observations to the reinforcement-learning-based application manager; and providing, by the interface, generic-observations metadata.

13. The method of claim 12 wherein the manager-specific actions are each encoded as one or more numerical values; wherein the generic actions are each encoded, by the interface, in a generic-action data structure that includes data generated from the numerical values in the manager-specific action; and wherein the data in each generic-action data structure is described by the generic-actions metadata.

14. The method of claim 13 further comprising: using, by the computing environment, the generic-actions metadata to transform each generic action into one or more commands that implement the generic action; and inputting, by the computing environment, the one or more commands corresponding to each generic action to components and/or subsystems in the computing environment to carry out the generic action.

15. The method of claim 12 wherein the manager-specific observations are each encoded as one or more numerical values; wherein the generic observations are each encoded, by the computing environment, in a generic-observation data structure; and wherein the data in each generic-observation data structure is described by the generic-observations metadata.

16. The method of claim 15 further including: using, by the computing environment, the generic-observations metadata to determine information sources within the computing environment, an
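Claims 2, 3, and 7 describe self-describing action metadata: a count of generic actions and, for each action, its data components with a data type and a meaning. The sketch below is one possible reading of that structure, not the patented implementation. The class names, the `resize_vm` action, and the command string format are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ComponentMeta:
    data_type: type   # indication of the data type
    meaning: str      # indication of what the component represents


@dataclass(frozen=True)
class ActionMeta:
    name: str
    components: List[ComponentMeta]


# Hypothetical generic-actions metadata; the number of different
# generic actions is simply len(ACTIONS_METADATA).
ACTIONS_METADATA = [
    ActionMeta("noop", []),
    ActionMeta("resize_vm", [
        ComponentMeta(int, "VM identifier"),
        ComponentMeta(float, "memory in GiB"),
    ]),
]


def encode_generic_action(manager_action: List[float]) -> Dict[str, Any]:
    """Interface side: build a generic-action structure from numeric values."""
    # Assumption: element 0 selects the action, the rest are its components.
    meta = ACTIONS_METADATA[int(manager_action[0])]
    raw = manager_action[1:]
    if len(raw) != len(meta.components):
        raise ValueError(f"{meta.name} expects {len(meta.components)} components")
    data = [c.data_type(v) for c, v in zip(meta.components, raw)]
    return {"action": meta.name, "data": data}


def to_commands(generic_action: Dict[str, Any]) -> List[str]:
    """Environment side: use the metadata to render the action as commands."""
    meta = next(m for m in ACTIONS_METADATA if m.name == generic_action["action"])
    described = [
        f"{c.meaning}={v}" for c, v in zip(meta.components, generic_action["data"])
    ]
    # An action with no components (such as noop) yields no commands here.
    return [f"{meta.name}({', '.join(described)})"] if described else []


generic = encode_generic_action([1, 42.0, 8.0])
commands = to_commands(generic)
```

Because both sides read the same metadata, the environment can interpret an action it has never been hard-coded for, which is the sense in which the interface is self-describing.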

Assignees

VMware Inc

Inventors

Classifications

G06N7/01
Probabilistic graphical models, e.g. probabilistic networks · CPC title
G06N3/092
Reinforcement learning · CPC title
G06F2009/4557
Distribution of virtual machine instances; Migration and load balancing · CPC title
G06F2009/45562
Creating, deleting, cloning virtual machine instances · CPC title
G06N3/006 · Primary
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title

Patent family

Related publications grouped by family.

View patent family 69583547

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10802864B2 cover?: The current document is directed to a modular reinforcement-learning-based application manager that can be deployed in various different computational environments without extensive modification and interface development. The currently disclosed modular reinforcement-learning-based application manager interfaces to observation and action adapters and metadata that provide a uniform and, in cert…
Who is the assignee on this patent?: VMware Inc