Method and system for devising an optimum control policy

US10884397B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10884397-B2
Application number  US-201815944597-A
Country             US
Kind code           B2
Filing date         Apr 3, 2018
Priority date       Feb 16, 2018
Publication date    Jan 5, 2021
Grant date          Jan 5, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for devising an optimum control policy of a controller for controlling a system includes optimizing at least one parameter that characterizes the control policy. A Gaussian process model is used to model expected dynamics of the system. The optimization optimizes a cost function which depends on the control policy and the Gaussian process model with respect to the at least one parameter. The optimization is carried out by evaluating at least one gradient of the cost function with respect to the at least one parameter. For an evaluation of the cost function a temporal evolution of a state of the system is computed using the control policy and the Gaussian process model. The cost function depends on an evaluation of an expectation value of a cost function under a probability density of an augmented state at time steps.
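The last sentence of the abstract is easier to follow with a concrete case. The sketch below assumes, purely for illustration, that the augmented state at each time step is Gaussian with a known mean and covariance and that the per-step cost is quadratic; the abstract fixes neither choice. Under those assumptions the expectation has the closed form (mu - target)^T W (mu - target) + trace(W Sigma), and the overall cost is the sum over time steps. The names `expected_quadratic_cost`, `total_expected_cost`, `target`, and `weight` are hypothetical.

```python
def expected_quadratic_cost(mu, sigma, target, weight):
    """E[(z - target)^T W (z - target)] for z ~ N(mu, sigma).

    Assumes a quadratic per-step cost (an illustrative choice,
    not something stated in the abstract).
    """
    n = len(mu)
    d = [mu[i] - target[i] for i in range(n)]
    # Cost of the mean deviation from the target.
    mean_term = sum(d[i] * weight[i][j] * d[j]
                    for i in range(n) for j in range(n))
    # Extra cost caused by uncertainty: trace(W @ Sigma).
    trace_term = sum(weight[i][j] * sigma[j][i]
                     for i in range(n) for j in range(n))
    return mean_term + trace_term


def total_expected_cost(means, covariances, target, weight):
    """Sum the expected per-step cost over all time steps."""
    return sum(expected_quadratic_cost(m, s, target, weight)
               for m, s in zip(means, covariances))


# Two-dimensional example with an identity weight matrix.
mu = [1.0, 0.5]
sigma = [[0.04, 0.0], [0.0, 0.01]]
target = [1.2, 0.0]
weight = [[1.0, 0.0], [0.0, 1.0]]
step_cost = expected_quadratic_cost(mu, sigma, target, weight)
```

The trace term means that an uncertain prediction is penalized even when its mean sits exactly on the target, which is one reason to propagate a full probability density rather than a point estimate.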

First claim

Opening claim text (preview).

What is claimed is:

1. A method for automatically tuning a multivariate PID controller for controlling a system, said method comprising: configuring the multivariate PID controller with a random control policy for controlling the system, said random control policy having at least one parameter that characterizes said random control policy; controlling the system with the multivariate PID controller based on said random control policy; devising an optimum control policy for the multivariate PID controller for controlling the system by optimizing said at least one parameter that characterizes said random control policy; using a Gaussian process model to model expected dynamics of the system, wherein said optimization optimizes a cost function which depends on said random control policy and said Gaussian process model with respect to said at least one parameter; and carrying out said optimization by evaluating at least one gradient of said cost function with respect to said at least one parameter to generate at least one optimized parameter of said optimum control policy, wherein for an evaluation of said cost function a temporal evolution of a state of the system is computed using said random control policy and said Gaussian process model, and wherein said cost function depends on an evaluation of an expectation value of a cost function under a probability density of an augmented state at time steps, tuning the multivariate PID controller by changing said at least one parameter to said at least one optimized parameter; and controlling the system with the tuned multivariate PID controller based on the optimum control policy.

2. The method according to claim 1, wherein said augmented state at a given time step comprises the state at said given time step.

3. The method according to claim 1, wherein said augmented state at a given time step comprises an error between the state and a desired state at a previous time step.

4. The method according to claim 1, wherein said augmented state at a given time step comprises an accumulated error of a previous time step.

5. The method according to claim 3, wherein the augmented state and/or the desired state are Gaussian random variables.

6. The method according to claim 1, further comprising: devising said optimum control policy for the multivariate PID controller for controlling the system by iteratively optimizing said at least one optimized parameter that characterizes said optimum control policy, iteratively updating said Gaussian process model based on a recorded reaction of the system to said optimum control policy, using said updated Gaussian process model to model expected dynamics of the system, wherein said optimization optimizes an updated cost function which depends on said optimized control policy and said updated Gaussian process model with respect to said at least one optimized parameter, and carrying out said optimization by evaluating at least one gradient of the updated cost function with respect to said at least one optimized parameter to generate at least one further optimized parameter of the optimum control policy; and iteratively tuning the multivariate PID controller by changing said at least optimized parameter to said at least one further optimized parameter.

7. The method according claim 1, wherein the system comprises an actuator and/or a robot.

8. The method according to claim 1, wherein a training system for devising said optimum control policy of the multivariate PID controller is configured to carry out the method.

9. The method according to claim 1, wherein a control system for controlling the system is configured to carry out the method.

10. The method according to claim 1, wherein a computer program contains instructions which cause a processor to carry out the method if the computer program is executed by said processor.

11. The method according to claim 10, wherein a machine-readable storage medium is configured to store the computer program.

12. The method according to claim 1, wherein said at least one gradient is determined by application of the chain rule to said cost function.
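Claims 2 to 4 and 12 can be illustrated with a small, heavily simplified sketch. It is not the claimed method: it uses a scalar (not multivariate) PID loop, and a hypothetical deterministic linear model `x_next = a * x + b * u` stands in for the Gaussian process prediction, so no probability density is propagated. What it does show is an augmented state made of the current state, the previous error, and the accumulated error, and a cost gradient with respect to the gains obtained by applying the chain rule through the rollout. All names and numeric values (`pid_rollout`, `a`, `b`, `dt`, the step size) are assumptions made for the example.

```python
def pid_rollout(gains, a=0.9, b=0.5, x0=0.0, x_des=1.0, dt=0.1, steps=30):
    """Roll out a scalar PID loop; return (cost, d cost / d gains).

    x_next = a * x + b * u is a hypothetical stand-in for the mean
    prediction of a learned dynamics model (not the patent's GP).
    """
    kp, ki, kd = gains
    # Augmented state: current state, previous error, accumulated error.
    x, e_prev, s_prev = x0, x_des - x0, 0.0
    # Sensitivities of each augmented-state entry w.r.t. (kp, ki, kd).
    dx, de_prev, ds_prev = [0.0] * 3, [0.0] * 3, [0.0] * 3
    cost, grad = 0.0, [0.0] * 3
    for _ in range(steps):
        e = x_des - x
        de = [-v for v in dx]
        cost += e * e                      # quadratic tracking cost
        grad = [g + 2.0 * e * v for g, v in zip(grad, de)]
        s = s_prev + dt * e                # accumulated error
        ds = [p + dt * v for p, v in zip(ds_prev, de)]
        rate = (e - e_prev) / dt           # error rate from previous error
        drate = [(v - p) / dt for v, p in zip(de, de_prev)]
        u = kp * e + ki * s + kd * rate    # PID control policy
        # Chain rule: indirect dependence through the augmented state ...
        du = [kp * de[i] + ki * ds[i] + kd * drate[i] for i in range(3)]
        # ... plus the direct dependence of u on each gain.
        du[0] += e
        du[1] += s
        du[2] += rate
        x = a * x + b * u                  # model prediction of next state
        dx = [a * dx[i] + b * du[i] for i in range(3)]
        e_prev, s_prev, de_prev, ds_prev = e, s, de, ds
    return cost, grad


# Plain gradient descent on the gains (step size chosen conservatively).
gains = [0.5, 0.2, 0.01]
start_cost, _ = pid_rollout(gains)
for _ in range(100):
    _, g = pid_rollout(gains)
    gains = [k - 1e-4 * gi for k, gi in zip(gains, g)]
final_cost, _ = pid_rollout(gains)
```

Carrying the sensitivities forward alongside the state is one way to apply the chain rule; reverse-mode automatic differentiation would give the same gradient and is usually preferable when there are many parameters.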

Assignees

Inventors

Classifications

  • G05B13/047 · Primary

    the criterion being a time optimal performance criterion · CPC title

  • G05B13/04 · Primary

    involving the use of models or simulators · CPC title

  • for obtaining a characteristic which is both proportional and time-dependent, e.g. P. I., P. I. D. · CPC title

  • Pid learning controller, gains adapted as function of previous error · CPC title

  • characterised by program execution, i.e. part program or machine function execution, e.g. selection of a program · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10884397B2 cover?
A method for devising an optimum control policy of a controller for controlling a system includes optimizing at least one parameter that characterizes the control policy. A Gaussian process model is used to model expected dynamics of the system. The optimization optimizes a cost function which depends on the control policy and the Gaussian process model with respect to the at least one paramete…
Who is the assignee on this patent?
Bosch Gmbh Robert, Max Planck Gesellschaft
What technology area does this patent fall under?
Primary CPC classification G05B13/047. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 5, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).