Method and a system for applying machine learning to an application

US12547928B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12547928-B2
Application number  US-202117317926-A
Country             US
Kind code           B2
Filing date         May 12, 2021
Priority date       Nov 13, 2018
Publication date    Feb 10, 2026
Grant date          Feb 10, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for applying machine learning to an application includes: a) generating a candidate policy by a learner; b) executing a program in at least one simulated application based on a set of candidate parameters provided based on the candidate policy and a state of the at least one simulated application, execution of the program providing interim results of tested sets of candidate parameters based on a measured performance information of the execution of the program; c) collecting a predetermined number of interim results and providing an end result based on a combination of the candidate parameters and/or the state with the measured performances information by a trainer; and d) generating a new candidate policy by the learner based on the end result.
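The loop in steps a) to d) can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `Learner` and `Trainer`, the toy `simulated_application` performance function, the linear policy, and the hill-climbing update are all assumptions chosen only to make the data flow concrete. The abstract does not say which learning algorithm is used.

```python
import random
from statistics import mean


def simulated_application(state, params):
    """Toy stand-in for executing the program in a simulation.

    Returns a measured performance (higher is better). The optimum
    params = 2 * state is an arbitrary assumption for this sketch.
    """
    return -(params - 2.0 * state) ** 2


class Learner:
    """Hypothetical learner: a policy maps a state to gain * state."""

    def __init__(self, rng):
        self.rng = rng
        self.gain = 0.0                # gain of the best policy so far
        self.score = float("-inf")     # end result achieved by that policy
        self.candidate = 0.0           # gain of the policy under test

    def generate_candidate_policy(self, end_result=None):
        # Keep the tested candidate only if its end result improved
        # (simple hill climbing, used here as a placeholder algorithm).
        if end_result is not None and end_result["mean_performance"] > self.score:
            self.gain = self.candidate
            self.score = end_result["mean_performance"]
        self.candidate = self.gain + self.rng.gauss(0.0, 0.5)
        gain = self.candidate
        return lambda state: gain * state


class Trainer:
    """Hypothetical trainer: collects a fixed number of interim results."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.interim = []

    def collect(self, state, params, performance):
        self.interim.append((state, params, performance))
        return len(self.interim) >= self.batch_size

    def end_result(self):
        # Combine parameters and states with the measured performance.
        batch, self.interim = self.interim, []
        return {
            "samples": batch,
            "mean_performance": mean(p for _, _, p in batch),
        }


def run(iterations=200, batch_size=8, seed=0):
    rng = random.Random(seed)
    learner, trainer = Learner(rng), Trainer(batch_size)
    policy = learner.generate_candidate_policy()                 # step a)
    for _ in range(iterations):
        full = False
        while not full:
            state = rng.uniform(0.5, 1.5)
            params = policy(state)
            performance = simulated_application(state, params)   # step b)
            full = trainer.collect(state, params, performance)   # step c)
        policy = learner.generate_candidate_policy(trainer.end_result())  # step d)
    return learner.gain, learner.score
```

Calling `run()` returns the gain of the best policy found and its end result. In this toy setting the gain should drift toward 2.0, the optimum built into the stand-in simulation.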

First claim

Opening claim text (preview).

What is claimed is:

1. A method for applying machine learning to an application, comprising:
a) generating a candidate policy by a learner, wherein the candidate policy describes a mapping of states to a set of candidate parameters;
b) assigning the set of candidate parameters to at least one simulated application based on the candidate policy and the states by at least one manager;
c) executing a program in the at least one simulated application based on a set of candidate parameters provided based on the candidate policy and a state of the at least one simulated application, execution of the program providing interim results of tested sets of candidate parameters based on a measured performance information of the execution of the program, wherein the set of candidate parameters are requested by the at least one simulated application together with providing the state of the at least one simulated application, wherein the program is executed simultaneously in the at least one simulated application and in at least one real application comprising a real robot application, wherein reality data of the machine is acquired while executing the program in the at least one real application, and wherein the at least one simulated application is modified based on the reality data, wherein the reality data is acquired using a sensor placed in a real-world environment;
d) collecting a predetermined number of interim results and providing an end result based on a combination of the candidate parameters and the states with the measured performances information by a trainer; and
e) generating a new candidate policy by the learner based on the end result using machine learning algorithms.

2. The method of claim 1, wherein the program is executed in cooperating real applications.

3. The method of claim 2, wherein the candidate policy defines task assignments to each of the cooperating real applications.

4. The method of claim 1, wherein a) to d) are repeated until a stop criterion is met.

5. The method of claim 4, wherein the stop criterion comprises an amount of executions of the program and a target measured performance.

6. The method of claim 1, wherein the set of candidate parameters has a parameter range, and wherein the set of candidate parameters executed on the at least one simulated application has a wider parameter range than the set of candidate parameters executed on the at least one real application.

7. The method of claim 1, further comprising: assigning the set of candidate parameters to the at least one simulated application and the at least one real application based on the candidate policy and the states by at least one manager.

8. The method of claim 7, further comprising: requesting the set of candidate parameters by the at least one simulated application and the at least one real application; and receiving the interim results of tested sets of candidate parameters based on a measured performance of the execution of the program by the manager.

9. The method of claim 1, further comprising: receiving the predetermined number of interim results by a trainer, wherein the trainer triggers generation of new policies by the learner.

10. The method of claim 1, further comprising obtaining a set of final parameters.

11. A system for applying machine learning to an application, comprising:
a learner configured to generate learning policies, wherein the candidate policy describes a mapping of states to a set of candidate parameters;
at least one real application;
at least one simulated application;
a manager, configured to assign the set of candidate parameters to the at least one simulated application based on the candidate policy and the states;
a program configured to be executed, by a processor, in the at least one simulated application based on a set of candidate parameters provided based on the candidate policy and a state of the at least one simulated application, the program being configured to provide interim results of tested sets of candidate parameters based on a measured performance information of the execution of the program, wherein the set of candidate parameters are requested by the at least one simulated application together with providing the state of the at least one simulated application, wherein the program is configured to be executed simultaneously in the at least one simulated application and in the at least one real application based on the set of candidate parameters, wherein the at least one real application comprising a real robot application;
a trainer configured to collect a predetermined number of interim results, the trainer being configured to provide an end result based on a combination of the candidate parameters and the states with the measured performance information, wherein the learner is configured to generate a new candidate policy based on the end result using machine learning algorithms; and
a sensor configured to acquire reality data while the program is executed in the at least one real application, wherein the sensor is placed in a real-world environment, and wherein the at least one simulated application is configured to be modified based on the reality data.

12. The system of claim 11, wherein the set of candidate parameters has a parameter range, and wherein the set of candidate parameters executed on the at least one simulated application has a wider parameter range than the set of candidate parameters executed on the at least one real application.

13. The system of claim 11, further comprising: a manager configured to assign the set of candidate parameters to the at least one simulated application and the at least one real application based on the candidate policy and the states.

14. A tangible, non-transitory computer readable medium for applying machine learning to an application, the computer readable medium having instructions thereon, which, upon being executed by the one or more processors, provides for execution of the following steps:
a) generating a candidate policy, by a learner, wherein the candidate policy describes a mapping of states to a set of candidate parameters;
b) assigning the set of candidate parameters to at least one simulated application based on the candidate policy and the states by at least one manager;
c) executing a program in the at least one simulated application based on a set of candidate parameters provided based on the candidate policy and a state of the at least one simulated application, the execution of the program providing interim results of tested sets of candidate parameters based on a measured performance information of the execution of the program, wherein the set of candidate parameters are requested by the at least one simulated application together with providing the state of the at least one simulated application, wherein the program is executed simultaneously in the at least one simulated application and in at least one real robot application, wherein reality data of the machine is acquired while executing the program in the at least one real application, and wherein the at least one simulated application is modified based on the reality data, wherein the reality data is acquired using a sensor placed in a real-world environment;
d) collecting a predetermined number of interim results and providing an end result based on a combination of the candidate parameters and the states with the measured performances information by a trainer; and
e) generating a new candidate policy by the learner based on the end result using machine learning algorithms.
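Several claim features can be made concrete with a short Python sketch: an application requesting parameters while reporting its state (claims 1 and 8), the wider parameter range allowed in simulation than on the real application (claim 6), the stop criterion (claim 5), and the adjustment of the simulation from sensor data (claim 1). Everything named here is hypothetical: the `Manager` and `Application` classes, the numeric ranges, the `friction` model parameter, and the blending update in `calibrate` are assumptions for illustration, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    is_real: bool
    friction: float  # hypothetical model parameter of the simulation


def clamp(value, low, high):
    return max(low, min(high, value))


class Manager:
    """Hypothetical manager: assigns parameters on request, stores results."""

    SIM_RANGE = (0.0, 10.0)   # assumed wide range, safe to explore in simulation
    REAL_RANGE = (2.0, 6.0)   # assumed narrower range for the real robot

    def __init__(self, policy):
        self.policy = policy
        self.interim_results = []

    def request_parameters(self, app, state):
        # The application provides its state together with the request.
        low, high = self.REAL_RANGE if app.is_real else self.SIM_RANGE
        return clamp(self.policy(state), low, high)

    def report(self, app, state, params, performance):
        self.interim_results.append((app.name, state, params, performance))


def should_stop(executions, best_performance, max_executions, target):
    """Stop on an execution budget or once a target performance is reached."""
    return executions >= max_executions or best_performance >= target


def calibrate(sim, sensor_reading, rate=0.5):
    """Move the simulation's model parameter toward the sensor measurement."""
    sim.friction += rate * (sensor_reading - sim.friction)


manager = Manager(policy=lambda state: 4.0 * state)
sim = Application("sim-0", is_real=False, friction=1.0)
robot = Application("robot-0", is_real=True, friction=2.0)

sim_params = manager.request_parameters(sim, state=2.0)     # 8.0, within the sim range
real_params = manager.request_parameters(robot, state=2.0)  # clamped to 6.0
calibrate(sim, sensor_reading=2.0)                          # sim.friction becomes 1.5
```

Clamping is only one way to enforce a narrower range on the real application. The claims require that the ranges differ, not how the restriction is applied.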

Assignees

Inventors

Classifications

  • learning, adaptive, model based, rule based expert control · CPC title

  • Design optimisation, verification or simulation (optimisation, verification or simulation of circuit designs G06F30/30) · CPC title

  • based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour · CPC title

  • Evolutionary algorithms, e.g. genetic algorithms or genetic programming · CPC title

  • G06N20/00 (Primary)

    Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12547928B2 cover?
A method for applying machine learning to an application includes: a) generating a candidate policy by a learner; b) executing a program in at least one simulated application based on a set of candidate parameters provided based on the candidate policy and a state of the at least one simulated application, execution of the program providing interim results of tested sets of candidate parameters…
Who is the assignee on this patent?
ABB Schweiz AG
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 10, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).