Apparatus and method of ensuring quality of control operations of system on the basis of reinforcement learning
US-2020167611-A1 · May 28, 2020 · US
US11645728B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11645728-B2 |
| Application number | US-202117513757-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 28, 2021 |
| Priority date | Oct 29, 2020 |
| Publication date | May 9, 2023 |
| Grant date | May 9, 2023 |
Official abstract text for this publication.
Disclosed is a method for controlling an energy management system that is performed by a computing device including at least one processor. The method may include acquiring a target temperature of one or more target points; and controlling one or more control variables using a reinforcement learning control model trained for a first condition regarding a state before a current temperature of the target points converges to the target temperature.
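The abstract distinguishes a state before the current temperature converges to the target from a state after it. A minimal sketch of how a controller might decide which condition applies is shown below. The tolerance band, the function names, and the rule that every target point must be within the band are illustrative assumptions; the source does not define convergence numerically.

```python
from typing import Sequence


def is_converged(current_temps: Sequence[float],
                 target_temp: float,
                 tolerance: float = 0.5) -> bool:
    """Return True when every target point is within `tolerance` of the target.

    `tolerance` is a hypothetical parameter, not a value taken from the patent.
    """
    return all(abs(t - target_temp) <= tolerance for t in current_temps)


def select_condition(current_temps: Sequence[float],
                     target_temp: float,
                     tolerance: float = 0.5) -> str:
    """Map the temperature state to the 'first' (before convergence)
    or 'second' (after convergence) condition named in the abstract."""
    if is_converged(current_temps, target_temp, tolerance):
        return "second"
    return "first"
```

For example, `select_condition([18.0, 20.0], 20.0)` selects the first condition, because one target point is still two degrees away from the target.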
Opening claim text (preview).
The invention claimed is:

1. A method for controlling an energy management system (EMS) that is performed by a computing device including at least one processor, the method comprising: acquiring a target temperature of one or more target points; controlling one or more control variables using a reinforcement learning control model trained for a first condition regarding a state before a current temperature of the target points converges to the target temperature; controlling the one or more control variables using the reinforcement learning control model trained for a second condition regarding a state after the current temperature of the target points converges to the target temperature; and acquiring a target indirect indicator corresponding to the acquired target temperature, wherein the reinforcement learning control model is trained based on a reward that is calculated differently for the first condition and the second condition respectively, wherein the target indirect indicator includes a value obtained through at least one sensor from the environment, when the current temperature of the target points converges towards the target temperature, based on the reinforcement learning control model controlling one or more control variables, the reinforcement learning control model being trained to control one or more control variables based on a state information, wherein a training method of the reinforcement learning control model for the second condition regarding state after the current temperature of the target points converges to the target temperature includes: training a first control agent comprised in the reinforcement learning control model, based on a reward computed based on the current temperature of the target points, the target temperature, and total amount of work; and training a second control agent comprised in the reinforcement learning control model, based on a reward computed based on a current indirect indicator and the target indirect indicator.

2. The method for controlling EMS of claim 1, wherein the reinforcement learning control model comprises: a first control agent trained for controlling a first control variable; and a second control agent trained for controlling a second control variable.

3. The method for controlling EMS of claim 2, wherein the first control variable and the second control variable are dependent on each other, wherein the first control variable is an output of a compressor, and the second control variable is a degree of opening and closing of a valve, and wherein the reinforcement learning control model separately controls the output of the compressor and the degree of opening and closing of the valve.

4. The method for controlling EMS of claim 1, wherein the reinforcement learning control model includes an artificial neural network layer including at least one node, and wherein a training method of the reinforcement learning control model comprises: acquiring the state information from an environment including at least one sensor, by the reinforcement learning control model; controlling the one or more control variables based on the state information, by the reinforcement learning control model; acquiring the state information updated from the environment as a result of controlling the control variables, by the reinforcement learning control model; and training the reinforcement learning control model based on the acquired reward from the environment as the result of controlling the control variables.

5. The method for controlling the EMS of claim 4, wherein the reward comprises at least one of the followings: a reward computed based on the current temperature of the target points and the target temperature; a reward computed based on total amount of work; or a reward computed based on a current indirect indicator and the target indirect indicator.

6. The method for controlling the EMS of claim 1, wherein state information that the reinforcement learning control model acquires from the environment is first state information that includes at least one of state data on temperature, state data on an output of a compressor, and state data on a degree of opening and closing of a valve.

7. The method for controlling the EMS of claim 4, wherein the training the reinforcement learning control model based on the acquired reward from the environment as the result of controlling the control variables, in the first condition, comprises: training a first control agent comprised in the reinforcement learning control model, based on a reward computed based on the current temperature of the target points and the target temperature; and training a second control agent comprised in the reinforcement learning control model, based on a reward computed based on total amount of work.

8. The method for controlling the EMS of claim 4, wherein the training the reinforcement learning control model based on the acquired reward from the environment as the result of controlling the control variables, in the second condition, comprises: training a first control agent comprised in the reinforcement learning control model, based on the current temperature of the target points, the target temperature, and total amount of work; and training a second control agent comprised in the reinforcement learning control model, based on a reward computed based on the total amount of work.

9. The method for controlling the EMS of claim 1, wherein the target indirect indicator includes a pre-determined value according to the target temperature.

10. The method for controlling the EMS of claim 1, wherein state information that the reinforcement learning control model acquires from the environment is: second state information additionally including state data for an indirect indicator to first state information that includes at least one of state data on temperature, state data on an output of a compressor, and state data on a degree of opening and closing of a valve.

11. A non-transitory computer readable storage medium, wherein when the non-transitory computer readable storage medium is executed in one or more processors, the non-transitory computer readable storage medium causes the following operations to be performed for controlling an energy management system, the operations comprising: acquiring a target temperature of one or more target points; controlling one or more control variables using a reinforcement learning control model trained for a first condition regarding a state before a current temperature of the target points converges to the target temperature; controlling the one or more control variables using the reinforcement learning control model trained for a second condition regarding a state after the current temperature of the target points converges to the target temperature; and acquiring a target indirect indicator corresponding to the acquired target temperature, wherein the reinforcement learning control model is trained based on a reward that is calculated differently for the first condition and the second condition respectively, and wherein the target indirect indicator includes a value obtained through at least one sensor from the environment, when the current temperature of the target points converges towards the target temperature, based on the reinforcement learning control model controlling one or more control variables, the reinforcement learning control model being trained to control one or more control variables based on a state information, and wherein a training method of the reinforcement learning control model for the second condition regarding state after the current temperature of the target points converges
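Claims 1 and 7 assign different reward inputs to the two control agents depending on the condition. The sketch below illustrates that split only. The claims name which quantities each reward is "computed based on" but give no formula, so the negative-error form, the `work_weight` parameter, and all identifiers here are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class State:
    current_temp: float       # current temperature of the target points
    target_temp: float        # acquired target temperature
    total_work: float         # total amount of work
    current_indicator: float  # current indirect indicator
    target_indicator: float   # target indirect indicator


def agent_rewards(state: State,
                  converged: bool,
                  work_weight: float = 0.1) -> Tuple[float, float]:
    """Return (first_agent_reward, second_agent_reward).

    The functional forms are hypothetical; only the choice of inputs per
    agent and per condition follows claims 1 and 7.
    """
    temp_error = abs(state.current_temp - state.target_temp)
    if not converged:
        # First condition (claim 7): temperature error for the first agent,
        # total amount of work for the second agent.
        first = -temp_error
        second = -state.total_work
    else:
        # Second condition (claim 1): temperature error plus work for the
        # first agent, indirect-indicator error for the second agent.
        first = -temp_error - work_weight * state.total_work
        second = -abs(state.current_indicator - state.target_indicator)
    return first, second
```

Calling `agent_rewards(State(22.0, 20.0, 5.0, 3.0, 4.0), converged=False)` uses only the temperature error and the total work, while `converged=True` brings the indirect indicator into the second agent's reward.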
Energy or water supply · CPC title
Monitoring arrangements for monitoring environmental properties or parameters of the computing system or of the computing system component, e.g. monitoring of power, currents, temperature, humidity, position, vibrations (thermal management in cooling arrangements of a computing system G06F1/206) · CPC title
the criterion being a learning criterion · CPC title
HVAC, heating, ventilation, climate control · CPC title
electric · CPC title