Methods and apparatus for reinforcement learning
US-9679258-B2 · Jun 13, 2017 · US
US12561572B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12561572-B2 |
| Application number | US-202217906995-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 2, 2022 |
| Priority date | Apr 2, 2021 |
| Publication date | Feb 24, 2026 |
| Grant date | Feb 24, 2026 |
Official abstract text for this publication.
A method for calibrating the parameters of a hydrology forecasting model based on deep reinforcement learning. A hydrology forecasting model is selected according to basin characteristics, and the parameters to be calibrated, together with their value ranges, are determined. A reinforcement learning model for calibrating the parameters of the hydrology forecasting model is established, and the three elements of reinforcement learning, that is, a state space, an action space and a reward function, are determined. The deep reinforcement learning method DQN is then applied to optimize the parameters to be calibrated. In the present disclosure, setting the stride length of the action value for the deep reinforcement learning model allows the precision of the final calibrated parameters to be freely controlled, and the DQN algorithm searches the entire space of calibration parameters to ensure the optimality of the calibrated parameters.
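The link between stride length and calibration precision can be illustrated with a small sketch. It is not taken from the patent: the function name `reachable_values`, the range [0, 1], the starting value 0.5 and the two strides are all hypothetical. It only shows that a parameter moved by repeated steps of plus or minus one stride can land on a grid whose spacing equals the stride, so a smaller stride gives a finer set of candidate values.

```python
from fractions import Fraction

def reachable_values(lower, upper, start, stride):
    """Values in [lower, upper] reachable from `start` by repeated +/- stride moves.

    Hypothetical helper for illustration; exact fractions avoid float rounding.
    """
    lower, upper, start, stride = map(Fraction, (lower, upper, start, stride))
    k_min = -((start - lower) // stride)   # most steps downward that stay in range
    k_max = (upper - start) // stride      # most steps upward that stay in range
    return [start + k * stride for k in range(int(k_min), int(k_max) + 1)]

# Assumed example: one parameter with range [0, 1], starting at 0.5.
coarse = reachable_values("0", "1", "0.5", "0.1")
fine = reachable_values("0", "1", "0.5", "0.01")
print(len(coarse), len(fine))  # 11 101
```

With the coarse stride the search can distinguish 11 candidate values, with the fine stride 101, at the cost of a longer search path between distant values.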
Opening claim text (preview).
What is claimed is: 1. A method for calibrating parameters of a hydrology forecasting model based on deep reinforcement learning, characterized by comprising the following steps:

Step 1, selecting a hydrology forecasting model and determining the parameters that need to be calibrated, wherein the hydrology forecasting model takes a rainfall time sequence and an evaporation time sequence as inputs, and takes a time sequence of forecasted flow as an output;

Step 2, establishing a reinforcement learning model for calibrating the parameters of the hydrology forecasting model, wherein the reinforcement learning refers to a process of interactive learning between an intelligent agent (Agent) and an environment, and the three key elements of the reinforcement learning are a state space, an action space and a reward value function; and

Step 3, applying the deep reinforcement learning method DQN to optimize the parameters to be calibrated by the hydrology forecasting model;

wherein the process of establishing the reinforcement learning model for calibrating the parameters of the hydrology forecasting model in Step 2 comprises:

2-1) determining the state space for the reinforcement learning to obtain the parameters to be calibrated for the hydrology forecasting model, specifically: defining the state value for the reinforcement learning at a time $t$ as a one-dimensional vector $s_t$ composed of the plurality of parameters to be calibrated by the hydrology forecasting model:

$$s_t = (w_t^1, w_t^2, \ldots, w_t^N)$$

wherein $w_t^i$, $i = 1, 2, \ldots, N$, are the values of the parameters to be calibrated by the hydrology forecasting model at the current time $t$; the value $w_t^i$ of each parameter at the time $t$ can change in two ways: increase or decrease; when the magnitude of the increase or decrease of the parameter $w_t^i$ is $\Delta^i$ in both cases, the value $w_{t+1}^i$ of the parameter at a time $t+1$ is either $w_t^i + \Delta^i$ or $w_t^i - \Delta^i$;

2-2) determining the action space for the reinforcement learning to obtain an action value for calibrating the parameters of the hydrology forecasting model, wherein the action value is set to control the accuracy of the parameters after calibration and thereby the accuracy of the hydrology forecasting model, specifically: defining the action space $A$ for the reinforcement learning as all possibilities in which each of the parameters to be calibrated changes:

$$A = \begin{bmatrix}
\Delta_1^1 & \Delta_1^2 & \Delta_1^3 & \cdots & \Delta_1^{N-1} & \Delta_1^N \\
-\Delta_2^1 & \Delta_2^2 & \Delta_2^3 & \cdots & \Delta_2^{N-1} & \Delta_2^N \\
-\Delta_3^1 & -\Delta_3^2 & \Delta_3^3 & \cdots & \Delta_3^{N-1} & \Delta_3^N \\
\vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
\Delta_{2^N-1}^1 & \Delta_{2^N-1}^2 & \Delta_{2^N-1}^3 & \cdots & &
\end{bmatrix}$$

…
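As a minimal sketch of the state and action definitions in steps 2-1) and 2-2), the snippet below enumerates every combination of "increase" or "decrease" for N parameters, giving 2^N actions, and applies one action to a state vector. The names `build_action_space` and `apply_action`, the example strides, and the clipping of each parameter to its value range are assumptions made for illustration; the row order produced here is simply the order `itertools.product` yields and is not claimed to match the matrix in the claim.

```python
from itertools import product

def build_action_space(strides):
    """Return all 2**N sign combinations of the per-parameter strides."""
    return [tuple(sign * d for sign, d in zip(signs, strides))
            for signs in product((1, -1), repeat=len(strides))]

def apply_action(state, action, bounds):
    """Move each parameter by its signed stride.

    Clipping to (low, high) bounds is an assumption of this sketch,
    not something stated in the claim text.
    """
    return tuple(min(max(w + dw, low), high)
                 for w, dw, (low, high) in zip(state, action, bounds))

# Hypothetical example with N = 3 parameters.
strides = (0.5, 0.25, 1.0)                 # stride for each parameter
bounds = ((0.0, 2.0), (0.0, 1.0), (0.0, 20.0))
state = (1.0, 0.5, 10.0)                   # s_t = (w_t^1, w_t^2, w_t^3)

actions = build_action_space(strides)
next_state = apply_action(state, actions[1], bounds)
print(len(actions))   # 8
print(actions[1])     # (0.5, 0.25, -1.0)
print(next_state)     # (1.5, 0.75, 9.0)
```

Because the action count doubles with every added parameter, a discrete action space of this form stays tractable only for a modest number of calibrated parameters.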
Backpropagation, e.g. using gradient descent · CPC title
based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO] · CPC title
Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping · CPC title
Multi-objective optimisation, e.g. Pareto optimisation using simulated annealing [SA], ant colony algorithms or genetic algorithms [GA] · CPC title
Fluids · CPC title