Method for the computer-aided learning of a recurrent neural network for modeling a dynamic system

US9235800B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9235800-B2
Application number  US-201113640543-A
Country             US
Kind code           B2
Filing date         Apr 12, 2011
Priority date       Apr 14, 2010
Publication date    Jan 12, 2016
Grant date          Jan 12, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for the computer-aided learning of a recurrent neural network for modeling a dynamic system which is characterized at respective times by an observable vector with one or more observables as entries is provided. The neural network includes both a causal network with a flow of information that is directed forwards in time and a retro-causal network with a flow of information which is directed backwards in time. The states of the dynamic system are characterized by first state vectors in the causal network and by second state vectors in the retro-causal network, wherein the state vectors each contain observables for the dynamic system and also hidden states of the dynamic system. Both networks are linked to one another by a combination of the observables from the relevant first and second state vectors and are learned on the basis of training data including known observable vectors.
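
The structure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It borrows three details from the claims on this page: tanh is applied before the matrix multiplication (claims 7, 9 and 10), matching observable entries are added (claim 4), and the observable entries are taken to be the leading entries of each state vector, which is how the [Id,0] filtering matrix of claim 15 is read here. The function names, the boundary states `s_first` and `s_last`, and the example matrices are assumptions made for illustration only.

```python
import math

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def step(matrix, state):
    """Apply tanh to the state first, then multiply by the network's matrix."""
    return matvec(matrix, [math.tanh(x) for x in state])

def unroll(a_causal, a_retro, s_first, s_last, steps, n_obs):
    """Return the combined observable vectors for time indices 0..steps-1."""
    # Causal network: information flows forward in time from s_first.
    causal = [s_first]
    for _ in range(steps - 1):
        causal.append(step(a_causal, causal[-1]))
    # Retro-causal network: information flows backward in time from s_last.
    retro = [s_last]
    for _ in range(steps - 1):
        retro.append(step(a_retro, retro[-1]))
    retro.reverse()  # retro[t] now lines up with causal[t]
    # Keep the first n_obs entries of each state and combine them by addition.
    return [[s[i] + r[i] for i in range(n_obs)] for s, r in zip(causal, retro)]

# Illustrative values only (assumption: 2 observables plus 1 hidden state).
A = [[0.5, 0.1, 0.0], [0.0, 0.5, 0.1], [0.1, 0.0, 0.5]]
A_RETRO = [[0.4, 0.0, 0.1], [0.1, 0.4, 0.0], [0.0, 0.1, 0.4]]
observables = unroll(A, A_RETRO, [0.1, 0.0, -0.1], [0.2, 0.1, 0.0], steps=4, n_obs=2)
```

Each entry of `observables` mixes a contribution that was propagated forward from the start of the window with one that was propagated backward from its end, which is the link between the two networks that the abstract describes.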

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for computer-aided learning of a recurrent neural network for modeling a dynamic system which is characterized at respective times by an observable vector comprising one or more observables as entries, the method comprising: providing a recurrent neural network comprising a causal network, a retro-causal network, an observable vector with one or more observables, wherein the causal network describes an information flow proceeding forward in time between first state vectors of the dynamic system, wherein a first state vector at a respective point in time comprises one or more first entries which are each assigned to an entry of the observable vector, and one or more hidden states of the dynamic system, wherein the retro-causal network describes an information flow proceeding backward in time between second state vectors of the dynamic system, wherein a second state vector at a respective point in time comprises one or more second entries which are each assigned to an entry of the observable vector, and one or more hidden states of the dynamic system, determining the observable vector by respectively combining the first entries of the first state vector with the second entries of the second state vector for respective points of time before and after a current point of time, wherein the causal network and the retro-causal network are learned based on training data which contains a sequence of consecutive known observable vectors.
2. The method as claimed in claim 1, wherein, during learning of the causal and retro-causal networks at a respective point in time, for which a known observable vector from the training data exists, the first and second entries of the first and second state vectors are corrected using the difference between the observable vector determined in the recurrent neural network and the known observable vector at the respective point in time, the first and second state vectors with the corrected first and second entries continuing to be used for learning.
3. The method as claimed in claim 1, wherein the causal network and the retro-causal network are learned based on error-back-propagation with shared weights.
4. The method as claimed in claim 1, wherein, in the recurrent neural network at a respective point in time, the observable vector is determined such that the respective first and second entries which are assigned to the same entry of the observable vector are added.
5. The method as claimed in claim 1, wherein, during learning of the causal and retro-causal networks at a respective point in time, for which a known observable vector from the training data exists, a target value is determined which represents the difference vector between the observable vector determined in the recurrent neural network and the known observable vector at the respective point in time, wherein the minimization of the sum of the absolute values or squared absolute values of the difference vectors at the respective points in time, for which a known observable vector from the training data exists, is predefined as the learning optimization target.
6. The method as claimed in claim 1, wherein in the causal network a first state vector at a respective point in time is converted into a first state vector at a subsequent point in time by multiplication by a matrix assigned to the causal network and the application of an activation function.
7. The method as claimed in claim 6, wherein first the activation function is applied to the first state vector at the respective point in time and then multiplication by the matrix assigned to the causal network is performed.
8. The method as claimed in claim 1, wherein in the retro-causal network a second state vector at a respective point in time is converted into a second state vector at a previous point in time by multiplication by a matrix assigned to the retro-causal network and the application of an activation function.
9. The method as claimed in claim 8, wherein first the activation function is applied to the second state vector at the respective point in time and then multiplication by the matrix assigned to the retro-causal network is performed.
10. The method as claimed in claim 6, wherein the activation function is a tanh function.
11. The method as claimed in claim 1, wherein the recurrent neural network is used to model energy price and/or commodity price changes over time.
12. The method as claimed in claim 1, wherein the recurrent neural network is used to model a technical system.
13. The method as claimed in claim 12, wherein the technical system is a gas turbine or a wind turbine.
14. A non-transitory computer readable medium comprising program code for carrying out a method when the program is executed on a computer, wherein the method is for computer-aided learning of a recurrent neural network for modeling a dynamic system which is characterized at respective times by an observable vector comprising one or more observables as entries, the method comprising: using a recurrent neural network comprising a causal network, a retro-causal network, an observable vector with one or more observables, wherein the causal network describes an information flow proceeding forward in time between first state vectors of the dynamic system, wherein a first state vector at a respective point in time comprises one or more first entries which are each assigned to an entry of the observable vector, and one or more hidden states of the dynamic system, wherein the retro-causal network describes an information flow proceeding backward in time between second state vectors of the dynamic system, wherein a second state vector at a respective point in time comprises one or more second entries which are each assigned to an entry of the observable vector, and one or more hidden states of the dynamic system, determining the observable vector by respectively combining the first entries of the first state vector with the second entries of the second state vector for respective points of time before and after a current point of time, wherein the causal network and the retro-causal network are learned based on training data which contains a sequence of consecutive known observable vectors.
15. The method according to claim 1 wherein the provided recurrent neural network corresponds at least in part to the following: where: N1 corresponds to the causal network; N2 corresponds to the retro-causal network; t−6 to t+3 correspond to the respective points in time; S_{t−6} to S_{t+3} correspond to the first state vector entries at the respective points in time; S′_{t−6} to S′_{t+3} correspond to the second state vector entries at the respective points in time; Y_{t−6} to Y_{t+3} correspond to the observable vector entries at the respective points in time; Y_{t−6} to Y_t correspond to the consecutive known observable vector entries; and [Id,0] corresponds to a filtering matrix.
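
The learning-related steps in claims 2, 4 and 5 can be sketched for a single point in time as follows. This is one possible reading, not the patent's definitive procedure: the claims say the observable entries are corrected "using the difference" but do not give a formula on this page, so subtracting the difference from the observable entries of a state vector is an assumption. All function names and example numbers are hypothetical.

```python
def combine(s_causal, s_retro, n_obs):
    """Claim 4: add the first and second entries assigned to the same observable."""
    return [s_causal[i] + s_retro[i] for i in range(n_obs)]

def difference(predicted, known):
    """Difference vector between the determined and the known observable vector."""
    return [p - k for p, k in zip(predicted, known)]

def correct(state, diff):
    """Assumed form of the claim 2 correction: subtract the difference from the
    observable entries and leave the hidden states untouched."""
    n_obs = len(diff)
    return [x - d for x, d in zip(state, diff)] + list(state[n_obs:])

def squared_error_target(predicted_seq, known_seq):
    """Claim 5, squared variant: sum of squared norms of the difference vectors
    over all points in time that have a known observable vector."""
    return sum(
        d * d
        for y, y_known in zip(predicted_seq, known_seq)
        for d in difference(y, y_known)
    )

# Illustrative values only (assumption: 2 observables plus 1 hidden state).
s = [0.5, 0.2, 0.9]        # first state vector (causal network)
r = [0.1, -0.4, 0.3]       # second state vector (retro-causal network)
y_known = [0.4, 0.0]       # known observable vector from the training data

y = combine(s, r, n_obs=2)
diff = difference(y, y_known)
s_corrected = correct(s, diff)
r_corrected = correct(r, diff)
```

Under this reading, a corrected state vector combined with the other network's uncorrected one reproduces the known observable vector, so each network continues from a state that is consistent with the training data. Minimizing `squared_error_target` over the weight matrices would then be the optimization target; the back-propagation with shared weights named in claim 3 is not shown here.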

Classifications

  • G06N3/084 · Primary

    Backpropagation, e.g. using gradient descent · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9235800B2 cover?
A method for the computer-aided learning of a recurrent neural network for modeling a dynamic system which is characterized at respective times by an observable vector with one or more observables as entries is provided. The neural network includes both a causal network with a flow of information that is directed forwards in time and a retro-causal network with a flow of information which is di…
Who is the assignee on this patent?
Grothmann Ralph, Tietz Christoph, Zimmermann Hans-Georg, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 12, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).