System and method for residual long short term memories (LSTM) network
US-10810482-B2 · Oct 20, 2020 · US
US12530225B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12530225-B2 |
| Application number | US-202117473808-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 13, 2021 |
| Priority date | Sep 13, 2021 |
| Publication date | Jan 20, 2026 |
| Grant date | Jan 20, 2026 |
Official abstract text for this publication.
A computer-implemented method, system and computer program product for processing data. Data, including single data points (e.g., images) or entire sequences of data (e.g., speech, video), is received to be processed. A long short term memory structure is utilized to process the received data, where the long short term memory structure includes hidden state sharing modules for allowing information sharing in hidden states across different tasks. The hidden state sharing modules include broadcast modules which are configured to send hidden states of the current task to all previous modules and collect modules which are configured to collect all the hidden states from all the previous modules. In this manner, catastrophic forgetting is avoided by preventing the loss of previously learned information via the use of hidden state sharing modules.
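The broadcast and collect roles described in the abstract can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the patented implementation: the `TaskModule` class, its `inbox` field, and the `broadcast` and `collect` function names are hypothetical, hidden states are plain lists of floats, and the way a module combines received states is left open because the abstract does not specify it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskModule:
    """Hypothetical stand-in for one task-oriented module of the structure."""
    name: str
    hidden: List[float]
    # Hidden states received from other modules (an assumption of this sketch).
    inbox: List[List[float]] = field(default_factory=list)


def broadcast(current: TaskModule, previous: List[TaskModule]) -> None:
    """Send the current task's hidden state to every previous module."""
    for module in previous:
        module.inbox.append(list(current.hidden))


def collect(current: TaskModule, previous: List[TaskModule]) -> None:
    """Gather the hidden states of every previous module into the current one."""
    for module in previous:
        current.inbox.append(list(module.hidden))


task1 = TaskModule("task1", [0.1, 0.2])
task2 = TaskModule("task2", [0.3, 0.4])
task3 = TaskModule("task3", [0.5, 0.6])  # the current task

broadcast(task3, [task1, task2])
collect(task3, [task1, task2])
```

After these calls each earlier module holds a copy of the current task's hidden state, and the current module holds copies of every earlier module's hidden state, which is the two-way sharing the abstract attributes to the hidden state sharing modules.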
Opening claim text (preview).
The invention claimed is:
1. A computer-implemented method for preventing catastrophic forgetting, the method comprising: receiving data; and processing said received data by utilizing a long short term memory structure, wherein said long short term memory structure comprises hidden state sharing modules for allowing information sharing in hidden states across different tasks, wherein said hidden state sharing modules broadcast hidden states to all previous modules and collect hidden states from all said previous modules thereby preventing a loss of previously learned information so as to avoid catastrophic forgetting.
2. The method as recited in claim 1, wherein said hidden state sharing modules comprise a first module configured to send hidden states of a task to all said previous modules.
3. The method as recited in claim 1, wherein said hidden state sharing modules comprise a second module configured to collect all hidden states from all said previous modules.
4. The method as recited in claim 1, wherein said data comprises a data set for a first task, wherein the method further comprises: updating model parameters of a first task-oriented module with said data set for said first task in response to processing said data set for said first task, wherein said model parameters of said first task-oriented module comprise a matrix and a bias, wherein said first task-oriented module comprises computational blocks that control information flow.
5. The method as recited in claim 4, wherein said data comprises a data set for a second task which is subsequent to said first task, wherein the method further comprises: immobilizing changes to said model parameters of said first task-oriented module in response to processing said second task; and creating a second task-oriented module for said second task in response to processing said second task, wherein said second task-oriented module comprises computational blocks that control information flow.
6. The method as recited in claim 5 further comprising: creating a first hidden state sharing module of said hidden state sharing modules configured to send hidden states of said second task to all said previous modules in response to processing said second task; and creating a second hidden state sharing module of said hidden state sharing modules configured to collect all hidden states from all said previous modules in response to processing said second task.
7. The method as recited in claim 1 further comprising: obtaining an output hidden state of said long short term memory structure by summing hidden states of all modules of said long short term memory structure.
8. A computer program product for preventing catastrophic forgetting, the computer program product comprising one or more computer readable storage mediums having program code embodied therewith, the program code comprising programming instructions for: receiving data; and processing said received data by utilizing a long short term memory structure, wherein said long short term memory structure comprises hidden state sharing modules for allowing information sharing in hidden states across different tasks, wherein said hidden state sharing modules broadcast hidden states to all previous modules and collect hidden states from all said previous modules thereby preventing a loss of previously learned information so as to avoid catastrophic forgetting.
9. The computer program product as recited in claim 8, wherein said hidden state sharing modules comprise a first module configured to send hidden states of a task to all said previous modules.
10. The computer program product as recited in claim 8, wherein said hidden state sharing modules comprise a second module configured to collect all hidden states from all said previous modules.
11. The computer program product as recited in claim 8, wherein said data comprises a data set for a first task, wherein the program code further comprises the programming instructions for: updating model parameters of a first task-oriented module with said data set for said first task in response to classifying, processing or making predictions using said data set for said first task, wherein said model parameters of said first task-oriented module comprise a matrix and a bias, wherein said first task-oriented module comprises computational blocks that control information flow.
12. The computer program product as recited in claim 11, wherein said data comprises a data set for a second task which is subsequent to said first task, wherein the program code further comprises the programming instructions for: immobilizing changes to said model parameters of said first task-oriented module in response to processing said second task; and creating a second task-oriented module for said second task in response to processing said second task, wherein said second task-oriented module comprises computational blocks that control information flow.
13. The computer program product as recited in claim 12, wherein the program code further comprises the programming instructions for: creating a first hidden state sharing module of said hidden state sharing modules configured to send hidden states of said second task to all said previous modules in response to processing said second task; and creating a second hidden state sharing module of said hidden state sharing modules configured to collect all hidden states from all said previous modules in response to processing said second task.
14. The computer program product as recited in claim 8, wherein the program code further comprises the programming instructions for: obtaining an output hidden state of said long short term memory structure by summing hidden states of all modules of said long short term memory structure.
15. A system, comprising: a memory for storing a computer program for preventing catastrophic forgetting; and a processor connected to said memory, wherein said processor is configured to execute program instructions of the computer program comprising: receiving data; and processing said received data by utilizing a long short term memory structure, wherein said long short term memory structure comprises hidden state sharing modules for allowing information sharing in hidden states across different tasks, wherein said hidden state sharing modules broadcast hidden states to all previous modules and collect hidden states from all said previous modules thereby preventing a loss of previously learned information so as to avoid catastrophic forgetting.
16. The system as recited in claim 15, wherein said hidden state sharing modules comprise a first module configured to send hidden states of a task to all said previous modules.
17. The system as recited in claim 15, wherein said hidden state sharing modules comprise a second module configured to collect all hidden states from all said previous modules.
18. The system as recited in claim 15, wherein said data comprises a data set for a first task, wherein the program instructions of the computer program further comprise: updating model parameters of a first task-oriented module with said data set for said first task in response to classifying, processing or making predictions using said data set for said first task, wherein said model parameters of said first task-oriented module comprise a matrix and a bias, wherein said first task-oriented module comprises computational blocks that control information flow.
19. The system as
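The task sequence in claims 4, 5 and 7 (update the first module's matrix and bias, immobilize them when a second task arrives, create a second module, and sum the hidden states of all modules) can be sketched as follows. This is a rough illustration under stated assumptions rather than the claimed implementation: `TaskModule`, `ModularMemory`, `start_task`, and `output_hidden` are hypothetical names, a single tanh layer stands in for the gated computational blocks of a real LSTM cell, and `update` simply adds a supplied delta in place of a real training step.

```python
import math


class TaskModule:
    """Hypothetical task-oriented module holding a weight matrix and a bias."""

    def __init__(self, size):
        self.W = [[0.0] * size for _ in range(size)]
        self.b = [0.0] * size
        self.frozen = False

    def hidden(self, x):
        # A single tanh layer stands in for the gated blocks of a real LSTM cell.
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
                for row, bi in zip(self.W, self.b)]

    def update(self, delta_W, delta_b):
        # Placeholder for a training step; a frozen module rejects the change.
        if self.frozen:
            return False
        self.W = [[w + d for w, d in zip(row, drow)]
                  for row, drow in zip(self.W, delta_W)]
        self.b = [b + d for b, d in zip(self.b, delta_b)]
        return True


class ModularMemory:
    """Hypothetical container that grows by one module per task."""

    def __init__(self, size):
        self.size = size
        self.modules = []

    def start_task(self):
        # Immobilize every earlier module, then create a module for the new task.
        for module in self.modules:
            module.frozen = True
        new_module = TaskModule(self.size)
        self.modules.append(new_module)
        return new_module

    def output_hidden(self, x):
        # Output hidden state is the element-wise sum over all modules.
        states = [module.hidden(x) for module in self.modules]
        return [sum(values) for values in zip(*states)]


model = ModularMemory(size=2)
first = model.start_task()
first.update([[0.1, 0.0], [0.0, 0.1]], [0.0, 0.0])   # task 1 training step
second = model.start_task()                           # freezes `first`
ignored = first.update([[9.0, 9.0], [9.0, 9.0]], [9.0, 9.0])
second.update([[0.2, 0.0], [0.0, 0.2]], [0.1, 0.1])   # task 2 training step
output = model.output_hidden([1.0, 1.0])
```

Because the first module is frozen before the second task is trained, its matrix and bias keep the values learned on the first task, while the summed output still reflects both modules.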
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Transfer learning · CPC title
Architecture, e.g. interconnection topology · CPC title
Learning methods · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title