Neural network unit that interrupts processing core upon condition
US-2018276035-A1 · Sep 27, 2018 · US
US11087203B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11087203-B2 |
| Application number | US-201715618415-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 9, 2017 |
| Priority date | Nov 10, 2016 |
| Publication date | Aug 10, 2021 |
| Grant date | Aug 10, 2021 |
Official abstract text for this publication.
The present application discloses a method and apparatus for processing a data sequence. A specific implementation of the method includes: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence. This implementation improves the data sequence processing efficiency of the recurrent neural network model.
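The processing step in the abstract can be pictured with a small software sketch. Everything specific here is an assumption made for illustration: the abstract does not say which recurrent cell or activation function is used, so the sketch assumes a plain recurrence `h_t = tanh(W_in x_t + W_rec h_(t-1))`, and ordinary Python lists stand in for the FPGA's embedded block RAM. The names `process_sequence`, `w_in`, and `w_rec` are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def process_sequence(inputs, w_in, w_rec):
    """Process each input vector in order with a simple tanh recurrence.

    Assumption: the weights are copied once into local buffers (a stand-in
    for on-chip block RAM) and reused for every item in the sequence.
    """
    # One-time copy of the weights, performed before the loop starts.
    local_w_in = [row[:] for row in w_in]
    local_w_rec = [row[:] for row in w_rec]

    hidden = [0.0] * len(local_w_rec)  # initial hidden state
    outputs = []
    for x in inputs:
        # Combine the current input with the previous hidden state.
        pre_activation = [
            a + b
            for a, b in zip(matvec(local_w_in, x), matvec(local_w_rec, hidden))
        ]
        hidden = [math.tanh(p) for p in pre_activation]
        outputs.append(hidden)
    return outputs
```

The point of the sketch is only the ordering: the weight copy sits outside the loop, so each item in the sequence reuses the locally held weights instead of fetching them again from slower memory.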
Opening claim text (preview).
What is claimed is:

1. A method for processing a data sequence, comprising: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence, wherein before the copying, the method further comprises: calling an address assignment interface to assign a storage address in the embedded block RAM to the weight matrix, and wherein the copying comprises: calling a copying interface to copy the weight matrix stored in a double data rate synchronous dynamic random access memory to the storage address in the embedded block RAM that is assigned to the weight matrix, wherein the copying of the weight matrix is performed only once during the process of processing the to-be-processed data sequence.

2. The method according to claim 1, further comprising: deleting the weight matrix stored in the embedded block RAM after the processed data sequence is output.

3. The method according to claim 2, wherein the deleting the weight matrix stored in the embedded block RAM comprises: calling a deletion interface to delete the weight matrix stored in the embedded block RAM.

4. The method according to claim 1, wherein the embedded block RAM is a static random access memory.

5. An apparatus for processing a data sequence, comprising: at least one processor; and a memory storing instructions, which when executed by the at least one processor, cause the at least one processor to perform operations, the operations comprising: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence, wherein, the operations further comprises, before the copying: calling an address assignment interface to assign a storage address in the embedded block RAM to the weight matrix, and wherein the copying comprises; calling a copying interface to copy the weight matrix stored in a double data rate synchronous dynamic random access memory to the storage address in the embedded block RAM that is assigned to the weight matrix, wherein the copying of the weight matrix is performed only once during the process of processing the to-be-processed data sequence.

6. The apparatus according to claim 5, wherein the operations further comprises: deleting the weight matrix stored in the embedded block RAM after the processed data sequence is output.

7. The apparatus according to claim 6, wherein the deleting the weight matrix stored in the embedded block RAM comprises: calling a deletion interface to delete the weight matrix stored in the embedded block RAM.

8. The apparatus according to claim 5, wherein the embedded block RAM is a static random access memory.

9. A non-transitory storage medium storing one or more programs, the one or more programs when executed by an apparatus, causing the apparatus to perform operations, the operations comprising: receiving an inputted to-be-processed data sequence; copying a weight matrix in a recurrent neural network model to an embedded block random access memory (RAM) of a field-programmable gate array (FPGA); processing sequentially each piece of to-be-processed data in the to-be-processed data sequence by using an activation function in the recurrent neural network model and the weight matrix stored in the embedded block RAM; and outputting a processed data sequence corresponding to the to-be-processed data sequence, wherein the operations further comprises, before the copying: calling an address assignment interface to assign a storage address in the embedded block RAM to the weight matrix, and wherein the copying comprises: calling a copying interface to copy the weight matrix stored in a double data rate synchronous dynamic random access memory to the storage address in the embedded block RAM that is assigned to the weight matrix, wherein the copying of the weight matrix is performed only once during the process of processing the to-be-processed data sequence.

10. The non-transitory storage medium according to claim 9, wherein the operations further comprises: deleting the weight matrix stored in the embedded block RAM after the processed data sequence is output.

11. The non-transitory storage medium according to claim 10, wherein the deleting the weight matrix stored in the embedded block RAM comprises: calling a deletion interface to delete the weight matrix stored in the embedded block RAM.

12. The non-transitory storage medium according to claim 9, wherein the embedded block RAM is a static random access memory.
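The memory lifecycle recited in claims 1 to 3 (assign an address, copy once from DDR memory, process the sequence, then delete) can be sketched as a toy software model. This is not a real FPGA driver API: `BlockRamModel`, `assign_address`, `copy_weights`, `read_row`, `delete_weights`, and `run_sequence` are hypothetical names, a Python list stands in for the embedded block RAM, and the per-item computation is deliberately reduced to one matrix-vector product followed by an assumed `tanh` activation.

```python
import math


class BlockRamModel:
    """Toy stand-in for embedded block RAM (hypothetical, not a vendor API)."""

    def __init__(self, capacity_words):
        self.cells = [None] * capacity_words
        self.next_free = 0
        self.copy_count = 0  # how many times weights were copied in

    def assign_address(self, num_words):
        """Reserve a contiguous region and return its start address."""
        if self.next_free + num_words > len(self.cells):
            raise MemoryError("weight matrix does not fit in block RAM")
        address = self.next_free
        self.next_free += num_words
        return address

    def copy_weights(self, ddr_weights, address):
        """Copy a row-major weight matrix from 'DDR' into the region."""
        flat = [w for row in ddr_weights for w in row]
        self.cells[address:address + len(flat)] = flat
        self.copy_count += 1

    def read_row(self, address, row, cols):
        """Read one matrix row back out of the region."""
        start = address + row * cols
        return self.cells[start:start + cols]

    def delete_weights(self, address, num_words):
        """Clear the region once the output has been produced."""
        self.cells[address:address + num_words] = [None] * num_words


def run_sequence(inputs, ddr_weights, bram):
    rows, cols = len(ddr_weights), len(ddr_weights[0])
    address = bram.assign_address(rows * cols)   # before the copy
    bram.copy_weights(ddr_weights, address)      # performed exactly once
    outputs = []
    for x in inputs:                             # sequential processing
        outputs.append([
            math.tanh(sum(w * v for w, v in
                          zip(bram.read_row(address, r, cols), x)))
            for r in range(rows)
        ])
    bram.delete_weights(address, rows * cols)    # after output is ready
    return outputs
```

In this model, calling `run_sequence` on a sequence of any length leaves `bram.copy_count` at 1, which mirrors the "performed only once" limitation; the claims themselves do not prescribe how the interfaces are implemented.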
Recurrent networks, e.g. Hopfield networks · CPC title
using electronic means · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Physics · mapped topic