Neural network unit with output buffer feedback and masking capability with processing unit groups that operate as recurrent neural network LSTM cells

US10346351B2 · US · B2

Patent metadata
Publication number: US-10346351-B2
Application number: US-201615090829-A
Country: US
Kind code: B2
Filing date: Apr 5, 2016
Priority date: Oct 8, 2015
Publication date: Jul 9, 2019
Grant date: Jul 9, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An output buffer holds N words arranged as N/J mutually exclusive output buffer word groups (OBWG) of J words each. N processing units (PU) are arranged as N/J mutually exclusive PU groups each having an associated OBWG. Each PU has an accumulator, arithmetic unit, and first and second multiplexed registers each having at least J+1 inputs. A first input receives a memory operand and the other J inputs receive the J words of the associated OBWG. Each accumulator provides its output to a respective OBWG. Each arithmetic unit performs an operation on the first and second multiplexed register outputs and accumulator output to generate a result for accumulation into the accumulator. A mask input to the output buffer controls which words, if any, of the N words retain their current value or are updated with their respective accumulator output. Each PU group operates as a recurrent neural network LSTM cell.
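The data path described in the abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented hardware: the function names (`group_words`, `mux_select`, `masked_update`) are hypothetical, words are modeled as floats, and the mask polarity (True means "retain") is an assumption, since the abstract does not say which bit value retains a word.

```python
from typing import List, Sequence


def group_words(buffer: Sequence[float], pu_index: int, j: int) -> List[float]:
    """Return the J words of the output buffer word group tied to a PU.

    PUs 0..J-1 share group 0, PUs J..2J-1 share group 1, and so on.
    """
    start = (pu_index // j) * j
    return list(buffer[start:start + j])


def mux_select(memory_operand: float, obwg: Sequence[float], select: int) -> float:
    """Model a multiplexed register with J+1 inputs.

    Input 0 is the memory operand; inputs 1..J are the J words of the
    associated output buffer word group.
    """
    if not 0 <= select <= len(obwg):
        raise ValueError("select must be in the range 0..J")
    return memory_operand if select == 0 else obwg[select - 1]


def masked_update(
    buffer: Sequence[float],
    accumulators: Sequence[float],
    retain_mask: Sequence[bool],
) -> List[float]:
    """Update each buffer word from its accumulator unless its mask bit is set.

    Assumption: a True mask bit means "retain the current value".
    """
    return [
        old if keep else acc
        for old, acc, keep in zip(buffer, accumulators, retain_mask)
    ]


# Example with N = 8 words and J = 4 (two groups of four words each).
buffer = [0.0] * 8
accumulators = [float(i) for i in range(1, 9)]
retain = [False, False, False, True, False, False, False, True]
buffer = masked_update(buffer, accumulators, retain)
```

In this example the fourth word of each group keeps its old value while the other three are overwritten, which mirrors how a mask could let some words of a group be written in one step and the rest in a later step.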

First claim

Opening claim text (preview).

The invention claimed is:

1. An apparatus, comprising: an output buffer that holds N words arranged as N/J mutually exclusive output buffer word groups of J words each of the N words, J is greater than 2 and N is at least twice J; an array of N processing units (PU) arranged as N/J mutually exclusive PU groups of J PUs each of the N PUs, each PU group of the N/J PU groups has an associated output buffer word group of the N/J output buffer word groups, each PU having: first and second multiplexed registers each having: at least J+1 inputs, a first input of the J+1 inputs receives an operand from a memory and the other J inputs receive the J words of the associated output buffer word group; an output; and a control input that controls selection of the J+1 inputs for provision on the output; an accumulator having an output for provision to a respective one of the N output buffer words; and an arithmetic unit having first and second inputs to receive the output of the first and second multiplexed registers, respectively, and a third input that receives the accumulator output, the arithmetic unit performs an operation on the first, second and third inputs to generate a result for accumulation into the accumulator; the output buffer includes a mask input that controls which words, if any, of the N words retain their current value or are updated with their respective accumulator output; and each PU group of the N/J PU groups of J PUs operates as a Long Short Term Memory (LSTM) cell of a recurrent neural network, a first of the J PUs computes an input gate, a second of the J PUs computes a forget gate, and a third of the J PUs computes an output gate of the LSTM cell.

2. The apparatus of claim 1, further comprising: the mask input specifies to update first, second and third of the J words of the associated output buffer word group with the input gate, forget gate and output gates computed by the respective first, second and third of the J PUs.

3. The apparatus of claim 2, further comprising: the first, second and third of the J PUs compute the input gate, forget gate and output gates concurrently.

4. The apparatus of claim 2, further comprising: a fourth of the J PUs computes a candidate state of the LSTM cell.

5. The apparatus of claim 4, further comprising: the mask input specifies to update a fourth of the J words of the associated output buffer word group with the candidate state of the LSTM cell but to retain the current value of the first, second and third of the J words of the associated output buffer word group.

6. The apparatus of claim 4, further comprising: one of the J PUs computes the new state of the LSTM cell and an activation function thereof using the input gate, the forget gate, the candidate state of the LSTM cell, and a current state of the LSTM cell.

7. The apparatus of claim 6, further comprising: a memory from which the one of the J PUs reads the current state of the LSTM cell and to which the output buffer writes the new state of the LSTM cell.

8. The apparatus of claim 6, further comprising: one of the J PUs computes a new output of the LSTM cell using the output gate and the activation function of the new state of the LSTM cell.

9. The apparatus of claim 8, further comprising: a memory from which the J PUs read a current output of the LSTM cell and to which the output buffer writes the new output of the LSTM cell.

10. The apparatus of claim 1, further comprising: the first, second and third of the J PUs compute the input gate, forget gate and output gate, respectively, using a current output of the LSTM cell and respective weights and using a new input to the LSTM cell and respective weights.

11. The apparatus of claim 10, further comprising: the first, second and third of the J PUs read the current output from the output buffer.

12. The apparatus of claim 10, further comprising: a memory from which the first, second and third of the J PUs read the new input.

13. The apparatus of claim 10, further comprising: a memory from which the first, second and third of the J PUs read the weights.

14. A method for operating an apparatus having an output buffer that holds N words arranged as N/J mutually exclusive output buffer word groups of J words each of the N words, J is greater than 2 and N is at least twice J, an array of N processing units (PU) arranged as N/J mutually exclusive PU groups of J PUs each of the N PUs, each PU group of the N/J PU groups has an associated output buffer word group of the N/J output buffer word groups, the output buffer includes a mask input that controls which words, if any, of the N words retain their current value or are updated with their respective accumulator output, each PU has first and second multiplexed registers each having an output, an accumulator having an output for provision to a respective one of the N output buffer words, and an arithmetic unit having first and second inputs to receive the output of the first and second multiplexed registers, respectively, and a third input that receives the accumulator output, the arithmetic unit performs an operation on the first, second and third inputs to generate a result for accumulation into the accumulator, each of the first and second multiplexed registers has at least J+1 inputs, a first input of the J+1 inputs receives an operand from a memory and the other J inputs receive the J words of the associated output buffer word group, an output, and a control input that controls selection of the J+1 inputs for provision on the output, the method comprising: by each PU group of the N/J PU groups of J PUs, operating as a Long Short Term Memory (LSTM) cell of a recurrent neural network by: computing, by a first of the J PUs, an input gate of the LSTM cell; computing, by a second of the J PUs, a forget gate of the LSTM cell; and computing, by a third of the J PUs, an output gate of the LSTM cell.

15. The method of claim 14, further comprising: specifying, by the mask input, to update first, second and third of the J words of the associated output buffer word group with the input gate, forget gate and output gates computed by the respective first, second and third of the J PUs.

16. The method of claim 15, further comprising: computing, by the first, second and third of the J PUs, the input gate, forget gate and output gates concurrently.

17. The method of claim 15, further comprising: computing, by a fourth of the J PUs, a candidate state of the LSTM cell.

18. The method of claim 17, further comprising: specifying, by the mask input, to update a fourth of the J words of the associated output buffer word group with the candidate state of the LSTM cell but to retain the current value of the first, second and third of the J words of the associated output buffer word group.

19. The method of claim 17, further comprising: computing, by one of the J PUs, the new state of the LSTM cell and an activation function thereof using the input gate, the forget gate, the candidate state of the LSTM cell, and a current state of the LSTM cell.

20. The method of claim 19, further comprising: reading, by the one of the J PUs, from a memory the current state of the LSTM cell; and writing, by the output buffer, to the memory the new state of the LSTM cell.

21. The method of claim 19, further comprising: computing, by one of the J PUs, a new output of the LSTM cell using the output gate and the activation function of the new state of the LSTM cell.

22. The method of claim 21, further comprising: reading, by the J
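The sequence of quantities the claims assign to a PU group (input, forget and output gates, candidate state, new state, new output) can be illustrated with a scalar LSTM step. This is a hedged sketch rather than the claimed implementation: the claims do not name the activation functions or mention biases, so the sigmoid gates, tanh activations, bias terms, and the names `GateWeights` and `lstm_step` are assumptions taken from the common textbook LSTM formulation. Scalars stand in for the vector operands a real device would accumulate.

```python
import math
from dataclasses import dataclass


def sigmoid(z: float) -> float:
    """Logistic function, assumed here as the gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


@dataclass
class GateWeights:
    """Hypothetical per-gate weights: one for the new input, one for the current output."""
    w_x: float
    w_h: float
    bias: float = 0.0  # bias is an assumption; the claims do not mention it


def lstm_step(x, h_prev, c_prev, w_i, w_f, w_o, w_c):
    """One LSTM cell step, ordered to follow the roles described in the claims."""

    def pre(w: GateWeights) -> float:
        # Weighted sum of the new input and the current output (claim 10).
        return w.w_x * x + w.w_h * h_prev + w.bias

    i = sigmoid(pre(w_i))          # first PU: input gate
    f = sigmoid(pre(w_f))          # second PU: forget gate
    o = sigmoid(pre(w_o))          # third PU: output gate
    c_tilde = math.tanh(pre(w_c))  # fourth PU: candidate state (claim 4)

    c_new = i * c_tilde + f * c_prev  # new state (claim 6)
    h_new = o * math.tanh(c_new)      # new output (claim 8)
    return h_new, c_new


# Example: with all-zero weights every gate evaluates to 0.5 and the
# candidate state to 0, so the new state is half of the previous state.
zero = GateWeights(0.0, 0.0)
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w_i=zero, w_f=zero, w_o=zero, w_c=zero)
```

The three gates depend only on the new input and the current output, not on each other, which is consistent with the claims stating that the first, second and third PUs can compute them concurrently.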

Assignees

Via Alliance Semiconductor Co Ltd

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks

  • Analogue means

  • Combinations of networks

  • Program or instruction counter, e.g. incrementing

  • Using electronic means

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10346351B2 cover?
An output buffer holds N words arranged as N/J mutually exclusive output buffer word groups (OBWG) of J words each. N processing units (PU) are arranged as N/J mutually exclusive PU groups each having an associated OBWG. Each PU has an accumulator, arithmetic unit, and first and second multiplexed registers each having at least J+1 inputs. A first input receives a memory operand and the other J…
Who is the assignee on this patent?
Via Alliance Semiconductor Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F9/30032. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 9, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).