US11119677B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11119677-B2 |
| Application number | US-201815916228-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 8, 2018 |
| Priority date | Dec 15, 2017 |
| Publication date | Sep 14, 2021 |
| Grant date | Sep 14, 2021 |
Official abstract text for this publication.
A storage device and a method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps; the operation includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.
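The lookup-table idea in the abstract can be illustrated with a minimal software sketch. This is not the patented hardware: the function names `build_computation_lut` and `lookup_products`, the dictionary-of-lists layout, and the 4-bit unsigned value range (`num_values=16`) are all assumptions made for illustration.

```python
from typing import Dict, List


def build_computation_lut(weights: List[int], num_values: int) -> Dict[int, List[int]]:
    """Build one row per distinct kernel weight.

    Column c of the row for weight w holds the precomputed product w * c.
    The value range 0..num_values-1 is an assumption, not from the patent.
    """
    return {w: [w * v for v in range(num_values)] for w in set(weights)}


def lookup_products(lut: Dict[int, List[int]], weight: int,
                    feature_values: List[int]) -> List[int]:
    """Replace multiplications by table reads for one weight.

    Selecting the row is loosely analogous to loading it into a row buffer;
    each feature-map value then selects a column in that row.
    """
    row = lut[weight]
    return [row[v] for v in feature_values]


kernel = [3, -1, 2]                                  # hypothetical kernel weights
lut = build_computation_lut(kernel, num_values=16)   # assumes 4-bit unsigned values
products = lookup_products(lut, 3, [5, 0, 15])       # one value from each of three feature maps
print(products)  # [15, 0, 45]
```

The table costs memory proportional to (number of distinct weights) × (number of possible values), which is the trade the abstract describes: a stored product is read instead of being computed.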
Opening claim text (preview).
What is claimed is:
1. A storage device comprising: a host that sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, the kernel comprising a plurality of weights, the input feature maps comprising a plurality of values, the operation comprising determining a product of a first weight of the kernel and a first value of the plurality of values of each of two or more of the plurality of input feature maps; a logic die coupled to the host and configured to receive the command; and a high bandwidth memory (HBM) stack comprising a memory die coupled to the logic die comprising a memory array, wherein: the kernel and the plurality of input feature maps are stored in the memory array, a computation lookup table is stored in the memory array, the computation lookup table having a plurality of rows, a row of the plurality of rows corresponding to one of the plurality of weights of the kernel, the computation lookup table having a plurality of columns, a column of the plurality of columns corresponding to one of the values of the plurality of input feature maps, and a result value is stored at a position in the computation lookup table, the position corresponding to a row of the plurality of rows and a column of the plurality of columns, the result value being the product of the weight corresponding to the row and the value corresponding to the column.
2. The storage device of claim 1, further comprising: a first row decoder, wherein the memory die enters the first weight into the first row decoder to load the row of the computation lookup table corresponding to the first weight into a row buffer, and a second row decoder, wherein the memory die enters the first weight into the second row decoder to load the first value of each of the two or more of the plurality of input feature maps into an intermediate buffer.
3. The storage device of claim 2, further comprising: a column access scheduler and a column decoder, wherein the column access scheduler is configured to receive the first value of each of the two or more of the plurality of input feature maps from the intermediate buffer and, for each first value of each of the two or more of the plurality of input feature maps, to control the column decoder to access the result value at a position in the row buffer corresponding to the column corresponding to the first value, and to output the result value to a read buffer.
4. The storage device of claim 3, wherein the logic die comprises a processing element, wherein the read buffer, upon receiving the result value for each first value in the intermediate buffer, outputs the result value for each first value to the logic die, and the processing element is configured to process the result value for each first value.
5. The storage device of claim 4, wherein the processing element is configured to receive the result value corresponding to the first value corresponding to a first input feature map, and is configured to combine the received result value with other received result values for the first input feature map to generate an output value for the first input feature map.
6. The storage device of claim 1, wherein: the host sends a second command to perform a second operation utilizing a second kernel and a second plurality of input feature maps, a second computation lookup table is stored in the memory array, the second computation lookup table having a plurality of rows, a row of the plurality of rows of the second computation lookup table corresponding to one of the plurality of weights of the second kernel, the second computation lookup table having a plurality of columns, a column of the plurality of columns of the second computation lookup table corresponding to one of the values of the second plurality of input feature maps, and a result value is stored at a position in the second computation lookup table, the position corresponding to a row of the plurality of rows and a column of the plurality of columns of the second computation lookup table, the result value being the product of the weight corresponding to the row and the value corresponding to the column.
7. The storage device of claim 6, wherein the second command is to perform a convolution operation, convolving the second kernel with two or more of the second plurality of input feature maps.
8. The storage device of claim 1, wherein the command is to perform a matrix multiplication operation, multiplying the kernel with the two or more of the plurality of input feature maps.
9. The storage device of claim 1, wherein the storage device is configured to store the kernel, the plurality of input feature maps, and the computation lookup table in the memory array based on the command.
10. The storage device of claim 9, wherein a percentage of the memory array allocated to the computation lookup table is based on the operation identified by the command.
11. A method of controlling a memory device, the method comprising: sending a command to a logic die to perform an operation utilizing a kernel and a plurality of input feature maps, the kernel comprising a plurality of weights, the input feature maps comprising a plurality of values, the operation comprising determining a product of a first weight of the kernel and a first value of the plurality of values of the two or more of the plurality of input feature maps; storing the kernel and the plurality of input feature maps in a memory array; storing a computation lookup table in the memory array, the computation lookup table having a plurality of rows, a row of the plurality of rows corresponding to one of the plurality of weights of the kernel, the computation lookup table having a plurality of columns, a column of the plurality of columns corresponding to one of the values of the plurality of input feature maps, wherein a result value is stored at a position in the computation lookup table, the position corresponding to a row of the plurality of rows and a column of the plurality of columns, the result value being the product of the weight corresponding to the row and the value corresponding to the column.
12. The method of claim 11, further comprising: entering the first weight into a first row decoder to load the row of the computation lookup table corresponding to the first weight into a row buffer, and entering the first weight into a second row decoder to load the first value of the two or more of the plurality of input feature maps into an intermediate buffer.
13. The method of claim 12, further comprising: receiving the first value of each of the two or more of the plurality of input feature maps from the intermediate buffer, for each first value of each of the two or more of the plurality of input feature maps, accessing the result value at a position in the row buffer corresponding to the column corresponding to the first value, and outputting the result value to a read buffer.
14. The method of claim 13, further comprising: outputting the result value for each first value to the logic die upon receiving the result value for each of the first values in the intermediate buffer, and processing the result value for each first value by a processing element.
15. The method of claim 14, wherein processing by the processing element is receiving the result value corresponding to the first value corresponding to a first input feature map, and combining the received result value with other received result values for the first input feature map to generate an output value for the first input feature map.
16. The method of
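The data path recited in claims 2 to 5 (and mirrored in method claims 12 to 15) can be approximated in software as a rough sketch. The function name `lut_dot_products`, the use of the kernel position to pick each feature map's value, the 4-bit unsigned value range, and the modeling of buffers as plain Python lists are all assumptions for illustration; the claims describe hardware decoders and buffers, not this code.

```python
from typing import List


def lut_dot_products(kernel: List[int], feature_maps: List[List[int]],
                     num_values: int = 16) -> List[int]:
    """Accumulate one output per input feature map using only table reads."""
    # Computation lookup table: row per distinct weight, column per value.
    lut = {w: [w * v for v in range(num_values)] for w in set(kernel)}
    outputs = [0] * len(feature_maps)

    for pos, weight in enumerate(kernel):
        # "First row decoder": load the table row for this weight.
        row_buffer = lut[weight]
        # "Second row decoder": gather the matching value from every feature map.
        intermediate_buffer = [fm[pos] for fm in feature_maps]
        # "Column access scheduler/decoder": one column read per buffered value.
        read_buffer = [row_buffer[v] for v in intermediate_buffer]
        # "Processing element": combine results per feature map.
        for i, result in enumerate(read_buffer):
            outputs[i] += result
    return outputs


kernel = [1, 2, 3]                       # hypothetical weights
feature_maps = [[4, 5, 6], [0, 1, 2]]    # hypothetical 4-bit values
result = lut_dot_products(kernel, feature_maps)
print(result)  # [32, 8]
```

Each outer iteration reuses one loaded row across all feature maps, which reflects why the claims pair a single weight with values from "two or more" input feature maps.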
Combinations of networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Free address space management · CPC title
considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title
Improving I/O performance · CPC title