What technology area does this patent fall under?

Primary CPC classification G06F9/5038. Mapped technology areas include Physics.

When was this patent published?

Publication date: Sep 14, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

HBM based memory lookup engine for deep learning accelerator

US11119677B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11119677-B2
Application number	US-201815916228-A
Country	US
Kind code	B2
Filing date	Mar 8, 2018
Priority date	Dec 15, 2017
Publication date	Sep 14, 2021
Grant date	Sep 14, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, includes finding the product of a weight of the kernel and values of multiple input feature maps. The computation lookup table includes a row corresponding to a weight of the kernel, and a column corresponding to a value of the input feature maps. A result value stored at a position corresponding to a row and a column is the product of the weight corresponding to the row and the value corresponding to the column.
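The lookup-table idea in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the 2-bit value range and the names `build_lookup_table` and `lookup_product` are assumptions chosen for illustration. Each row corresponds to a kernel weight, each column to an input feature map value, and each cell holds their precomputed product.

```python
def build_lookup_table(weights, values):
    """Return table[r][c] = weights[r] * values[c] (precomputed products)."""
    return [[w * v for v in values] for w in weights]


def lookup_product(table, weights, values, weight, value):
    """Fetch a product by row (weight) and column (value) instead of multiplying."""
    row = weights.index(weight)   # row corresponding to the weight
    col = values.index(value)     # column corresponding to the feature map value
    return table[row][col]


# Hypothetical example: weights and values both range over 0..3 (an assumption).
weights = [0, 1, 2, 3]
values = [0, 1, 2, 3]
table = build_lookup_table(weights, values)

product = lookup_product(table, weights, values, weight=3, value=2)
print(product)  # 6
```

The table grows with the number of distinct weights times the number of distinct values, so a sketch like this is only practical for small, quantized ranges.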

First claim

Opening claim text (preview).

What is claimed is:

1. A storage device comprising: a host that sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, the kernel comprising a plurality of weights, the input feature maps comprising a plurality of values, the operation comprising determining a product of a first weight of the kernel and a first value of the plurality of values of each of two or more of the plurality of input feature maps; a logic die coupled to the host and configured to receive the command; and a high bandwidth memory (HBM) stack comprising a memory die coupled to the logic die comprising a memory array, wherein: the kernel and the plurality of input feature maps are stored in the memory array, a computation lookup table is stored in the memory array, the computation lookup table having a plurality of rows, a row of the plurality of rows corresponding to one of the plurality of weights of the kernel, the computation lookup table having a plurality of columns, a column of the plurality of columns corresponding to one of the values of the plurality of input feature maps, and a result value is stored at a position in the computation lookup table, the position corresponding to a row of the plurality of rows and a column of the plurality of columns, the result value being the product of the weight corresponding to the row and the value corresponding to the column.

2. The storage device of claim 1, further comprising: a first row decoder, wherein the memory die enters the first weight into the first row decoder to load the row of the computation lookup table corresponding to the first weight into a row buffer, and a second row decoder, wherein the memory die enters the first weight into the second row decoder to load the first value of each of the two or more of the plurality of input feature maps into an intermediate buffer.

3. The storage device of claim 2, further comprising: a column access scheduler and a column decoder, wherein the column access scheduler is configured to receive the first value of each of the two or more of the plurality of input feature maps from the intermediate buffer and, for each first value of each of the two or more of the plurality of input feature maps, to control the column decoder to access the result value at a position in the row buffer corresponding to the column corresponding to the first value, and to output the result value to a read buffer.

4. The storage device of claim 3, wherein the logic die comprises a processing element, wherein the read buffer, upon receiving the result value for each first value in the intermediate buffer, outputs the result value for each first value to the logic die, and the processing element is configured to processes the result value for each first value.

5. The storage device of claim 4, wherein the processing element is configured to receive the result value corresponding to the first value corresponding to a first input feature map, and is configured to combine the received result value with other received result values for the first input feature map to generate an output value for the first input feature map.

6. The storage device of claim 1, wherein: the host sends a second command to perform a second operation utilizing a second kernel and a second plurality of input feature maps, a second computation lookup table is stored in the memory array, the second computation lookup table having a plurality of rows, a row of the plurality of rows of the second computation lookup table corresponding to one of the plurality of weights of the second kernel, the second computation lookup table having a plurality of columns, a column of the plurality of columns of the second computation lookup table corresponding to one of the values of the second plurality of input feature maps, and a result value is stored at a position in the second computation lookup table, the position corresponding to a row of the plurality of rows and a column of the plurality of columns of the second computation lookup table, the result value being the product of the weight corresponding to the row and the value corresponding to the column.

7. The storage device of claim 6, wherein the second command is to perform a convolution operation, convolving the second kernel with two or more of the second plurality of input feature maps.

8. The storage device of claim 1, wherein the command is to perform a matrix multiplication operation, multiplying the kernel with the two or more of the plurality of input feature maps.

9. The storage device of claim 1, wherein the storage device is configured to store the kernel, the plurality of input feature maps, and the computation lookup table in the memory array based on the command.

10. The storage device of claim 9, wherein a percentage of the memory array allocated to the computation lookup table is based on the operation identified by the command.

11. A method of controlling a memory device, the method comprising: sending a command to a logic die to perform an operation utilizing a kernel and a plurality of input feature maps, the kernel comprising a plurality of weights, the input feature maps comprising a plurality of values, the operation comprising determining a product of a first weight of the kernel and a first value of the plurality of values of the two or more of the plurality of input feature maps; storing the kernel and the plurality of input feature maps in a memory array; storing a computation lookup table in the memory array, the computation lookup table having a plurality of rows, a row of the plurality of rows corresponding to one of the plurality of weights of the kernel, the computation lookup table having a plurality of columns, a column of the plurality of columns corresponding to one of the values of the plurality of input feature maps, wherein a result value is stored at a position in the computation lookup table, the position corresponding to a row of the plurality of rows and a column of the plurality of columns, the result value being the product of the weight corresponding to the row and the value corresponding to the column.

12. The method of claim 11, further comprising: entering the first weight into a first row decoder to load the row of the computation lookup table corresponding to the first weight into a row buffer, and entering the first weight into a second row decoder to load the first value of the two or more of the plurality of input feature maps into an intermediate buffer.

13. The method of claim 12, further comprising: receiving the first value of each of the two or more of the plurality of input feature maps from the intermediate buffer, for each first value of each of the two or more of the plurality of input feature maps, accessing the result value at a position in the row buffer corresponding to the column corresponding to the first value, and outputting the result value to a read buffer.

14. The method of claim 13, further comprising: outputting the result value for each first value to the logic die upon receiving the result value for each of the first values in the intermediate buffer, and processing the result value for each first value by a processing element.

15. The method of claim 14, wherein processing by the processing element is receiving the result value corresponding to the first value corresponding to a first input feature map, and combining the received result value with other received result values for the first input feature map to generate an output value for the first input feature map.

16. The method of
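The data flow described in claims 2 through 5 (row buffer, intermediate buffer, column access, read buffer, processing element) can be sketched in software as follows. This is a rough illustration under stated assumptions, not the claimed hardware: weights and values are small non-negative integers used directly as row and column indices, "combining" result values is assumed to mean summation, and the function names are hypothetical.

```python
def lut_access_step(table, weight, first_values):
    """One lookup pass for a single weight across several input feature maps."""
    row_buffer = table[weight]                # load the LUT row for this weight
    intermediate_buffer = list(first_values)  # one value per input feature map
    # For each buffered value, read the matching column of the row buffer.
    read_buffer = [row_buffer[v] for v in intermediate_buffer]
    return read_buffer


def lut_accumulate(table, kernel, feature_maps):
    """Combine looked-up products per feature map (summation is an assumption)."""
    outputs = [0] * len(feature_maps)
    for i, weight in enumerate(kernel):
        values = [fm[i] for fm in feature_maps]  # i-th value of each feature map
        for m, result in enumerate(lut_access_step(table, weight, values)):
            outputs[m] += result                 # processing-element stand-in
    return outputs


# Hypothetical 4x4 table: table[w][v] = w * v for w, v in 0..3.
table = [[w * v for v in range(4)] for w in range(4)]
kernel = [1, 2, 3]
feature_maps = [[3, 0, 1], [2, 2, 2]]

outputs = lut_accumulate(table, kernel, feature_maps)
print(outputs)  # [6, 12]
```

Loading one row per weight and reusing it across all feature maps mirrors the claims' pairing of a single first weight with the first value of each of two or more input feature maps.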

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G06N3/045
Combinations of networks · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06F12/023
Free address space management · CPC title
G06F9/5038 · Primary
considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title
G06F3/061
Improving I/O performance · CPC title

Patent family

Related publications grouped by family.

View patent family 66815998

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11119677B2 cover?: A storage device and method of controlling a storage device are disclosed. The storage device includes a host, a logic die, and a high bandwidth memory stack including a memory die. A computation lookup table is stored on a memory array of the memory die. The host sends a command to perform an operation utilizing a kernel and a plurality of input feature maps, includes finding the product of a …
Who is the assignee on this patent?: Samsung Electronics Co Ltd