Memory with processing in memory architecture and operating method thereof
US-2020117597-A1 · Apr 16, 2020 · US
US11526287B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11526287-B2 |
| Application number | US-202016828170-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 24, 2020 |
| Priority date | Nov 1, 2019 |
| Publication date | Dec 13, 2022 |
| Grant date | Dec 13, 2022 |
Official abstract text for this publication.
A storage device is provided including a memory controller having a neural processing unit (NPU); a first nonvolatile memory (NVM) connected to the memory controller through a first channel; and a second NVM connected to the memory controller through a second channel. The first NVM stores first weight data for the NPU and the second NVM stores second weight data for the NPU. The memory controller is configured to determine one of the first and second channels that is less frequently accessed upon receiving an inference request from the neural processor, and access a corresponding one of the first weight data and the second weight data using the determined one channel.
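The channel choice described in the abstract can be illustrated with a minimal sketch. It assumes both NVMs hold identical weight data and that "less frequently accessed" is judged by a simple per-channel access counter; the `Channel` class, the counter, the tie-breaking rule, and the dot-product "inference" are all hypothetical stand-ins, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """One controller-to-NVM channel holding a copy of the weight data."""
    name: str
    weights: list[float]
    access_count: int = 0  # hypothetical per-channel access counter

    def read_weights(self) -> list[float]:
        # Every read over this channel counts as one access.
        self.access_count += 1
        return self.weights


def pick_less_accessed(first: Channel, second: Channel) -> Channel:
    """Return the channel with fewer recorded accesses (ties go to the first)."""
    return first if first.access_count <= second.access_count else second


def handle_inference(first: Channel, second: Channel, inputs: list[float]) -> float:
    """Fetch weights over the quieter channel, then compute a dot product."""
    channel = pick_less_accessed(first, second)
    weights = channel.read_weights()
    return sum(w * x for w, x in zip(weights, inputs))


# ch1 is assumed to be busy with host traffic, so ch2 should be chosen.
ch1 = Channel("ch1", [0.5, 2.0], access_count=7)
ch2 = Channel("ch2", [0.5, 2.0], access_count=3)
result = handle_inference(ch1, ch2, [2.0, 1.0])
```

In this sketch the inference request is served over `ch2`, leaving the busier `ch1` free for other traffic. A real controller could use a different load measure, such as queue depth.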
Opening claim text (preview).
What is claimed is:
1. A storage device comprising: a memory controller comprising a neural processing unit (NPU); a first nonvolatile memory (NVM) connected to the memory controller through a first channel, the first NVM storing first weight data for the NPU; a second NVM connected to the memory controller through a second channel, the second NVM storing second weight data for the NPU, wherein the memory controller is configured to determine one of the first and second channels that is less frequently accessed upon receiving an inference request, and access a corresponding one of the first weight data and the second weight data using the determined one channel in response to determining that the inference request requires accessing weight data for an operation, and wherein the NPU, performs the operation without using any weight data in response to determining that the inference request does not require accessing the weight data for the operation.
2. The storage device of claim 1, further comprising: a first memory manager connected to the first channel for accessing the first NVM; a second memory manager connected to the second channel for accessing the second NVM; and a scheduler configured to use the first memory manager to access the first weight data and the second memory manager to access the second weight data.
3. The storage device of claim 1, the first weight data is identical to the second weight data.
4. The storage device of claim 3, wherein the memory controller is configured to divide a storage area of the first NVM into a first region including first cells capable of storing a first number of bits, and divide a storage area of the second NVM into a second region including second cells capable of storing a second other number of bits, the first region storing the first weight data and the second region storing the second weight data.
5. The storage device of claim 1, wherein the memory controller is configured to degrade the first weight data into the second weight data having a lower precision.
6. The storage device of claim 1, wherein the memory controller is configured to compress the first weight data into the second weight data.
7. The storage device of claim 1, the first weight data is for a first machine learning model and the second weight data is for a second other machine learning model.
8. The storage device of claim 1, wherein each NVM comprises a plurality of cell strings arranged on a substrate in rows and columns, each of the cell strings including a ground selection transistor connected to a ground selection line GSL, a plurality of memory cells MC respectively connected to a plurality of word lines, and a plurality of string selection transistors respectively connected to a plurality of string selection lines.
9. The storage device of claim 1, wherein the memory controller further comprises a central processing unit (CPU) for processing an access request from a host device to access data of the NVM other than weight data.
10. The storage device of claim 9, wherein the memory controller further comprises: a first queue for storing first commands associated with the access request; and a second queue for storing second commands associated with the inference request, wherein the memory controller determines the one channel by comparing a number of the first commands associated with the first channel with a number of the first commands associated with the second channel.
11. The storage device of claim 1, further comprising a mapping table that maps a virtual address of the inference request to a first physical address within the first NVM and to a second physical address within the second NVM, and the memory controller converts the virtual address to one of the physical addresses associated with the selected channel to perform the access.
12. A memory package comprising: a package substrate; and the storage device of claim 1, wherein the memory controller is disposed on a first chip on the package substrate, wherein the first NVM is disposed on a second chip on the package substrate, and wherein the second NVM is disposed on a third chip on the package substrate.
13. A storage device comprising: a data bus; a neural processing unit (NPU) connected to the data bus; a first memory manager connected to the data bus and a first channel; a second memory manager connected to the data bus and a second channel; a first nonvolatile memory (NVM) device connected to the first channel, the first NVM storing first weight data for the NPU; a second NVM device connected to the second channel, the second NVM storing second weight data for the NPU; and a third memory manager connected to the NPU and a third NVM device storing third weight data for the NPU, wherein the NPU attempts to access the third weight data using the third memory manager upon receiving an inference request from a host device in response to determining that the inference request requires accessing the weight data for an operation, wherein the storage device attempts to access the first weight using the first memory manager or the second weight data using the second memory manager when the NPU is unable to access the third weight data through the third memory manager, and wherein the NPU performs the operation without using any weight data in response to determining that the inference request does not require accessing the weight data for the operation.
14. The storage device of claim 13, further comprising a scheduler configured to determine one of the first and second channels that is less frequently accessed, and access a corresponding one of the first weight data and the second weight data using the determined one channel.
15. The storage device of claim 14, wherein the first weight data, the second weight data, and the third weight data are all identical.
16. The storage device of claim 14, further comprising a central processing unit (CPU) connected to the data bus for processing an access request from the host device to access data of the NVM devices other than weight data.
17. The storage device of claim 15, further comprising: a first queue for storing first commands associated with the access request; and a second queue for storing second commands associated with the inference request, wherein the scheduler determines the one channel by comparing a number of the first commands associated with the first channel with a number of the first commands associated with the second channel.
18. A method for operating a storage device including a memory controller and a neural processor, the method comprising: receiving, by the memory controller, a request from a host device, wherein a first nonvolatile memory (NVM) is connected to the memory controller through a first channel and a second NVM is connected to the memory controller through a second channel; determining, by the memory controller, whether the request requires accessing weight data for the neural processor; processing, by the neural processor, the request when it is determined that the request does not require accessing the weight data; and selecting, by the memory controller, one of the channels that is less frequently accessed and accessing the weight data using the selected channel when it is determined that the request requires accessing the weight data.
19. The method of claim 18, wherein the accessing of the weight data comprises: reading the weight data from the NVM connected to the selected channel; and outputting the read weight da
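The request flow recited in claims 10, 11, and 18 can be sketched roughly as follows. Everything named here is hypothetical: the dictionary-based `MAPPING_TABLE` and `NVM`, the request fields, and the addresses are illustration only. The claims state that the counts of queued host commands per channel are compared; treating the channel with fewer queued commands as the one to select is an assumption of this sketch.

```python
from collections import deque
from typing import Optional

# Hypothetical mapping table (claim 11): one virtual address of the inference
# request maps to a physical address in each NVM.
MAPPING_TABLE = {0x10: {"ch1": 0xA000, "ch2": 0xB000}}

# Hypothetical NVM contents, keyed by channel and then by physical address.
NVM = {"ch1": {0xA000: [1.0, -1.0]}, "ch2": {0xB000: [1.0, -1.0]}}


def select_channel(host_queue: deque) -> str:
    """Claim 10: compare the number of queued host commands per channel."""
    pending_ch1 = sum(1 for cmd in host_queue if cmd["channel"] == "ch1")
    pending_ch2 = sum(1 for cmd in host_queue if cmd["channel"] == "ch2")
    # Assumption: fewer pending host commands means the channel is quieter.
    return "ch1" if pending_ch1 <= pending_ch2 else "ch2"


def process_request(request: dict, host_queue: deque) -> tuple[Optional[str], float]:
    """Claim 18 flow: skip the NVM entirely when no weight data is needed."""
    if not request["needs_weights"]:
        # The NPU runs the operation without weight data (here: a plain sum).
        return None, sum(request["inputs"])
    channel = select_channel(host_queue)
    # Claim 11: translate the virtual address for the selected channel.
    physical = MAPPING_TABLE[request["virtual_addr"]][channel]
    weights = NVM[channel][physical]
    return channel, sum(w * x for w, x in zip(weights, request["inputs"]))


# First queue: host access commands, two on ch1 and one on ch2.
host_queue = deque([
    {"channel": "ch1", "op": "read"},
    {"channel": "ch1", "op": "write"},
    {"channel": "ch2", "op": "read"},
])
with_weights = process_request(
    {"needs_weights": True, "virtual_addr": 0x10, "inputs": [3.0, 1.0]}, host_queue)
without_weights = process_request(
    {"needs_weights": False, "inputs": [3.0, 1.0]}, host_queue)
```

With two host commands pending on `ch1` and one on `ch2`, the weighted request is routed to `ch2` and its address resolves to `0xB000`. The request that needs no weights never touches either NVM.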
by changing the path, e.g. traffic rerouting, path reconfiguration · CPC title
Command handling arrangements, e.g. command buffers, queues, command scheduling · CPC title
Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP] · CPC title
Non-volatile semiconductor memory arrays · CPC title
using electronic means · CPC title