Memory access control device and control method of memory access
US-2018329841-A1 · Nov 15, 2018 · US
US2020402198A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020402198-A1 |
| Application number | US-201916449545-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jun 24, 2019 |
| Priority date | Jun 24, 2019 |
| Publication date | Dec 24, 2020 |
| Grant date | — |
Official abstract text for this publication.
A general-purpose graphics processor comprising a first set of compute units, a second set of compute units, and a memory coupled with the first set of compute units and the second set of compute units is described. The memory is configured to merge a first read request to an address block of the memory with a second read request to the address block of the memory to reduce a number of memory accesses to a memory bank associated with the address block. The graphics processor can also include a memory arbiter that can multicast merged reads to the compute units associated with the merged reads.
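The read merging described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design in the publication: the class names `PendingRead` and `MergingMemory`, the 64-byte `BLOCK_SIZE`, and the dictionary used as a scoreboard are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

# Assumed address-block size in bytes; the source does not specify one.
BLOCK_SIZE = 64


@dataclass
class PendingRead:
    """One scoreboard entry: an address block and the requesters waiting on it."""
    block: int
    requesters: list = field(default_factory=list)


class MergingMemory:
    """Toy shared memory that merges pending reads to the same address block."""

    def __init__(self):
        self.scoreboard = {}    # block index -> PendingRead (hypothetical layout)
        self.bank_accesses = 0  # bank accesses actually performed

    def read(self, requester_id, address):
        """Queue a read; a read to an already-pending block is merged into it."""
        block = address // BLOCK_SIZE
        entry = self.scoreboard.get(block)
        if entry is None:
            # No pending read for this block: allocate a new entry.
            entry = PendingRead(block)
            self.scoreboard[block] = entry
        # New or matching entry: record who is waiting for the data.
        entry.requesters.append(requester_id)

    def service(self):
        """Perform one bank access per pending block and report its requesters."""
        results = []
        for entry in self.scoreboard.values():
            self.bank_accesses += 1
            results.append((entry.block, list(entry.requesters)))
        self.scoreboard.clear()
        return results


mem = MergingMemory()
mem.read(0, 0x100)  # block 4
mem.read(5, 0x120)  # also block 4, merged with the first read
mem.read(2, 0x200)  # block 8, separate access
results = mem.service()
```

In this model three read requests cost two bank accesses, because the two requests to the same address block share one entry.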
Opening claim text (preview).
What is claimed is:

1. A general-purpose graphics processor comprising: a first set of compute units; a second set of compute units; and a memory coupled with the first set of compute units and the second set of compute units, wherein the memory is to merge a first read request to an address block of the memory with a second read request to the address block of the memory to reduce a number of memory accesses to a memory bank associated with the address block.
2. The general-purpose graphics processor as in claim 1, wherein the memory includes a scoreboard to store a first entry for the first read request and a second entry for the second read request and the memory is to merge the first read request with the second read request upon determination that the first entry and the second entry have a matching address block.
3. The general-purpose graphics processor as in claim 2, wherein the first entry for the first read request and the second entry for the second read request are to store an identifier for a compute unit or thread associated with the respective read request.
4. The general-purpose graphics processor as in claim 3, wherein the first read request is associated with a first thread of the first set of compute units and the second read request is associated with a second thread of the first set of compute units.
5. The general-purpose graphics processor as in claim 3, wherein the first read request is associated with a first thread of the first set of compute units and the second read request is associated with a second thread of the second set of compute units.
6. The general-purpose graphics processor as in claim 1, further comprising a memory arbiter coupled with the first set of compute units and the second set of compute units, wherein the memory is coupled with the first set of compute units and the second set of compute units via the memory arbiter.
7. The general-purpose graphics processor as in claim 6, wherein the memory arbiter includes a thread dispatch buffer and a thread dispatch bus to dispatch threads for execution to the first set of compute units and the second set of compute units.
8. The general-purpose graphics processor as in claim 6, wherein the memory arbiter is to multicast a read result for the first read request and the second read request.
9. The general-purpose graphics processor as in claim 8, wherein the memory is to send a bitmask and a read return message to the memory arbiter, the read return message includes the read result for the first read request and the second read request, and the bitmask indicates compute units or threads associated with the read return message.
10. The general-purpose graphics processor as in claim 9, wherein the memory arbiter is to multicast the read result for the first read request and the second read request based on the bitmask.
11. The general-purpose graphics processor as in claim 10, wherein the memory arbiter includes one or more buffers to store multicast data before the multicast data is received by the compute units, wherein data is maintained in the one or more buffers until each compute unit associated with a multicast receives the multicast data.
12. A method comprising: receiving a first read request at a memory, the memory shared between a first set of compute units and a second set of compute units of a general-purpose graphics processor; receiving a second read request at the memory while the first read request is pending; determining that an address block associated with the first read request matches the address block associated with the second read request; and merging the second read request with the first read request.
13. The method as in claim 12, wherein merging the second read request with the first read request includes performing one access to one or more memory banks of the memory for the first read request and the second read request.
14. The method as in claim 13, further comprising determining that an address block associated with the first read request differs from the address block associated with the second read request and performing separate accesses to the one or more memory banks for the first read request and the second read request.
15. The method as in claim 12, further comprising transmitting a read return for a merged request, the read return including a merge mask to indicate multiple recipients for the read return.
16. The method as in claim 15, wherein the merge mask is a bit mask that identifies multiple compute units or threads associated with the read return.
17. The method as in claim 16, further comprising: receiving the read return for the merged request at a memory arbiter coupled with the multiple compute units or threads associated with the read return; and multicasting the read return to the multiple compute units or threads associated with the read return.
18. A general-purpose graphics processing system comprising: a memory device; and a general-purpose graphics processor coupled with the memory device, wherein the general-purpose graphics processor includes: a first set of compute units; a second set of compute units; a memory coupled with the first set of compute units and the second set of compute units, wherein the memory is to merge a first read request to an address block of the memory with a second read request to the address block of the memory to reduce a number of memory accesses to a memory bank associated with the address block; and a memory arbiter coupled with the first set of compute units, the second set of compute units and the memory, wherein the memory arbiter is to receive a read return message from the memory, the read return message including read data for the first read request and the second read request and transmit the read return message to compute units within the first set of compute units or second set of compute units associated with the first read request and the second read request.
19. The general-purpose graphics processing system as in claim 18, wherein the memory arbiter is to transmit the read return message to the compute units within the first set of compute units or second set of compute units associated with the first read request and the second read request as a single multicast message.
20. The general-purpose graphics processing system as in claim 19, wherein the memory arbiter is to transmit the single multicast message to compute units identified by a bitmask received from the memory.
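The bitmask-driven multicast and the buffer retention recited in the claims can be sketched in software as follows. This is an illustrative model only: the one-bit-per-requester encoding and the names `build_merge_mask`, `recipients_from_mask`, and `MulticastBuffer` are assumptions introduced here, since the claims do not fix a mask layout or a buffer interface.

```python
def build_merge_mask(requester_ids):
    """Set bit i for every requester id i (hypothetical encoding)."""
    mask = 0
    for rid in requester_ids:
        mask |= 1 << rid
    return mask


def recipients_from_mask(mask):
    """Decode a merge mask into a list of requester ids in ascending order."""
    ids = []
    bit = 0
    while mask:
        if mask & 1:
            ids.append(bit)
        mask >>= 1
        bit += 1
    return ids


class MulticastBuffer:
    """Holds one read return until every recipient in the mask has taken it."""

    def __init__(self, data, mask):
        self.data = data
        self.pending = mask  # bits of recipients that have not received the data

    def deliver(self, requester_id):
        """Hand the data to one recipient and clear its pending bit."""
        bit = 1 << requester_id
        if not self.pending & bit:
            raise ValueError("requester is not a pending recipient")
        self.pending &= ~bit
        return self.data

    def can_release(self):
        """The buffer may be freed only when no recipient is still waiting."""
        return self.pending == 0


mask = build_merge_mask([1, 4])          # two requesters share one merged read
targets = recipients_from_mask(mask)     # ids the arbiter would multicast to
buf = MulticastBuffer(b"\xab", mask)
first = buf.deliver(1)
held_after_first = not buf.can_release() # data is kept while requester 4 waits
second = buf.deliver(4)
```

Tracking pending recipients in the same mask that selects them keeps the release check to a single comparison against zero, which is one plausible reason to pair a bitmask with the retention buffer.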
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title
Image or video data · CPC title
Power efficiency · CPC title
for multiprocessing or multitasking · CPC title
with a shared cache · CPC title