Non-blocking memory management unit
US-9652560-B1 · May 16, 2017 · US
US10733688B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10733688-B2 |
| Application number | US-201715716280-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 26, 2017 |
| Priority date | Sep 26, 2017 |
| Publication date | Aug 4, 2020 |
| Grant date | Aug 4, 2020 |
Official abstract text for this publication.
Embodiments are generally directed to area-efficient implementations of graphics instructions. An embodiment of an apparatus includes a graphics subsystem including one or more of a first logic for processing of memory read-return data for single-instruction-multiple-data instructions; a second logic for assembly of memory read-return data for media block instructions into shader register format; or a third logic to remap scatter or gather instructions to untyped surface instruction types. An embodiment of an apparatus includes a graphics subsystem including a translation lookaside buffer (TLB) and a data port controller to control the TLB, the data port controller including an incoming request pipeline to receive an incoming request with virtual address and generate a response, an incoming response pipeline to receive the response and generate a cache request, and an invalidation flow pipeline.
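The request-side flow described in the abstract and elaborated in claim 2 (TLB lookup, pending queue, oldest-first arbitration) can be sketched in software as follows. This is a minimal illustration, not the patented hardware: the names `RequestPipeline`, `PendingEntry`, `lookup`, and `arbitrate` are hypothetical, a single 4 KiB page size is assumed for simplicity, a Python list stands in for the virtual address CAM, and only missing requests are queued.

```python
from dataclasses import dataclass, field
from typing import Optional

PAGE_SHIFT = 12  # assumption: one fixed 4 KiB page size, for illustration only


@dataclass
class PendingEntry:
    seq: int              # arrival order, used for oldest-first arbitration
    vaddr: int            # virtual address awaiting translation
    issued: bool = False  # True once a cache request has been sent for it


@dataclass
class RequestPipeline:
    tlb: dict = field(default_factory=dict)      # virtual page -> physical page
    lru: list = field(default_factory=list)      # virtual pages, least recent first
    pending: list = field(default_factory=list)  # stand-in for the VA CAM queue
    next_seq: int = 0

    def lookup(self, vaddr: int) -> Optional[int]:
        """Return the physical address on a hit; queue the request on a miss."""
        vpage = vaddr >> PAGE_SHIFT
        if vpage in self.tlb:
            # Hit: refresh the LRU order and form the physical address.
            self.lru.remove(vpage)
            self.lru.append(vpage)
            offset = vaddr & ((1 << PAGE_SHIFT) - 1)
            return (self.tlb[vpage] << PAGE_SHIFT) | offset
        # Miss: park the request so later requests are not blocked behind it.
        self.pending.append(PendingEntry(self.next_seq, vaddr))
        self.next_seq += 1
        return None

    def arbitrate(self) -> Optional[PendingEntry]:
        """Select the oldest queued entry that still needs a cache request."""
        ready = [e for e in self.pending if not e.issued]
        if not ready:
            return None
        oldest = min(ready, key=lambda e: e.seq)
        oldest.issued = True
        return oldest
```

In this sketch a hit returns immediately while misses accumulate in `pending`, and each call to `arbitrate` issues at most one cache request, always for the oldest outstanding miss.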
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising a graphics subsystem, the graphics subsystem including: a graphics data port, the graphics data port including: a translation lookaside buffer (TLB); and a data port controller to control the TLB, the data port controller including: an incoming request pipeline to receive and process incoming translation requests, including a first incoming translation request with virtual address, the first incoming translation request being received from a context having a plurality of different page sizes including a first page size and a second page size, the first page size being larger than the second page size, and, upon a miss in the TLB for the first incoming translation request, generate a cache request, an incoming response pipeline to receive and process a response to the cache request, and an invalidation flow pipeline for translation requests; wherein the data port controller is to provide a fill entry allocation for the TLB upon receiving the response to the cache request, the response to the cache request including a certain page size of the plurality of page sizes, the page size of the response being unavailable to the graphics data port until the response to the cache request is received.
2. The apparatus of claim 1, wherein the incoming request pipeline includes: a first stage to receive incoming translation requests and, upon determining there is a hit in the TLB for an incoming translation request, determine a new least recently used (LRU) value and physical address (PA) array index; a second stage to write the incoming translation requests into a pending queue in virtual address content addressable memory (CAM); and a third stage to arbitrate pending queue entries in the pending queue that are ready to be scheduled and to select an oldest request that requires address translation for the cache request.
3. The apparatus of claim 1, wherein the incoming response pipeline includes: a first stage to receive the responses to cache requests, to determine if a tag for a received response to a cache request matches one or more pending queue entries, and, upon determining that the tag matches with any pending queue entry, to update the matching queue entry's physical address and status based on the received response; a second stage to update the pending queue entries, physical address, and status of each pending queue entry being updated based upon any match that is identified by the first stage; and a third stage to perform pending queue arbitration to select a cache request for processing.
4. The apparatus of claim 3, wherein the incoming response pipeline further includes: a fourth stage to read out a virtual address from one of the one or more matching entries from the first stage for use in a TLB matching operation, wherein, for a miss in the TLB, a victim entry is to be selected for eviction; and a fifth stage to perform virtual address and physical address array update based on the TLB matching operation, wherein, for a hit on the TLB, a least recently used (LRU) value is updated and, for a miss in the TLB, a physical address and tag is written into the victim entry.
5. The apparatus of claim 4, wherein: the first stage of the incoming response pipeline is to operate in a first clock cycle; the second and fourth stages of the incoming response pipeline are to operate in a second clock cycle; and the third and fifth stages of the incoming response pipeline are to operate in a third clock cycle.
6. The apparatus of claim 1, wherein the received response to the cache request is derived from data stored in a lower level TLB.
7. A non-transitory computer-readable storage medium having stored thereon data representing sequences of instructions that, when executed by a processor, cause the processor to perform operations comprising: receiving and processing incoming translation requests, including a first incoming translation request with virtual address, at a graphics data port, the first incoming translation request being received from a context having a plurality of different page sizes including a first page size and a second page size, the first page size being larger than the second page size, the graphics data port including a translation lookaside buffer (TLB) and a data port controller to control the TLB; upon a miss in the TLB for the first incoming translation request, generating a cache request for the translation request; receiving and processing a response to the cache request; providing a fill entry allocation for the TLB upon receiving the response to the cache request, the response to the cache request including a certain page size of the plurality of page sizes, the page size of the response being unavailable to the graphics data port until the response to the cache request is received; and performing an invalidation process for one or more translation requests.
8. The medium of claim 7, wherein receiving and processing the incoming translation requests includes: receiving the incoming translation requests, and, upon determining there is a hit in the TLB for an incoming translation request, determining a new least recently used (LRU) value and physical address (PA) array index; writing the incoming translation requests into a pending queue in virtual address content addressable memory (CAM); and arbitrating pending queue entries in the pending queue that are ready to be scheduled, and selecting an oldest request that requires address translation for the cache request.
9. The medium of claim 8, wherein receiving and processing responses to cache requests includes: receiving a response to a first cache request, determining if a tag for the received response matches one or more entries in the pending queue, and, upon determining that the tag matches with any pending queue entry, to update the matching entry's physical address and status based on the received response; updating the pending queue entries, the physical address and status of each pending queue entry being updated based upon any match that is identified with the tag for the response; and performing arbitration in the pending queue to select a cache request for processing.
10. The medium of claim 9, wherein receiving and processing responses to cache requests further includes: reading out a virtual address from one of the one or more matching entries for use in a TLB matching operation, wherein, for a miss in the TLB, a victim entry is to be selected for eviction; and performing virtual address and physical address array update based on the TLB matching operation, wherein, for a hit on the TLB, a least recently used (LRU) value is updated and, for a miss in the TLB, a physical address and tag is written into the victim entry.
11. A method comprising: receiving and processing incoming translation requests, including a first incoming translation request with virtual address, at a graphics data port, the first incoming translation request being received from a context having a plurality of different page sizes including a first page size and a second page size, the first page size being larger than the second page size, the data port including a translation lookaside buffer (TLB) and a data port controller to control the TLB; upon a miss in the TLB for the first incoming translation request, generating a cache request for the translation request; receiving and processing a response to the cache request; providing a fill entry allocation for the TLB upon receiving the response to the cache request, the response to the cache request including a certain page size of the plurality of page sizes, the page size of the response bei
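The response-side behavior recited in the claims (matching a response against pending entries, allocating the fill entry only once the response reveals the page size, and evicting a least recently used victim) might be modeled roughly as below. All names (`Response`, `FillTLB`, `complete_pending`) and the 2 MiB and 4 KiB page sizes used in the usage note are illustrative assumptions, not details taken from the patent; an `OrderedDict` stands in for the hardware LRU tracking.

```python
from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class Response:
    vbase: int      # virtual base address of the translated page
    pbase: int      # physical base address of that page
    page_size: int  # bytes; known only once the response arrives


class FillTLB:
    """Toy TLB whose entries are allocated at fill time, per page size."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # (vbase, page_size) -> pbase, ordered least recently used first
        self.entries = OrderedDict()

    def translate(self, vaddr: int):
        """Return the physical address on a hit, or None on a miss."""
        hit = next((k for k in self.entries if k[0] <= vaddr < k[0] + k[1]), None)
        if hit is None:
            return None
        self.entries.move_to_end(hit)  # mark as most recently used
        return self.entries[hit] + (vaddr - hit[0])

    def fill(self, resp: Response):
        """Install the response; return the evicted key, or None."""
        key = (resp.vbase, resp.page_size)
        if key in self.entries:
            self.entries.move_to_end(key)  # already present: refresh LRU only
            return None
        victim = None
        if len(self.entries) >= self.capacity:
            victim, _ = self.entries.popitem(last=False)  # evict the LRU entry
        self.entries[key] = resp.pbase
        return victim


def complete_pending(pending, resp: Response):
    """Split pending virtual addresses into (resolved, still waiting)."""
    end = resp.vbase + resp.page_size
    done = [(va, resp.pbase + (va - resp.vbase)) for va in pending if resp.vbase <= va < end]
    waiting = [va for va in pending if not (resp.vbase <= va < end)]
    return done, waiting
```

Under these assumptions, a single response for a 2 MiB page resolves every pending address that falls inside that page, which is one reason to defer entry allocation until the page size is known.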
with dedicated cache, e.g. instruction or stack · CPC title
Image or video data · CPC title
General purpose rendering architectures · CPC title
according to data descriptor, e.g. dynamic data typing · CPC title
Details of translation look-aside buffer [TLB] · CPC title