Unified cache for diverse memory traffic

US10705994B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10705994-B2
Application number: US-201715587213-A
Country: US
Kind code: B2
Filing date: May 4, 2017
Priority date: May 4, 2017
Publication date: Jul 7, 2020
Grant date: Jul 7, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.
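The routing described in the abstract can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the patented hardware design: the names `Transaction`, `UnifiedCacheModel`, `submit`, and `fill` are hypothetical, the tag store is reduced to a set of addresses, and texture-oriented transactions are left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Transaction:
    address: int
    targets_shared: bool  # True for shared-memory accesses


@dataclass
class UnifiedCacheModel:
    # One data memory holds both shared-memory and cached data;
    # keys are (space, address) so the two uses never collide.
    data: dict = field(default_factory=dict)
    tags: set = field(default_factory=set)          # cached addresses (toy tag store)
    miss_fifo: deque = field(default_factory=deque)  # transactions waiting on miss data

    def submit(self, txn):
        """Return the data for a transaction, or None if it was queued on a miss."""
        if txn.targets_shared:
            # Shared-memory traffic takes the direct pathway, no tag lookup.
            return self.data.get(("shared", txn.address))
        if txn.address in self.tags:
            # Cache hit: reroute to the direct pathway into the data memory.
            return self.data[("cache", txn.address)]
        # Cache miss: hold the transaction in a FIFO until miss data returns.
        self.miss_fifo.append(txn)
        return None

    def fill(self, address, value):
        """Model miss data arriving from external memory; drain the FIFO in order."""
        self.data[("cache", address)] = value
        self.tags.add(address)
        completed = []
        while self.miss_fifo and self.miss_fifo[0].address in self.tags:
            txn = self.miss_fifo.popleft()
            completed.append((txn, self.data[("cache", txn.address)]))
        return completed
```

In this model a shared-memory access never touches the tag set, while a cached access either returns immediately on a hit or waits in `miss_fifo` until `fill` supplies the data.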

First claim

Opening claim text (preview).

The claimed invention is:

1. A memory subsystem, comprising: a first memory; a first data pathway that services first-type memory transactions and provides direct access to the first memory; and a second data pathway that services second-type memory transactions of a first sub-type and second-type memory transactions of a second sub-type and includes: a first data sub-pathway that is coupled to the first data pathway and transports second-type memory transactions of the first sub-type to the first data pathway when a first condition occurs, and a second data sub-pathway that is coupled to the first memory and services second-type memory transactions of the second sub-type.

2. The memory subsystem of claim 1, wherein the first memory comprises: a first sub-bank coupled to the first data pathway and to the first data sub-pathway, wherein the first sub-bank comprises a scratchpad memory for servicing first-type memory transactions and a cache memory for servicing second-type memory transactions of the first sub-type; and a second sub-bank coupled to the second data sub-pathway, wherein the second sub-bank comprises a texture cache for servicing second-type memory transactions of the second sub-type.

3. The memory subsystem of claim 1, further comprising: a tag processing pipeline coupled to the second data pathway and including: a tag processor; a tag-to-data (T2D) first-in first-out (FIFO) buffer coupled to the second data pathway; and a tag store that includes a plurality of tags, wherein a first tag included in the plurality of tags indicates that a first data slot included in the first memory includes a first portion of data.

4. The memory subsystem of claim 3, wherein the tag processor determines that the first condition has occurred and, in response, directs a first memory transaction to the first data pathway, the first condition comprises the first portion of data being resident in the first memory, and the first memory transaction is configured to access first portion of data.

5. The memory subsystem of claim 3, wherein the tag processor determines that a second condition has occurred and, in response, pushes a first memory transaction onto the T2D FIFO buffer, the second condition comprises at least one of the first portion of data not being resident in the first memory and the first portion of data being in-flight to the first memory, and the first memory transaction is configured to access the first portion of data.

6. The memory subsystem of claim 3, wherein the tag processor processes a first memory transaction that is configured to access the first portion of data by: incrementing a first reference counter associated with the first memory transaction to prevent the first portion of data from being evicted from the first data slot; accessing the first portion of data to service the first memory transaction; decrementing the first reference counter.

7. The memory subsystem of claim 6, wherein the tag processor processes a second memory transaction that is configured to access a second portion of data by: evicting the first tag upon receiving the second memory transaction; decrementing the first reference counter to zero to allow the first portion of data to be evicted from the first data slot; evicting the first portion of data from the first data slot to allow the second portion of data to be stored in the first data slot.

8. The memory subsystem of claim 3, wherein the tag processing pipeline further includes: a commit FIFO that stores a first memory transaction that is configured to access the first portion of data; an evict FIFO that stores a first evict transaction associated with the first portion of data, wherein the tag processing pipeline prevents the first portion of data from being evicted by the first data slot by: dequeueing the first evict transaction from the evict FIFO, determining that a first enqueue pointer associated with the first evict transaction matches an index associated with the first memory transaction that is stored in the commit FIFO, and stalling execution of the first evict transaction.

9. The memory subsystem of claim 3, wherein the tag processing pipeline processes second-type memory transactions by: receiving a first memory transaction that is configured to access the first portion of data and is of the first sub-type; servicing the first memory transaction; setting a first rank associated with the first data slot to a first value corresponding to the first sub-type; receiving a second memory transaction that is configured to access a second portion of data included in a second data slot and is of the second sub-type; servicing the second memory transaction; and setting a second rank associated with the second data slot to a second value corresponding to the second sub-type, wherein the second value exceeds the first value and indicates that the second data slot has a higher priority for eviction than the first data slot.

10. The memory subsystem of claim 9, wherein the tag processing pipeline further processes second-type memory transactions by: receiving a third memory transaction targeting the second portion of data; servicing the third memory transaction; incrementing the first rank; setting the second rank to zero; and evicting the first portion of data from the first data slot based on the first rank exceeding the second rank.

11. The memory subsystem of claim 1, further comprising a write-back buffer that: receives a first subset of load data that traverses the first data sub-pathway and is associated with a first latency; receives a second subset of load data that traverses the second data sub-pathway and is associated with a second latency; receives a token associated with the second subset of load data, wherein the token indicates that all subsets of load data have been received; and upon receiving the token, writes the load data back to a processor.

12. A computer-implemented method for servicing memory transactions, the method comprising: receiving a first memory transaction; determining a type associated with the first memory transaction; when the first memory transaction has a first type, routing the first memory transaction across a first data pathway that provides direct access to a first memory; and when the first memory transaction has a second type: routing the first memory transaction across a second data pathway that includes a first data sub-pathway and a second data sub-pathway, determining a sub-type associated with the first memory transaction, and routing the first memory transaction across one of the first data sub-pathway and the second data sub-pathway based on the sub-type.

13. The computer-implemented method of claim 12, further comprising: routing the second-type memory transaction across the first data sub-pathway when the second-type memory transaction has the first sub-type, wherein the first data sub-pathway transports memory transactions associated with the second type and the first sub-type to the first data pathway when a first condition occurs, or routing the second-type memory transaction across the second data sub-pathway when the second-type memory transaction has the second sub-type, wherein the second data sub-pathway transports memory transactions associated with the second type and the second sub-type when a second condition occurs.

14. The computer-implemented method of claim 13, further comprising: determining that the first condition has occurred; and in response, directing the first memory transaction to the first data pathway, the second condition com
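
The reference-counter and rank mechanisms in claims 6, 9, and 10 can be pictured with a toy model. The sketch below is an illustration under assumptions rather than the claimed implementation: `Slot`, `RankedSlots`, `INITIAL_RANK`, and the specific rank values are hypothetical, and "every other slot ages by one on each access" is one plausible reading of the rank update in claim 10.

```python
from dataclasses import dataclass

# Hypothetical initial ranks; a higher rank means a higher eviction priority,
# so texture data (second sub-type) starts out more evictable than cache data.
INITIAL_RANK = {"cache": 1, "texture": 2}


@dataclass
class Slot:
    tag: int
    rank: int
    refcount: int = 0  # a nonzero count pins the slot against eviction


class RankedSlots:
    def __init__(self):
        self.slots = {}  # tag -> Slot

    def install(self, tag, sub_type):
        """Place data in a slot and set its rank from the transaction sub-type."""
        self.slots[tag] = Slot(tag, INITIAL_RANK[sub_type])

    def begin_access(self, tag):
        """Pin the slot, reset its rank to zero, and age every other slot."""
        for other in self.slots.values():
            if other.tag != tag:
                other.rank += 1
        slot = self.slots[tag]
        slot.rank = 0
        slot.refcount += 1

    def end_access(self, tag):
        """Unpin the slot once the transaction has been serviced."""
        self.slots[tag].refcount -= 1

    def pick_victim(self):
        """Return the tag of the unpinned slot with the highest rank, or None."""
        candidates = [s for s in self.slots.values() if s.refcount == 0]
        if not candidates:
            return None
        return max(candidates, key=lambda s: s.rank).tag
```

A slot with an in-progress access is never chosen as a victim, and re-accessing a texture slot resets its rank so that an older cache slot becomes the next eviction candidate.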

Assignees

Nvidia Corp

Inventors

Classifications

  • of parts of caches, e.g. directory or tag array · CPC title

  • of the least frequently used [LFU] type, e.g. with individual count value · CPC title

  • Details of cache specific to multiprocessor cache arrangements · CPC title

  • DMA · CPC title

  • using clearing, invalidating or resetting means · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10705994B2 cover?
A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not targ…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/0895. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 7, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).