Unified cache for diverse memory traffic

US11347668B2 · US · B2

Patent metadata
Publication number: US-11347668-B2
Application number: US-202016921795-A
Country: US
Kind code: B2
Filing date: Jul 6, 2020
Priority date: May 4, 2017
Publication date: May 31, 2022
Grant date: May 31, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not target shared memory, the unified cache subsystem includes a tag processing pipeline configured to identify cache hits and cache misses. When the tag processing pipeline identifies a cache hit for a given memory transaction, the transaction is rerouted to the direct pathway to data memory. When the tag processing pipeline identifies a cache miss for a given memory transaction, the transaction is pushed into a first-in first-out (FIFO) until miss data is returned from external memory. The tag processing pipeline is also configured to process texture-oriented memory transactions.
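The routing described in the abstract can be pictured with a small software model. The sketch below is illustrative only and is not taken from the patent: the names `Transaction`, `UnifiedCacheModel`, `route`, and `fill`, the 128-byte line size, and the choice to record a tag only when miss data returns are all assumptions made to keep the example short. It shows the three outcomes the abstract names: shared-memory transactions take the direct pathway, cache hits are rerouted to the direct pathway, and cache misses wait in a FIFO.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Transaction:
    address: int
    targets_shared: bool  # True when the transaction targets shared memory


@dataclass
class UnifiedCacheModel:
    """Toy model of the two pathways; all names and sizes are hypothetical."""
    line_size: int = 128                             # assumed line size in bytes
    tags: set = field(default_factory=set)           # tags of lines resident in data memory
    miss_fifo: deque = field(default_factory=deque)  # transactions waiting for miss data

    def route(self, txn: Transaction) -> str:
        # Shared-memory transactions bypass tag lookup entirely.
        if txn.targets_shared:
            return "direct"
        # Other transactions go through tag lookup.
        tag = txn.address // self.line_size
        if tag in self.tags:
            return "direct"  # hit: rerouted to the direct pathway
        self.miss_fifo.append(txn)  # miss: held until data returns
        return "fifo"

    def fill(self) -> Transaction:
        """Model miss data returning for the oldest pending transaction."""
        txn = self.miss_fifo.popleft()
        self.tags.add(txn.address // self.line_size)
        return txn


cache = UnifiedCacheModel()
print(cache.route(Transaction(0x40, targets_shared=True)))     # direct
print(cache.route(Transaction(0x1000, targets_shared=False)))  # fifo
cache.fill()
print(cache.route(Transaction(0x1010, targets_shared=False)))  # direct
```

The second access to address 0x1010 hits because it falls in the same assumed 128-byte line as 0x1000, which was filled after the first miss.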

First claim

Opening claim text (preview).

The claimed invention is:
1. A memory subsystem, comprising: a first memory; and address logic that routes memory transactions that target scratchpad memory along a first data pathway directly to the first memory, and routes memory transactions that do not target the scratchpad memory either along the first data pathway directly to the first memory or along a second data pathway to a second memory depending on whether data associated with the memory transactions that do not target the scratchpad memory is resident in the first memory.
2. The memory subsystem of claim 1, further comprising a third memory that stores tags, wherein the address logic determines whether the data associated with the memory transactions that do not target the scratchpad memory is resident in the first memory based on whether tags associated with the memory transactions that do not target the scratchpad memory are stored in the third memory.
3. The memory subsystem of claim 1, wherein, when the data associated with the memory transactions that do not target the scratchpad memory is not resident in the first memory, the address logic stores the memory transactions that do not target the scratchpad memory in a tag-to-data first-in first-out buffer until the data associated with the memory transactions that do not target the scratchpad memory is returned to the first memory.
4. The memory subsystem of claim 1, wherein the address logic routes texture-oriented memory transactions from a texture processing pipeline along the second pathway, and data is read from the first memory via a crossbar that is shared between at least one processing core and the texture processing pipeline.
5. The memory subsystem of claim 1, wherein the first memory comprises a plurality of sub-banks, and the address logic coalesces two or more memory transactions that target a same sub-bank.
6. The memory subsystem of claim 1, further comprising a processor that appends a token to a memory transaction that does not target the scratchpad memory when data associated with the memory transaction that does not target the scratchpad memory is not resident in the first memory.
7. The memory subsystem of claim 1, further comprising a data scrubber that maintains data slot availability in the first memory above a predefined threshold.
8. The memory subsystem of claim 7, wherein the data scrubber stores tags in a third memory to cause corresponding data slots in the first memory to be allocated, and evicts tags stored in the third memory to cause data stored in corresponding data slots in the first memory to be evicted.
9. The memory subsystem of claim 7, wherein the data scrubber increments an age counter associated with a class to cause data associated with the class that is stored in the first memory to be evicted when a request is received to access the data associated with the class.
10. A system, comprising: a streaming multiprocessor that generates memory transactions that target scratchpad memory and memory transactions that do not target the scratchpad memory; and a unified cache comprising a first memory, wherein the unified cache is coupled to the streaming multiprocessor and services the memory transactions that target the scratchpad memory and the memory transactions that do not target the scratchpad memory by performing the steps of: routing the memory transactions that target the scratchpad memory along a first data pathway directly to the first memory, and routing the memory transactions that do not target the scratchpad memory either along the first data pathway directly to the first memory or along a second data pathway to a second memory depending on whether data associated with the memory transactions that do not target the scratchpad memory is resident in the first memory.
11. The system of claim 10, wherein the unified cache further comprises a third memory that stores tags, and the unified cache determines whether the data associated with the memory transactions that do not target the scratchpad memory is resident in the first memory based on whether tags associated with the memory transactions that do not target the scratchpad memory are stored in the third memory.
12. The system of claim 10, wherein, when the data associated with the memory transactions that do not target the scratchpad memory is not resident in the first memory, the unified cache stores the memory transactions that do not target the scratchpad memory in a tag-to-data first-in first-out buffer until the data associated with the memory transactions that do not target the scratchpad memory is returned to the first memory.
13. The system of claim 10, wherein the unified cache routes texture-oriented memory transactions from a texture processing pipeline along the second pathway, and data is read from the first memory via a crossbar that is shared between at least one processing core and the texture processing pipeline.
14. The system of claim 10, wherein the first memory comprises a plurality of sub-banks, and the unified cache coalesces two or more memory transactions that target a same sub-bank.
15. The system of claim 10, wherein the unified cache further comprises a data scrubber that maintains data slot availability in the first memory above a predefined threshold.
16. The system of claim 15, wherein the data scrubber stores tags in a third memory to cause corresponding data slots in the first memory to be allocated, and evicts tags stored in the third memory to cause data stored in corresponding data slots in the first memory to be evicted.
17. The system of claim 15, wherein the data scrubber increments an age counter associated with a class to cause data associated with the class that is stored in the first memory to be evicted when a request is received to access the data associated with the class.
18. The system of claim 10, wherein the system comprises a computing device.
19. The system of claim 18, wherein the computing device comprises one of a desktop computer, a laptop computer, a handheld personal computer, a handheld device, a server, a workstation, a game console, or an embedded system.
20. A computer-implemented method for servicing memory transactions, the method comprising: receiving memory transactions that target scratchpad memory and memory transactions that do not target the scratchpad memory; routing the memory transactions that target the scratchpad memory along a first data pathway directly to a first memory; and routing the memory transactions that do not target the scratchpad memory either along the first data pathway directly to the first memory or along a second data pathway to a second memory depending on whether data associated with the memory transactions that do not target the scratchpad memory is resident in the first memory.
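The data scrubber in claims 7, 8, 15, and 16 keeps the number of free data slots above a predefined threshold by evicting tags, which in turn frees the corresponding data slots. The following is a minimal sketch of that idea, not the patented design. The class `ScrubberModel`, its method names, and the oldest-first eviction order are assumptions for illustration; the claims do not say which tags the scrubber evicts first.

```python
from collections import OrderedDict


class ScrubberModel:
    """Toy tag store plus scrubber; all names here are hypothetical."""

    def __init__(self, total_slots: int, threshold: int):
        self.threshold = threshold
        # Stand-in for the tag memory: tag -> data slot, in insertion order.
        self.tag_to_slot = OrderedDict()
        self.free_slots = list(range(total_slots))

    def allocate(self, tag):
        """Storing a tag allocates a corresponding data slot."""
        if tag in self.tag_to_slot:
            return self.tag_to_slot[tag]
        if not self.free_slots:
            self.evict_oldest()
        slot = self.free_slots.pop()
        self.tag_to_slot[tag] = slot
        return slot

    def evict_oldest(self):
        """Evicting a tag frees its data slot (oldest-first is an assumption)."""
        tag, slot = self.tag_to_slot.popitem(last=False)
        self.free_slots.append(slot)
        return tag

    def scrub(self):
        """Evict until the free-slot count is strictly above the threshold."""
        evicted = []
        while len(self.free_slots) <= self.threshold and self.tag_to_slot:
            evicted.append(self.evict_oldest())
        return evicted


model = ScrubberModel(total_slots=4, threshold=1)
for tag in ["a", "b", "c", "d"]:
    model.allocate(tag)
print(model.scrub())            # ['a', 'b']
print(len(model.free_slots))    # 2
```

With four slots and a threshold of one, the scrubber evicts two tags so that two slots are free, which is the smallest count strictly above the threshold.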

Assignees

Nvidia Corp

Inventors

Classifications

  • of parts of caches, e.g. directory or tag array

  • Read-write modes for single port memories, i.e. having either a random port or a serial port

  • Details of cache specific to multiprocessor cache arrangements

  • with a shared cache

  • of the least frequently used [LFU] type, e.g. with individual count value

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11347668B2 cover?
A unified cache subsystem includes a data memory configured as both a shared memory and a local cache memory. The unified cache subsystem processes different types of memory transactions using different data pathways. To process memory transactions that target shared memory, the unified cache subsystem includes a direct pathway to the data memory. To process memory transactions that do not targ…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F12/0895. Mapped technology areas include Physics.
When was this patent published?
Publication date May 31, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).