Coarse grain coherency

US10949945B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10949945-B2
Application number  US-202016872046-A
Country             US
Kind code           B2
Filing date         May 11, 2020
Priority date       Apr 9, 2017
Publication date    Mar 16, 2021
Grant date          Mar 16, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual address space shared with a separate general-purpose processor including a second cache memory that is coherent with the first cache memory.

First claim

Opening claim text (preview).

What is claimed is:

1. An electronic device comprising: a general-purpose processor including a first cache memory and a first coherency module; and a general-purpose graphics processor including a second cache memory and a second coherency module, wherein the first coherency module and the second coherency module enable heterogeneous coherency between the first cache memory and the second cache memory, the heterogeneous coherency enabled at multiple cache line granularity; and one or more memory controllers coupled with the general-purpose processor and the general-purpose graphics processor, the one or more memory controllers to enable communication with a memory module to store a superline directory table, wherein the superline directory table is to track ownership for a superline owned by the general-purpose processor and the general-purpose graphics processor, wherein the superline is a sub-page address region that spans multiple cache lines of the first cache memory and the second cache memory.

2. The electronic device as in claim 1, wherein data storage for the first cache memory is managed at cache line granularity, coherence for sub-page shared virtual memory allocations cached by the first cache memory, and the second cache memory are managed at superline granularity.

3. The electronic device as in claim 2, wherein the first cache memory is a level 3 cache memory.

4. The electronic device as in claim 3, wherein the second cache memory is a last level cache coupled with the general-purpose processor and the general-purpose graphics processor.

5. The electronic device as in claim 4, wherein the general-purpose graphics processor includes a superline ownership table to store a set of superlines owned by the general-purpose graphics processor, wherein the superline ownership table includes an entry for each superline in the set of superlines owned by the general-purpose graphics processor and each entry in the superline ownership table includes a superline tag and a coherency protocol status for the superline.

6. The electronic device as in claim 5, wherein the coherency protocol status is one of modified, exclusive, shared, or invalid and each entry in the superline ownership table additionally includes a valid bit for each cache line within the superline.

7. The electronic device as in claim 1, wherein the general-purpose graphics processor additionally includes a graphics processing compute block including multiple graphics multiprocessors.

8. The electronic device as in claim 7, wherein multiple graphics multiprocessors are to process a workload including graphics or compute operations.

9. The electronic device as in claim 8, wherein the workload is a heterogeneous workload including operations to be performed by the general-purpose graphics processing compute block and the general-purpose processor.

10. The electronic device as in claim 9, wherein the operations of the workload are to access a unified memory address space and the unified memory address space includes system memory and graphics processor memory.

11. The electronic device as in claim 10, wherein the general-purpose graphics processor is an add-in card connected to the general-purpose processor via a system bus and the add-in card includes the graphics processor memory.

12. A data processing system comprising: a general-purpose processor including a first cache memory and a first coherency module; a system bus coupled with the general-purpose processor; and a general-purpose graphics processor coupled with general-purpose processor via the system bus, the general-purpose graphics processor including a second cache memory and a second coherency module, wherein the first coherency module and the second coherency module enable heterogeneous coherency between the first cache memory and the second cache memory, the heterogeneous coherency enabled at multiple cache line granularity; and a memory module to store a superline directory table, wherein the superline directory table is to track ownership for a superline owned by the general-purpose processor and the general-purpose graphics processor, wherein the superline is a sub-page address region that spans multiple cache lines of the first cache memory and the second cache memory.

13. The data processing system as in claim 12, wherein data storage for the first cache memory is managed at cache line granularity, coherence for sub-page shared virtual memory allocations cached by the first cache memory, and the second cache memory are managed at superline granularity.

14. The data processing system as in claim 13, wherein the first cache memory is a level 3 cache memory.

15. The data processing system as in claim 14, wherein the second cache memory is a last level cache coupled with the general-purpose processor and the general-purpose graphics processor.

16. The data processing system as in claim 15, wherein the general-purpose graphics processor includes a superline ownership table to store a set of superlines owned by the general-purpose graphics processor, wherein the superline ownership table includes an entry for each superline in the set of superlines owned by the general-purpose graphics processor and each entry in the superline ownership table includes a superline tag and a coherency protocol status for the superline.

17. The data processing system as in claim 16, wherein the coherency protocol status is one of modified, exclusive, shared, or invalid and each entry in the superline ownership table additionally includes a valid bit for each cache line within the superline.

18. The data processing system as in claim 12, wherein the general-purpose graphics processor additionally includes a graphics processing compute block including multiple graphics multiprocessors.

19. The data processing system as in claim 18, wherein multiple graphics multiprocessors are to process a workload including graphics or compute operations.

20. The data processing system as in claim 19, wherein the workload is a heterogeneous workload including operations to be performed by the general-purpose graphics processing compute block and the general-purpose processor, wherein the operations of the workload are to access a unified memory address space including system memory and graphics processor memory.
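
Illustrative sketch (not part of the patent text). Claims 5, 6, 16, and 17 describe a superline ownership table whose entries hold a superline tag, a coherency protocol status (modified, exclusive, shared, or invalid), and one valid bit per cache line in the superline. The Python model below is a rough reading of that data structure only. The 64-byte cache line, the 16 lines per superline, and every class, function, and method name are assumptions chosen for illustration; the claims do not fix any sizes, and the superline directory table and the protocol transitions themselves are not modeled here.

```python
from dataclasses import dataclass, field
from enum import Enum

# Assumed sizes for illustration only; the claims do not specify them.
CACHE_LINE_BYTES = 64
LINES_PER_SUPERLINE = 16
# 1 KiB: spans multiple cache lines yet stays smaller than a typical 4 KiB page.
SUPERLINE_BYTES = CACHE_LINE_BYTES * LINES_PER_SUPERLINE


class State(Enum):
    """Coherency protocol status values named in claims 6 and 17."""
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def superline_tag(addr: int) -> int:
    """Identify the superline (sub-page region) that contains addr."""
    return addr // SUPERLINE_BYTES


def line_index(addr: int) -> int:
    """Position of addr's cache line inside its superline."""
    return (addr % SUPERLINE_BYTES) // CACHE_LINE_BYTES


@dataclass
class OwnershipEntry:
    """One table entry: tag, status, and a valid bit per cache line."""
    tag: int
    state: State = State.INVALID
    valid: list = field(default_factory=lambda: [False] * LINES_PER_SUPERLINE)


class SuperlineOwnershipTable:
    """Hypothetical model of the set of superlines owned by one agent."""

    def __init__(self):
        self.entries = {}  # maps superline tag -> OwnershipEntry

    def acquire(self, addr: int, state: State) -> OwnershipEntry:
        """Record ownership of the superline containing addr."""
        tag = superline_tag(addr)
        entry = self.entries.setdefault(tag, OwnershipEntry(tag))
        entry.state = state
        return entry

    def fill_line(self, addr: int) -> None:
        """Mark one cache line of an already owned superline as valid."""
        entry = self.entries[superline_tag(addr)]
        entry.valid[line_index(addr)] = True

    def is_cached(self, addr: int) -> bool:
        """True if the superline is owned, not invalid, and the line is valid."""
        entry = self.entries.get(superline_tag(addr))
        return (
            entry is not None
            and entry.state is not State.INVALID
            and entry.valid[line_index(addr)]
        )


table = SuperlineOwnershipTable()
table.acquire(0x1234, State.EXCLUSIVE)  # own the whole superline
table.fill_line(0x1234)                 # but only one line holds data so far
```

In this sketch ownership is tracked once per superline while data validity is tracked per cache line, which is one way to read the split in claims 2 and 13 between cache-line-granularity data storage and superline-granularity coherence.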

Assignees

Intel Corp

Inventors

Classifications

  • Details of virtual memory and virtual address translation · CPC title

  • Memory management · CPC title

  • Virtual address space management · CPC title

  • Coherency control relating to peripheral accessing, e.g. from DMA or I/O device · CPC title

  • In image processor or graphics adapter · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10949945B2 cover?
One embodiment provides for a general-purpose graphics processing device comprising a general-purpose graphics processing compute block to process a workload including graphics or compute operations, a first cache memory, and a coherency module enable the first cache memory to coherently cache data for the workload, the data stored in memory within a virtual address space, wherein the virtual a…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T1/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 16, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).