Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06F12/0808. Mapped technology areas include Physics.

When was this patent published?

Publication date: May 28, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Low power multi-core coherency

US10303603B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10303603-B2
Application number	US-201715621870-A
Country	US
Kind code	B2
Filing date	Jun 13, 2017
Priority date	Jun 13, 2017
Publication date	May 28, 2019
Grant date	May 28, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A special class of loads and stores access a user-defined memory region where coherency and memory orders are only enforced at the coherent point. Coherent memory requests, which are limited to user-defined memory region, are dispatched to the common memory ordering buffer. Non-coherent memory requests (e.g., all other memory requests) can be routed via non-coherent lower level caches to the shared last level cache. By assigning a private, non-overlapping, address spaces to each of the processor cores, the lower-level caches do not need to implement the logic necessary to maintain cache coherency. This can reduce power consumption and integrated circuit die area. This can also improve memory bandwidth and performance for applications with predominantly non-coherent memory accesses while still providing memory coherence for specific memory range(s)/applications that demand it.
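The routing split described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: `ToySystem`, `Store`, `issue`, `drain`, and the `[coherent_lo, coherent_hi)` bounds are hypothetical names chosen for illustration, and the model covers stores only.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    core: int    # id of the issuing core
    addr: int    # target address
    value: int   # data to write


@dataclass
class ToySystem:
    # Hypothetical bounds of the user-defined coherent region [lo, hi).
    coherent_lo: int
    coherent_hi: int
    shared_mob: list = field(default_factory=list)    # common memory ordering buffer
    lower_caches: dict = field(default_factory=dict)  # core id -> {addr: value}
    llc: dict = field(default_factory=dict)           # shared last level cache

    def is_coherent(self, addr: int) -> bool:
        return self.coherent_lo <= addr < self.coherent_hi

    def issue(self, store: Store) -> str:
        """Route one store and report which structure received it."""
        if self.is_coherent(store.addr):
            # Coherent request: bypasses the lower-level caches.
            self.shared_mob.append(store)
            return "shared_mob"
        # Non-coherent request: held in the issuing core's own cache for now.
        self.lower_caches.setdefault(store.core, {})[store.addr] = store.value
        return "lower_cache"

    def drain(self) -> None:
        """Push all buffered stores to the last level cache."""
        # Coherent stores reach the LLC in the order the shared buffer received them.
        for store in self.shared_mob:
            self.llc[store.addr] = store.value
        self.shared_mob.clear()
        # Assumes private, non-overlapping address spaces per core,
        # so the write-back order between cores does not matter.
        for cache in self.lower_caches.values():
            self.llc.update(cache)
            cache.clear()


demo = ToySystem(coherent_lo=0x1000, coherent_hi=0x2000)
routes = [
    demo.issue(Store(core=0, addr=0x1800, value=1)),  # coherent
    demo.issue(Store(core=1, addr=0x1800, value=2)),  # coherent, same address
    demo.issue(Store(core=0, addr=0x8000, value=7)),  # non-coherent
]
demo.drain()
```

In this model, ordering between cores is decided in only one place, the shared buffer. The per-core caches carry no coherency logic, which is the simplification the abstract credits with the power and die-area savings.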

First claim

Opening claim text (preview).

What is claimed is:

1. An integrated circuit, comprising: a plurality of processor cores that share a common last-level cache, the plurality of processor cores each including a non-coherent memory order buffer, a first processor core being a one of the plurality of processor cores; and, a shared memory order buffer directly coupled to each of the plurality of processor cores such that coherent store transactions sent by the plurality of processor cores are directly received at the shared memory order buffer without being processed by at least one lower-level cache; the common last-level cache to receive store transactions sent by the non-coherent memory order buffers of the plurality of processor cores, the common last-level cache to also receive store transactions, from the shared memory order buffer, that correspond to the coherent store transactions sent by the plurality of processor cores.

2. The integrated circuit of claim 1, wherein the store transactions sent by the non-coherent memory order buffers of the plurality of plurality of processor cores include store transactions that have been processed by the at least one lower-level cache before being sent to the last-level cache.

3. The integrated circuit of claim 1, wherein the coherent store transactions sent by the plurality of processor cores are to be sent directly to the shared memory order buffer based at least in part on addresses targeted by the coherent store transactions being within a configured address range.

4. The integrated circuit of claim 1, wherein the store transactions sent by the non-coherent memory order buffers are to be processed by the at least one lower-level cache before being sent to the last-level cache based at least in part on addresses targeted by the store transactions sent by the non-coherent memory order buffers being within a configured address range.

5. The integrated circuit of claim 1, wherein the coherent store transactions sent by the plurality of processor cores are to be sent directly to the shared memory order buffer based at least in part on addresses targeted by the coherent store transactions being within an address range specified by at least one register that is writable by the first processor core.

6. The integrated circuit of claim 3, wherein the configured address range corresponds to at least one memory page.

7. The integrated circuit of claim 4, wherein the configured address range corresponds to at least one memory page.

8. A method of operating a processing system, comprising: receiving, from a plurality of processor cores, a plurality of non-coherent store transactions at a common last-level cache, a first processor core being one of the plurality of processor cores; receiving, from the plurality of processor cores, a plurality of coherent store transactions directly at a shared memory order buffer directly coupled to each of the plurality of processor cores; issuing, by the first processor core and directly to the shared memory order buffer, at least a first coherent store transaction, the first coherent store transaction to be processed by the shared memory order buffer before being sent to the last-level cache and without being processed by at least one lower-level cache; issuing, by the first processor core, at least a first non-coherent store transaction, the first non-coherent store transaction to be processed by the at least one lower-level cache before being sent to the last-level cache; and, receiving, at the last-level cache, the non-coherent store transaction and data stored by the coherent store transaction.

9. The method of claim 8, wherein the first processor core issues the first coherent store transaction based on an address corresponding to the target of a store instruction being executed by the first processor core falling within a configured address range.

10. The method of claim 9, wherein the configured address range corresponds to at least one memory page.

11. The method of claim 10, wherein a page table entry associated with the at least one memory page includes an indicator that the first processor core is to issue the first coherent store transaction.

12. The method of claim 9, further comprising: receiving, from a register written by a one of the plurality of processors, an indicator that corresponds to at least one limit of the configured address range.

13. The method of claim 8, wherein the first processor core issues the first non-coherent store transaction based on an address corresponding to the target of a store instruction being executed by the first processor core falling within a configured address range.

14. The method of claim 13, wherein the configured address range corresponds to at least one memory page.

15. The method of claim 14, wherein a page table entry associated with the at least one memory page includes an indicator that the first processor core is to issue the first non-coherent store transaction.

16. The method of claim 11, further comprising: receiving, from a register written by a one of the plurality of processors, an indicator that corresponds to at least one limit of the configured address range.

17. A processing system, comprising: a plurality of processing cores each coupled to at least a respective first level cache; a last-level cache, separate from the first level caches, to receive a block of non-coherent store data from the first level caches; a shared memory order buffer, directly coupled to each of the plurality of processing cores and to the last-level cache, to receive a block of coherent store data from a first processing core of the plurality of processing cores without the block of coherent store data being processed by the first level caches.

18. The processing system of claim 17, wherein an address range determines whether the block of coherent store data is to be sent to the shared memory order buffer without being processed by the first level caches.

19. The processing system of claim 17, wherein an indicator in a page table entry determines whether the block of coherent store data is to be sent to the shared memory order buffer without being processed by the first level caches.

20. The processing system of claim 17, wherein an indicator in a page table entry determines whether the block of non-coherent store data is to be sent to the last-level cache without being processed by the shared memory order buffer.
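Claims 11, 15, 19, and 20 describe a page table entry that carries an indicator selecting the coherent or non-coherent path. A minimal sketch of that page-granular decision follows. The 4096-byte page size, the `PageTableEntry` fields, and the `route_store` helper are assumptions made for illustration and do not come from the patent text.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size; the claims do not specify one


@dataclass(frozen=True)
class PageTableEntry:
    frame: int      # physical frame number (hypothetical field)
    coherent: bool  # hypothetical indicator: issue stores to this page as coherent


def route_store(page_table: dict, vaddr: int) -> str:
    """Return the first structure a store to vaddr would be sent to."""
    entry = page_table[vaddr // PAGE_SIZE]  # KeyError models an unmapped page
    return "shared_mob" if entry.coherent else "lower_level_cache"


# Page 0 is ordinary (non-coherent); page 1 is marked coherent.
page_table = {
    0: PageTableEntry(frame=10, coherent=False),
    1: PageTableEntry(frame=11, coherent=True),
}
```

With this table, `route_store(page_table, 0x1000)` selects the shared memory order buffer, while any address in page 0 takes the lower-level cache path. The register-based limits in claims 5, 12, and 16 would instead replace the per-page lookup with a bounds comparison.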

Assignees

Microsoft Technology Licensing, LLC

Inventors

Lai Patrick P

Classifications

G06F12/0831
using a bus scheme, e.g. with bus monitoring or watching means · CPC title
G06F12/0808 · Primary
with cache invalidating means (G06F12/0815 takes precedence) · CPC title
G06F12/10
Address translation · CPC title
G06F12/084
with a shared cache · CPC title
G06F12/0815
Cache consistency protocols · CPC title

Patent family

Related publications grouped by family.

Patent family ID: 62621019
