What technology area does this patent fall under?

Primary CPC classification G06F12/0831. Mapped technology areas include Physics.

When was this patent published?

Publication date: Feb 7, 2019 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Management of coherent links and multi-level memory

US2019042425A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2019042425-A1
Application number	US-201815948569-A
Country	US
Kind code	A1
Filing date	Apr 9, 2018
Priority date	Apr 9, 2018
Publication date	Feb 7, 2019
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for managing multi-level memory and coherency using a unified page granular controller can simplify software programming of both file system handling for persistent memory and parallel programming of host and accelerator and enable better software utilization of host processors and accelerators. As part of the management techniques, a line granular controller cooperates with a page granular controller to support both fine grain and coarse grain coherency and maintain overall system inclusion property. In one example, a controller to manage coherency in a system includes a memory data structure and on-die tag cache to store state information to indicate locations of pages in a memory hierarchy and an ownership state for the pages, the ownership state indicating whether the pages are owned by a host processor, owned by an accelerator device, or shared by the host processor and the accelerator device. The controller can also include logic to, in response to a memory access request from the host processor or the accelerator to access a cacheline in a page in a state indicating ownership by a device other than the requesting device, cause the page to transition to a state in which the requesting device owns or shares the page.
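The ownership transition described in the abstract can be illustrated with a small sketch. It is not the patented controller: the `PageStateTable` class, the `Owner` enum, the 4 KiB page size, the `share` flag, and the rule that an untracked page goes to its first requester are all assumptions made for the example. It models only the page-granular state lookup and the transition on an access by a non-owning device.

```python
from enum import Enum


class Owner(Enum):
    HOST = "host"
    ACCELERATOR = "accelerator"
    SHARED = "shared"


class PageStateTable:
    """Hypothetical stand-in for the page-granular state storage."""

    def __init__(self, page_size=4096):
        self.page_size = page_size  # assumed page size, not taken from the patent
        self.state = {}             # page number -> Owner

    def access(self, requester, address, share=False):
        """Handle an access by `requester` (Owner.HOST or Owner.ACCELERATOR).

        Returns (page state after the access, whether a transition occurred).
        """
        page = address // self.page_size
        current = self.state.get(page)
        if current in (requester, Owner.SHARED):
            # The requester already owns or shares the page: allow the access.
            return current, False
        # The page is untracked (assumption: first requester takes it) or is
        # owned by the other device: transition it so the requester owns or
        # shares the page.
        new_state = Owner.SHARED if share and current is not None else requester
        self.state[page] = new_state
        return new_state, True


table = PageStateTable()
print(table.access(Owner.HOST, 0x1000))         # first touch of the page
print(table.access(Owner.ACCELERATOR, 0x1040))  # same page, other device
print(table.access(Owner.ACCELERATOR, 0x1080))  # requester is already the owner
```

A real controller would also flush or snoop cachelines during a transition and would keep this state in an on-die tag cache backed by a memory structure. The sketch omits both.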

First claim

Opening claim text (preview).

What is claimed is:
1. An apparatus to manage coherency in a system, the apparatus comprising: hardware storage to store information to indicate locations of pages in a memory hierarchy and an ownership state for the pages, the ownership state indicating whether the pages are owned by a host processor, owned by an accelerator device, or shared by the host processor and the accelerator device; and logic to: in response to a memory access request from the host processor or the accelerator to access a cacheline in a page in a state indicating ownership by a device other than the requesting device, cause the page to transition to a state in which the requesting device owns or shares the page.
2. The apparatus of claim 1, wherein the logic is to allow an access from a device to a page when the page is in a state indicating the device owns or shares the page.
3. The apparatus of claim 1, wherein: the hardware storage comprises on-die or on-package storage to store the information to indicate locations of pages in the memory hierarchy and the ownership state for the pages.
4. The apparatus of claim 1, wherein the logic is to: in response to a memory request from the host processor or the accelerator to access a cacheline in a page that resides in far memory or remote memory, cause the page to be allocated to a near memory cache.
5. The apparatus of claim 4, wherein the logic is to: in response to an access that hits a full set in the near memory cache, de-allocate a least recently used victim page from the near memory cache and write modified data of the victim page to the far memory or the remote memory.
6. The apparatus of claim 1, wherein: the memory hierarchy includes a near memory cache and a far memory; wherein the information indicates locations in the near memory cache for far memory pages; and wherein the far memory is to store pages owned by the host processor, pages owned by the accelerator device, and shared memory pages.
7. The apparatus of claim 1, wherein: the memory hierarchy includes byte-addressable persistent memory; and wherein the persistent memory is to store pages owned by the host processor, pages owned by the accelerator device, and shared memory pages.
8. The apparatus of claim 3, wherein: the near memory cache comprises volatile memory; and wherein the far memory comprises non-volatile byte addressable storage.
9. The apparatus of claim 6, wherein: the logic is to cause the information to be stored to the hardware storage, to a structure in near memory, and to a structure in the far memory; wherein the hardware storage is to store the information for recently accessed pages, the structure in the near memory is to store information for pages allocated to the near memory cache, and the structure in the far memory is to store information for all memory pages; and wherein the information to be stored in the hardware storage, the structure in the near memory, and the structure in the far memory is to indicate locations in the near memory cache for far memory pages and the ownership state.
10. The apparatus of claim 6, wherein: the memory hierarchy includes a memory coupled with the accelerator device; wherein the state information indicates locations for pages stored in the memory coupled with the accelerator device; and wherein the memory coupled with the accelerator device is to store pages owned by the host processor, pages owned by the accelerator device, and shared memory pages.
11. The apparatus of claim 1, wherein: the state information is to further indicate whether copies of cachelines of the page is to be in one or more of: a host processor-side cache, a near memory cache, a filtered portion of an accelerator-side cache that is tracked in a host processor-side snoop filter, and a non-filtered portion of an accelerator-side cache that is not tracked in the host processor-side snoop filter.
12. The apparatus of claim 1, wherein: the hardware storage is to store one or more bits to indicate whether the page is mapped to a domain or shared by multiple domains; and wherein domains include: a first domain to indicate a page is owned by the host processor and a second domain to indicate a page is owned by the accelerator device.
13. The apparatus of claim 12, wherein the system includes multiple accelerator devices, and wherein the domains include domains for groups of accelerator devices or a single domain for the multiple accelerator devices.
14. The apparatus of claim 12, wherein the system includes multiple host processors, and wherein the domains include domains for groups of host processors or a single domain for the multiple host processors.
15. The apparatus of claim 12, wherein the logic to cause a page to transition to another state is to: update the state information for the page in the hardware storage; and cause a cache flush of any cachelines in the page having copies in a cache that is not mapped to the domain being transitioned to.
16. The apparatus of claim 15, wherein the logic to cause a page to transition to another state is to: update the information to indicate location and ownership state in a structure stored in memory.
17. The apparatus of claim 1, wherein the logic is to: receive a snoop filter miss to access a cacheline in a page; and in response to receipt of the snoop filter miss, determine a state of the page based on the stored state information.
18. The apparatus of claim 15, wherein the logic is to: in response to transition of the page to a state indicating ownership by the host processor or a shared state, cause one or more cachelines in the page to be allocated in a host processor-side snoop filter; and in response to transition of the page to a state indicating ownership by the accelerator device, cause cachelines in the page to not be allocated in the host processor-side snoop filter.
19. The apparatus of claim 1, wherein the logic is to: in response to detection of concurrent memory access requests from both the host processor and the accelerator to access cachelines in a same page, cause the page to transition to a state in which the host processor and the accelerator share the page.
20. The apparatus of claim 19, wherein the logic is to: in response to the detection of concurrent memory access requests to access cachelines in the same page, store information indicating a conflict for the page.
21. The apparatus of claim 20, wherein the logic is to: store the information indicating the conflict for the page comprises allocating the page in a translation lookaside buffer (TLB) or FIFO (first in first out) of recent page conflicts; and in response to eviction of the page from the TLB or FIFO, cause a transition back to the page's pre-conflict state or other pre-defined conflict exit state.
22. The apparatus of claim 21, wherein the logic is to: in response to a determination that the page is in the TLB or FIFO, determine the page is in a shared state; and in response to determination that the page is not in the TLB or FIFO, determine the state of the page based on the stored state information for the page.
23. The apparatus of claim 21, wherein the logic is to: de-allocate a page from the TLB or FIFO in response to detection of one or more conditions including: detection that the page is evicted from a near memory cache, and for a TLB, a determination
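The near memory behavior in claims 4 and 5 (allocate a far-memory page on access, evict the least recently used page from a full set, and write back the victim's modified data) can be sketched as a toy model. The `NearMemoryCache` class, its set and way counts, the modulo set-index function, and the `written_back` log are hypothetical illustration choices and do not come from the patent.

```python
from collections import OrderedDict


class NearMemoryCache:
    """Toy set-associative page cache in front of far memory."""

    def __init__(self, num_sets=2, ways=2):
        # num_sets and ways are illustrative values, not taken from the patent.
        self.num_sets = num_sets
        self.ways = ways
        # One OrderedDict per set: page number -> dirty flag, oldest entry first.
        self.sets = [OrderedDict() for _ in range(num_sets)]
        # Victim pages whose modified data was written back to far memory.
        self.written_back = []

    def access(self, page, write=False):
        """Access `page`; return the evicted victim page, or None."""
        cache_set = self.sets[page % self.num_sets]  # assumed index function
        victim = None
        if page in cache_set:
            cache_set.move_to_end(page)  # hit: mark as most recently used
        else:
            if len(cache_set) >= self.ways:  # miss on a full set
                victim, dirty = cache_set.popitem(last=False)  # LRU victim
                if dirty:
                    self.written_back.append(victim)
            cache_set[page] = False  # allocate the far-memory page as clean
        if write:
            cache_set[page] = True  # the page now holds modified data
        return victim


cache = NearMemoryCache(num_sets=1, ways=2)
cache.access(10, write=True)  # allocate page 10 and modify it
cache.access(11)              # allocate page 11; the set is now full
cache.access(10)              # hit: page 11 becomes least recently used
print(cache.access(12))       # evicts a clean victim, no write-back
print(cache.access(13))       # evicts the modified page, which is written back
print(cache.written_back)
```

An `OrderedDict` per set keeps the recency order explicit. Hardware would more likely use per-set age bits or a similar approximation.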

Assignees

Intel Corp

Inventors

Shifer Eran

Classifications

G06F12/0811
with multilevel cache hierarchies · CPC title
G06F12/0831 · Primary
using a bus scheme, e.g. with bus monitoring or watching means · CPC title
G06F12/1009
using page tables, e.g. page table structures · CPC title
G06F12/123
with age lists, e.g. queue, most recently used [MRU] list or least recently used [LRU] list · CPC title
G06F12/1027
using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] · CPC title

Patent family

Related publications grouped by family.

View patent family 65229664

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019042425A1 cover?: Techniques for managing multi-level memory and coherency using a unified page granular controller can simplify software programming of both file system handling for persistent memory and parallel programming of host and accelerator and enable better software utilization of host processors and accelerators. As part of the management techniques, a line granular controller cooperates with a page g…
Who is the assignee on this patent?: Intel Corp