What technology area does this patent fall under?

Primary CPC classification G06F9/3834. Mapped technology areas include Physics.

When was this patent published?

Publication date: April 20, 2023 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Cache coherence validation using delayed fulfillment of L2 requests

US2023122466A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2023122466-A1
Application number	US-202117506122-A
Country	US
Kind code	A1
Filing date	Oct 20, 2021
Priority date	Oct 20, 2021
Publication date	Apr 20, 2023
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems for validating cache coherence in a data processing system are described. A processing element may detect a load instruction requesting the processing element to transfer data from a global memory location to a local memory location. The processing element may apply, in response to detecting the load instruction requesting the processing element to transfer data from the global memory location to the local memory location, a delay to the transfer of the data from the global memory location to the local memory location. The processing element may execute the load instruction and transferring the data from the global memory location to the local memory location with the applied delay. The processing element may validate, in response to executing the load instruction and transferring the data with the applied delay, a cache coherence of the data processing system.
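The four steps in the abstract (detect, delay, transfer, validate) can be illustrated with a small toy model. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `delayed_load`, the cycle-based timing, the `remote_writes` list, and the `buggy` flag (a stand-in for a coherence defect) are all hypothetical and do not appear in the publication.

```python
def delayed_load(global_mem, local_mem, addr, delay, remote_writes, buggy=False):
    """Toy model of one delayed global-to-local transfer (hypothetical).

    remote_writes: list of (cycle, value) writes made by another processing element.
    buggy: if True, the model ignores writes that land during the delay window,
           standing in for a coherence defect. This flag is an assumption.
    Returns True when the validation check passes.
    """
    # Step 1: the load is detected; the value visible at request time is captured.
    captured = global_mem[addr]
    # Step 2: the delay is applied; remote writes inside the window take effect.
    for cycle, value in sorted(remote_writes):
        if cycle < delay:
            global_mem[addr] = value
    # Step 3: the transfer completes after the delay.
    local_mem[addr] = captured if buggy else global_mem[addr]
    # Step 4: validation, modeled here as "local copy matches global copy".
    return local_mem[addr] == global_mem[addr]


writes = [(2, 9)]  # another element writes 9 at cycle 2
print(delayed_load({0x40: 1}, {}, 0x40, 0, writes, buggy=True))   # True: no delay, defect stays hidden
print(delayed_load({0x40: 1}, {}, 0x40, 5, writes, buggy=True))   # False: delay exposes the stale copy
print(delayed_load({0x40: 1}, {}, 0x40, 5, writes, buggy=False))  # True: coherent model passes
```

In this model the delay matters only because it lets the remote write land before the transfer completes, which is the effect the abstract relies on to exercise the coherence logic.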

First claim

Opening claim text (preview).

What is claimed is:
1. A method for validating cache coherence in a data processing system, the method comprising: detecting a load instruction requesting a processing element to transfer data from a global memory location to a local memory location, the processing element being among a plurality of processing elements of the data processing system; applying, in response to detecting the load instruction requesting the processing element to transfer data from the global memory location to the local memory location, a delay to the transfer of the data from the global memory location to the local memory location; executing the load instruction and transferring the data from the global memory location to the local memory location with the applied delay; and validating, in response to executing the load instruction and transferring the data with the applied delay, a cache coherence of the data processing system.
2. The method of claim 1, wherein: the global memory location is a memory location in a level two (L2) cache connected to the plurality of processing elements in the data processing system; and the local memory location is a memory location in a level one (L1) cache of the processing element.
3. The method of claim 1, wherein applying the delay to the transfer of data increases a window of opportunity for another processing element to make changes relating to the global memory location, wherein the changes relating to the global memory location creates an out-of-order hazard for the data processing system.
4. The method of claim 1, wherein the load instruction is a first load instruction, and applying the delay causes the execution of the first load instruction to be completed later than an execution of a second load instruction, the second load instruction being younger than the first load instruction, and the second load instruction has a target that overlaps with a target of the first load instruction.
5. The method of claim 4, wherein validating the cache coherence of the data processing system comprises: in response to completing the execution of the first load instruction, detecting whether the second load instruction is being re-executed; in response to the second load instruction being re-executed, determining the cache coherence of the data processing system is successful; and in response to the second load instruction not being re-executed, determining the cache coherence of the data processing system failed.
6. The method of claim 1, wherein the global memory location is a cache line assigned to undergo the delay.
7. The method of claim 1, wherein the delay is based on a random number.
8. The method of claim 1, wherein the delay is based on a distance between the processing element and other processing elements in the data processing system.
9. A computing system comprising: a first processing element; a second processing element; an interconnect connected to the first processing element and the second processing element; the first processing element being configured to: detect a load instruction requesting a processing element to transfer data from a global memory location to a local memory location, the global memory location being accessible by the first processing element and the second processing element, and the local memory location being accessible by the first processing element; apply, in response to detecting the load instruction requesting the first processing element to transfer data from the global memory location to the local memory location, a delay to the transfer of the data from the global memory location to the local memory location; execute the load instruction and transferring the data from the global memory location to the local memory location with the applied delay; and validate, in response to the execution of the load instruction and transferring the data with the applied delay, a cache coherence of the computing system.
10. The computing system of claim 9, further comprising a level two (L2) cache connected to the first processing element and the second processing element, wherein: the global memory location is a memory location in the L2 cache; and the local memory location is a memory location in a level one (L1) cache of the first processing element.
11. The computing system of claim 9, wherein the application of the delay to the transfer of data increases a window of opportunity for the second processing element to make changes relating to the global memory location, the changes relating to the global memory location creates an out-of-order hazard for the data processing system.
12. The computing system of claim 9, wherein the load instruction is a first load instruction, and the application of the delay causes the execution of the first load instruction to be completed later than an execution of a second load instruction, the second load instruction being younger than the first load instruction, and the second load instruction has a target that overlaps with a target of the first load instruction.
13. The computing system of claim 12, wherein the first processing element is configured to: in response to completing the execution of the first load instruction, detect whether the second load instruction is being re-executed; in response to the second load instruction being re-executed, determine the cache coherence of the data processing system is successful; and in response to the second load instruction not being re-executed, determine the cache coherence of the data processing system failed.
14. The computing system of claim 9, wherein the global memory location is a cache line assigned to undergo the delay.
15. The computing system of claim 9, wherein the delay is based on a random number.
16. The computing system of claim 9, wherein the delay is based on a distance between the first processing element and other processing elements in the computing system.
17. A processing element comprising: a processor pipeline comprising one or more load store units (LSUs) configured to execute load and store instructions, the one or more LSUs being configured to: detect a load instruction requesting the processing element to transfer data from a global memory location to a local memory location, the processing element being among a plurality of processing elements of the data processing system; apply, in response to detecting the load instruction requesting the processing element to transfer data from the global memory location to the local memory location, a delay to the transfer of the data from the global memory location to the local memory location; execute the load instruction and transferring the data from the global memory location to the local memory location with the applied delay; and validate, in response to executing the load instruction and transferring the data with the applied delay, a cache coherence of a data processing system including the processing element.
18. The processing element of claim 17, wherein: the global memory location is a memory location in a level two (L2) cache connected to the plurality of processing elements in the data processing system; and the local memory location is a memory location in a level one (L1) cache of the processing element.
19. The processing element of claim 17, wherein the application of the delay to the transfer of data increases a window of opportunity for another processing element to make changes relating to the global memory locat
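The delay policies of claims 7 and 8 and the re-execution check of claims 4 and 5 can be sketched as follows. This is an illustrative sketch only: the function names, the cycle counts, the linear distance-to-delay mapping, and the extra "not exercised" outcome are assumptions introduced here, since the claims say only that the delay is "based on" a random number or a distance and do not say what happens when the younger load does not finish first.

```python
import random


def choose_delay(policy, rng=None, distance=0, max_cycles=16, cycles_per_hop=2):
    """Pick a delay in cycles (all numeric parameters are hypothetical)."""
    if policy == "random":        # claim 7: delay based on a random number
        if rng is None:
            rng = random.Random()
        return rng.randint(1, max_cycles)
    if policy == "distance":      # claim 8: delay based on distance between elements
        return distance * cycles_per_hop  # linear mapping is an assumption
    raise ValueError(f"unknown delay policy: {policy!r}")


def targets_overlap(start_a, size_a, start_b, size_b):
    """True when two byte ranges share at least one address (claim 4 precondition)."""
    return start_a < start_b + size_b and start_b < start_a + size_a


def check_reexecution(older_done, younger_done, younger_reexecuted):
    """Claim 5 check: a younger load that finished early must be re-executed.

    older_done / younger_done: completion cycles of the two loads.
    Returns "pass", "fail", or "not exercised" (the last label is an assumption
    for runs where the delay did not reorder the loads).
    """
    if younger_done >= older_done:
        return "not exercised"
    return "pass" if younger_reexecuted else "fail"


delay = choose_delay("distance", distance=3)       # 6 cycles with the defaults above
if targets_overlap(0x100, 8, 0x104, 8):            # the two loads touch shared bytes
    print(check_reexecution(older_done=10 + delay, younger_done=12,
                            younger_reexecuted=True))  # prints "pass"
```

Passing a seeded `random.Random` instance to `choose_delay` keeps a random-delay run reproducible, which is convenient when a failing interleaving needs to be replayed.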

Assignees

IBM

Inventors

Classifications

G06F9/3851
from multiple instruction streams, e.g. multistreaming
G06F9/3836
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
G06F9/3834 (Primary)
Maintaining memory consistency
G06F9/3838
Dependency mechanisms, e.g. register scoreboarding
G06F12/0811
with multilevel cache hierarchies

Patent family

Related publications grouped by family.

View patent family 85981195

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023122466A1 cover?: Methods and systems for validating cache coherence in a data processing system are described. A processing element may detect a load instruction requesting the processing element to transfer data from a global memory location to a local memory location. The processing element may apply, in response to detecting the load instruction requesting the processing element to transfer data from the glo…
Who is the assignee on this patent?: IBM

Related patents

Exposing and reproducing software race conditions

Streaming stress testing of cache memory

Efficient validation/verification of coherency and snoop filtering mechanisms in computing systems

Method to efficiently trigger concurrency bugs based on expected frequencies of execution interleavings

Detecting missing write to cache/memory operations
