Coordinated thread criticality-aware memory scheduling

US9921839B1 · US · B1

Patent metadata
Field: Value
Publication number: US-9921839-B1
Application number: US-201615275066-A
Country: US
Kind code: B1
Filing date: Sep 23, 2016
Priority date: Sep 23, 2016
Publication date: Mar 20, 2018
Grant date: Mar 20, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A multi-core processor includes a plurality of cores to execute a plurality of threads and to monitor metrics for each of the plurality of threads during an interval, the metrics including stall cycle values, prefetches of a first type, and prefetches of a second type. The multi-core processor further includes criticality-aware thread prioritization (CATP) logic to compute a stall fraction for each of the plurality of threads during the interval using the stall cycle values, identify a thread with a highest stall fraction of the plurality of threads, determine the highest stall fraction is greater than a stall threshold, prioritize demand requests of the identified thread, compute a prefetch accuracy of the identified thread during the interval using the prefetches of the first type and the prefetches of the second type, determine the prefetch accuracy is greater than a prefetch threshold, and prioritize prefetch requests of the identified thread.
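The per-interval decision described in the abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented hardware logic: the names `ThreadMetrics`, `Decision`, and `catp_decide` are hypothetical, the stall fraction is assumed to be a thread's stall cycles divided by the sum of stall cycles over all threads (one reading of dependent claim 7), and the prefetch accuracy follows the ratio given in dependent claim 6.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ThreadMetrics:
    """Per-thread counters gathered during one interval (hypothetical names)."""
    stall_cycles: int
    used_prefetches: int      # "first type": prefetched data later used by a demand request
    unused_prefetches: int    # "second type": prefetched data evicted without being used


@dataclass
class Decision:
    thread_id: Optional[int]  # identified thread, or None if nothing is prioritized
    prioritize_demand: bool
    prioritize_prefetch: bool


def catp_decide(metrics: Dict[int, ThreadMetrics],
                stall_threshold: float,
                prefetch_threshold: float) -> Decision:
    """Model the decision sequence from the abstract for a single interval."""
    total_stalls = sum(m.stall_cycles for m in metrics.values())
    if total_stalls == 0:
        # Assumption: with no stalls at all, no thread is treated as critical.
        return Decision(None, False, False)

    # Stall fraction per thread (assumed denominator: stalls summed over all threads).
    fractions = {tid: m.stall_cycles / total_stalls for tid, m in metrics.items()}
    critical = max(fractions, key=fractions.get)

    # Only prioritize when the highest stall fraction exceeds the stall threshold.
    if fractions[critical] <= stall_threshold:
        return Decision(None, False, False)

    # Prefetch accuracy = used / (used + unused), evaluated for the identified thread only.
    m = metrics[critical]
    tracked = m.used_prefetches + m.unused_prefetches
    accuracy = m.used_prefetches / tracked if tracked else 0.0

    return Decision(critical, True, accuracy > prefetch_threshold)
```

For example, with three threads stalling for 100, 700, and 200 cycles and a stall threshold of 0.5, the second thread (stall fraction 0.7) is identified; its prefetches are prioritized only if its accuracy also exceeds the prefetch threshold. The threshold values here are arbitrary and are not taken from the patent.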

First claim

Opening claim text (preview).

What is claimed is:

1. A multi-core processor comprising: a plurality of cores to execute a plurality of threads and to monitor metrics for each of the plurality of threads during an interval, the metrics comprising stall cycle values, a first number of prefetches of a first type, and a second number of prefetches of a second type; and criticality-aware thread prioritization (CATP) logic to: compute a stall fraction for each of the plurality of threads during the interval using the stall cycle values; identify a thread from the plurality of threads with a highest stall fraction of the plurality of threads; determine the highest stall fraction is greater than a stall threshold; prioritize demand requests of the identified thread; compute a prefetch accuracy of the identified thread during the interval using the first number of prefetches of the first type and the second number of prefetches of the second type; determine the prefetch accuracy is greater than a prefetch threshold; and prioritize prefetch requests of the identified thread.

2. The multi-core processor of claim 1, wherein the CATP logic comprises: first logic block to compute the stall fraction; second logic block to compute the prefetch accuracy; third logic block to prioritize the demand requests of the identified thread; and fourth logic block to prioritize the prefetch requests of the identified thread.

3. The multi-core processor of claim 2, wherein the first logic block and the second logic block reside in each of the plurality of cores.

4. The multi-core processor of claim 2 further comprising a memory controller, wherein the first logic block and the second logic block reside in each of the plurality of cores, wherein the third logic block and the fourth logic block reside in the memory controller.

5. The multi-core processor of claim 2 further comprising a memory controller, wherein the first logic block, the second logic block, the third logic block, and the fourth logic block reside in the memory controller.

6. The multi-core processor of claim 1, wherein: the prefetch accuracy is a ratio of the first number of prefetches of the first type to a sum of the first number of prefetches of the first type and the second number of prefetches of the second type; prefetches of the first type is when corresponding data was brought into an L2 cache from main memory and the corresponding data was used by a subsequent demand request; and prefetches of the second type is when corresponding data was brought into the L2 cache from the main memory and the corresponding data was evicted without being used.

7. The multi-core processor of claim 1, wherein the stall fraction of a corresponding thread is a ratio of the stall cycles of the corresponding thread to a plurality of stall cycles of the plurality of threads.

8. The multi-core processor of claim 1, wherein the CATP logic to prioritize the demand requests of the identified thread comprises processing the demand requests of the identified thread prior to processing a plurality of demand requests from the plurality of threads, wherein the CATP logic to prioritize the prefetch requests of the identified thread comprises processing the prefetch requests of the identified thread after processing the demand requests of the identified thread and prior to processing a plurality of prefetch requests from the plurality of threads.

9. A method comprising: executing, by a plurality of cores, a plurality of threads; monitoring, by the plurality of cores, metrics for each of the plurality of threads during an interval, the metrics comprising stall cycle values, a first number of prefetches of a first type, and a second number of prefetches of a second type; and computing, by a first logic block of criticality-aware thread prioritization (CATP) logic, a stall fraction for each of the plurality of threads during the interval using the stall cycle values; identifying, by the CATP logic, a thread from the plurality of threads with a highest stall fraction of the plurality of threads; determining, by the CATP logic, the highest stall fraction is greater than a stall threshold; prioritizing, by a third logic block of the CATP logic, demand requests of the identified thread; computing, by a second logic block of the CATP logic, a prefetch accuracy of the identified thread during the interval using the first number of prefetches of the first type and the second number of prefetches of the second type; determining, by the CATP logic, the prefetch accuracy is greater than a prefetch threshold; and prioritizing, by a fourth logic block of the CATP logic, prefetch requests of the identified thread.

10. The method of claim 9, wherein the first logic block and the second logic block reside in each of the plurality of cores.

11. The method of claim 9, wherein the first logic block and the second logic block reside in each of the plurality of cores, wherein the third logic block and the fourth logic block reside in a memory controller.

12. The method of claim 9, wherein the first logic block, the second logic block, the third logic block, and the fourth logic block reside in a memory controller.

13. The method of claim 9, wherein: the prefetch accuracy is a ratio of the first number of prefetches of the first type to a sum of the first number of prefetches of the first type and the second number of prefetches of the second type; prefetches of the first type is when corresponding data was brought into an L2 cache from main memory and the corresponding data was used by a subsequent demand request; and prefetches of the second type is when corresponding data was brought into the L2 cache from the main memory and the corresponding data was evicted without being used.

14. The method of claim 9, wherein the stall fraction of a corresponding thread is a ratio of the stall cycles of the corresponding thread to a plurality of stall cycles of the plurality of threads.

15. The method of claim 9, wherein: the prioritizing of the demand requests of the identified thread comprises processing the demand requests of the identified thread prior to processing a plurality of demand requests from the plurality of threads; and the prioritizing of the prefetch requests of the identified thread comprises processing the prefetch requests of the identified thread after processing the demand requests of the identified thread and prior to processing a plurality of prefetch requests from the plurality of threads.

16. A system comprising: a main memory to receive a plurality of demand requests and a plurality of prefetch requests from a plurality of threads; and a multi-core processor coupled to the main memory, the multi-core processor comprising: a plurality of cores to execute the plurality of threads and to monitor metrics for each of the plurality of threads during an interval, the metrics comprising stall cycle values, a first number of prefetches of a first type, and a second number of prefetches of a second type; and criticality-aware thread prioritization (CATP) logic to: compute, by a first logic block of the CATP logic, a stall fraction for each of the plurality of threads during the interval using the stall cycle values; identify a thread from the plurality of threads with a highest stall fraction of the plurality of threads; determine the highest stall fraction is greater than a stall threshold; prioritize, by a third logic block of the CATP logic, demand requests of the identified thread; compute, by a second logic block of the CATP logic, a prefetch accuracy of the iden
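The request ordering recited in claims 8 and 15 can be illustrated with a small software model of a memory request queue. This is a sketch under stated assumptions, not the claimed memory-controller logic: `Request`, `priority_class`, and `schedule` are hypothetical names, and ties are broken by arrival order. The claims only require that the identified thread's demand requests go first and that its prefetch requests come after its own demand requests and before other threads' prefetch requests; placing other threads' demand requests ahead of the identified thread's prefetches is an assumption made here for concreteness.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Request:
    thread_id: int
    is_prefetch: bool
    arrival: int  # arrival order, used only to break ties (assumption)


def priority_class(req: Request,
                   critical_thread: Optional[int],
                   boost_prefetch: bool) -> int:
    """Lower value = served earlier.

    0: demand requests of the identified (critical) thread
    1: demand requests of other threads (placement is an assumption)
    2: prefetch requests of the identified thread, if its accuracy passed the threshold
    3: all remaining prefetch requests
    """
    is_critical = critical_thread is not None and req.thread_id == critical_thread
    if not req.is_prefetch:
        return 0 if is_critical else 1
    if is_critical and boost_prefetch:
        return 2
    return 3


def schedule(requests: List[Request],
             critical_thread: Optional[int],
             boost_prefetch: bool) -> List[Request]:
    """Return the requests in the order this model would service them."""
    return sorted(
        requests,
        key=lambda r: (priority_class(r, critical_thread, boost_prefetch), r.arrival),
    )
```

Passing `critical_thread=None` models an interval in which no thread exceeded the stall threshold, so requests fall back to demand-before-prefetch in arrival order. A real controller would also weigh DRAM timing constraints such as row-buffer state, which this sketch ignores.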

Assignees

Inventors

Classifications

  • with prefetch · CPC title

  • Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • Details relating to cache prefetching · CPC title

  • G06F9/3009 · Primary

    Thread control instructions · CPC title

  • Prefetch instructions; cache control instructions · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9921839B1 cover?
A multi-core processor includes a plurality of cores to execute a plurality of threads and to monitor metrics for each of the plurality of threads during an interval, the metrics including stall cycle values, prefetches of a first type, and prefetches of a second type. The multi-core processor further includes criticality-aware thread prioritization (CATP) logic to compute a stall fraction for …
Who is the assignee on this patent?
Subramanian Lavanya, Subramoney Sreenivas, Bashyam Nithiyanandan, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F12/0862. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 20, 2018 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).