Methods and systems to identify and migrate threads among system nodes based on system performance metrics

US9304811B2 · US · B2

Patent metadata
Publication number: US-9304811-B2
Application number: US-201213994574-A
Country: US
Kind code: B2
Filing date: Jun 29, 2012
Priority date: Jun 29, 2012
Publication date: Apr 5, 2016
Grant date: Apr 5, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems to identify and migrate threads among system nodes based on system performance metrics. An example method disclosed herein includes sampling a performance metric of a computer program thread, the computer program thread executing on a home node of a computer system having multiple nodes, and determining whether the performance metric exceeds a threshold value. The method also includes identifying a remote node associated with a remote memory if the threshold value is exceeded, the remote memory being accessed by the computer program thread, and identifying the computer program thread as a candidate for migration from the home node to the remote node if the threshold value is exceeded. In this way, a computer program thread that frequently accesses a remote memory can be migrated from a home node to a remote node associated with the remote memory to reduce the latency associated with memory accesses performed by the computer program thread and thereby improve system performance.
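The flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `ThreadSample` record, the `find_migration_candidate` function, the use of a remote-access count as the sampled metric, and the rule of picking the most frequently accessed remote node are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ThreadSample:
    """Hypothetical record of one sampling interval for one thread."""
    thread_id: int
    home_node: int
    remote_access_count: int          # assumed performance metric
    accessed_memory_nodes: List[int]  # node that owns each sampled memory access


def find_migration_candidate(
    sample: ThreadSample, threshold: int
) -> Optional[Tuple[int, int, int]]:
    """Return (thread_id, home_node, remote_node) if the thread is a candidate.

    Returns None when the metric does not exceed the threshold or when no
    remote memory was accessed during the interval.
    """
    # Step 1: compare the sampled metric against the threshold.
    if sample.remote_access_count <= threshold:
        return None

    # Step 2: identify the remote node whose memory the thread accesses.
    # Assumption: choose the remote node that appears most often.
    remote_hits = Counter(
        node for node in sample.accessed_memory_nodes if node != sample.home_node
    )
    if not remote_hits:
        return None
    remote_node, _ = remote_hits.most_common(1)[0]

    # Step 3: flag the thread as a candidate for migration to that node.
    return (sample.thread_id, sample.home_node, remote_node)


if __name__ == "__main__":
    sample = ThreadSample(
        thread_id=7,
        home_node=0,
        remote_access_count=120,
        accessed_memory_nodes=[1, 1, 2, 0, 1],
    )
    print(find_migration_candidate(sample, threshold=100))
```

A scheduler could then act on the returned tuple by moving the thread to the remote node, which is the step that would shorten its memory access latency.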

First claim

Opening claim text (preview).

The invention claimed is:

1. A method comprising: sampling, with a processor, a performance metric associated with execution of a computer program thread on a home node of a computer system, the computer system having multiple nodes including the home node, the computer program thread being a first computer program thread executing on the home node; determining whether the performance metric satisfies a threshold value; tagging a first set of memory operations from among a randomly selected second set of memory operations, the second set of memory operations being performed by a plurality of computer program threads, including the first program thread, executing on the multiple nodes; if the performance metric satisfies the threshold value: using memory operation information of the first set of memory operations to identify one of the multiple nodes as being a remote node having a remote memory accessed by the first computer program thread; and identifying the first computer program thread as a candidate for migration from the home node to the remote node.

2. A method as defined in claim 1, wherein the performance metric includes at least one of: memory accesses executed by the first computer program thread; or power consumption associated with the execution of the first computer program thread.

3. A method as defined in claim 1, wherein one or more of the nodes of the computer system is a mobile device.

4. A method as defined in claim 1, further including: if the first computer program thread has been identified as a candidate for migration, identifying the remote node as a preferred home node for the first computer program thread; and migrating the first computer program thread to the preferred home node.

5. A method as defined in claim 1, wherein the performance metric includes at least one of: a number of times that the first computer program thread experiences a low level cache miss; a number of times that the first computer program thread accesses the remote memory; or a number of times that the first computer program thread accesses a local memory of the home node.

6. A method as defined in claim 5, wherein the performance metric further includes a ratio of the number of times that the first computer program thread accesses the remote memory to a number of times that the first computer program thread accesses the local memory.

7. A method as defined in claim 6, wherein the threshold value includes: a first threshold value indicating the number of times that the first computer program thread experiences the low level cache miss; a second threshold value indicating the number of times that the first computer program thread accesses the remote memory; a third threshold value indicating the number of times that the first computer program thread accesses the local memory; and a fourth threshold value indicating the ratio.

8. A method as defined in claim 1, further including: capturing thread identifying information if the performance metric satisfies the threshold value, the thread identifying information including an identity of the first computer program thread and an identity of a processor executing the first computer program thread; and determining an identity of the home node using the thread identifying information.

9. A method as defined in claim 1, further including: determining that the first computer program thread is memory intensive if the threshold value is satisfied; monitoring an amount of time that the first computer program thread is memory intensive; and determining that the first computer program thread is persistently memory intensive if the amount of time exceeds a threshold duration of time.

10. A method as defined in claim 1, wherein an identity of the remote memory is used to determine an identity of the remote node.

11. A method as defined in claim 1, wherein the identifying of the first computer program thread as the candidate for migration includes determining that the performance metric satisfies the threshold value for a threshold duration of time.

12. An apparatus comprising: a first data collector to sample a performance metric value associated with the execution of a computer program thread on a home node of a computer system, the computer system having a plurality of nodes including the home node; a first monitor to determine whether the performance metric satisfies a threshold value; a second data collector to collect thread identifying information; a thread identifier to use the thread identifying information to determine an identity of the first computer program thread; a third data collector to collect memory operation information from a set of randomly tagged memory operations; a node identifier to use the memory operation information to determine an identity of a remote memory accessed by the first computer program thread, the node identifier to use the identity of the remote memory to determine, from among the plurality of nodes, an identity of a remote node; a second monitor to determine whether the performance metric satisfies the threshold value for a threshold duration of time; and a migration candidate identifier to identify the first computer program thread as a candidate for migration from the home node to the remote node responsive to the determination of the second monitor, wherein at least one of the first data collector, the first monitor, the second data collector, the thread identifier, the third data collector, the node identifier, the second monitor and the migration candidate identifier include a processor.

13. An apparatus as defined in claim 12, wherein the performance metric includes at least one of: memory accesses by the first computer program thread; or power consumption associated with the execution of the first computer program thread.

14. An apparatus as defined in claim 12, wherein one or more of the nodes of the computer system operate on a mobile device.

15. An apparatus as defined in claim 12, further including a scheduler to migrate the first computer program thread from the home node to the remote node.

16. An apparatus as defined in claim 12, wherein the performance metric includes at least one of: a number of times that the first computer program thread experiences a low level cache miss; a number of times that the first computer program thread accesses the remote memory; or a number of times that the first computer program thread accesses a local memory of the home node.

17. A tangible computer readable medium comprising machine readable instructions which, when executed, cause a machine to at least: sample a performance metric associated with the execution of a computer program thread on a home node of a computer system, the computer system having a plurality of nodes including the home node, the computer program thread being a first computer program thread of a plurality of computer program threads executing on the plurality of nodes; determine whether the performance metric satisfies a threshold value; tag a first set of memory operations from among a randomly selected second set of memory operations, the second set of memory operations being performed by the plurality of computer program threads; if the performance metric satisfies the threshold value: use the tags of the first set of memory operations to identify a remote node from among the plurality of nodes, the remote node having a remote memory accessed by the first computer program thread; and identify the first computer program thread as a candidate for migration fro
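Two of the dependent claims lend themselves to a short illustration: the remote-to-local access ratio of claim 6 and the "threshold duration of time" check of claims 9 and 11. The sketch below is only one possible reading. The names `AccessCounts`, `remote_to_local_ratio`, and `PersistenceMonitor` are hypothetical, and the choices to treat "satisfies" as greater-than-or-equal and to reset the timer whenever the threshold is not met are assumptions that the claim text does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AccessCounts:
    """Hypothetical per-interval counters for one thread."""
    remote: int  # accesses to remote memory
    local: int   # accesses to the home node's local memory


def remote_to_local_ratio(counts: AccessCounts) -> float:
    """Ratio of remote accesses to local accesses (claim 6)."""
    if counts.local == 0:
        # Assumption: with no local accesses, any remote access counts
        # as an unbounded ratio, and no accesses at all counts as zero.
        return float("inf") if counts.remote > 0 else 0.0
    return counts.remote / counts.local


class PersistenceMonitor:
    """Tracks how long a thread has continuously satisfied the threshold."""

    def __init__(self, threshold_duration: float) -> None:
        self.threshold_duration = threshold_duration
        self._since: Optional[float] = None  # time the current streak began

    def update(self, now: float, satisfied: bool) -> bool:
        """Record one sample; return True once the streak exceeds the duration."""
        if not satisfied:
            # Assumption: a single miss resets the streak.
            self._since = None
            return False
        if self._since is None:
            self._since = now
        return (now - self._since) > self.threshold_duration


if __name__ == "__main__":
    ratio_threshold = 2.0  # illustrative value, not taken from the patent
    monitor = PersistenceMonitor(threshold_duration=2.0)
    samples = [
        (0.0, AccessCounts(remote=30, local=10)),
        (1.0, AccessCounts(remote=40, local=10)),
        (3.0, AccessCounts(remote=50, local=10)),
    ]
    for timestamp, counts in samples:
        satisfied = remote_to_local_ratio(counts) >= ratio_threshold
        print(timestamp, monitor.update(timestamp, satisfied))
```

Requiring the condition to hold for a duration, rather than for a single sample, avoids flagging threads whose remote accesses are only a brief burst.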

Assignees

Inventors

Classifications

  • G06F9/4856 · Primary

    resumption being on a different machine, e.g. task migration, virtual machine migration (G06F9/5088 takes precedence) · CPC title

  • Cross-Sectional Technologies · mapped topic

  • G06F9/5016 · Primary

    the resource being the memory · CPC title

  • Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9304811B2 cover?
Methods and systems to identify and migrate threads among system nodes based on system performance metrics. An example method disclosed herein includes sampling a performance metric of a computer program thread, the computer program thread executing on a home node of a computer system having multiple nodes, and determining whether the performance metric exceeds a threshold value. The method als…
Who is the assignee on this patent?
Yao Jin, Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/4856. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 5, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).