Collectively loading an application in a parallel computer

US9229782B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9229782-B2
Application number  US-201213431248-A
Country             US
Kind code           B2
Filing date         Mar 27, 2012
Priority date       Mar 27, 2012
Publication date    Jan 5, 2016
Grant date          Jan 5, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
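
The four steps in the abstract can be sketched as a small in-process simulation. This is an illustrative sketch only, not the patented implementation: the `ComputeNode` class, the `collectively_load` function, the "first N nodes" subset rule, and the "lowest node id" leader rule are all assumptions made for the example. The point it shows is that storage is read once, by the job leader, and every other node in the subset receives the application from the leader.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeNode:
    """Hypothetical stand-in for a compute node; memory maps app name to image."""
    node_id: int
    memory: dict = field(default_factory=dict)


def collectively_load(all_nodes, job_size, storage, app_name):
    """Simulate the abstract's four steps on an in-process list of nodes."""
    # Step 1: the control system identifies a subset of nodes for the job.
    # Assumption: simply take the first `job_size` nodes.
    subset = all_nodes[:job_size]

    # Step 2: the control system selects one node of the subset as job leader.
    # Assumption: pick the node with the lowest id.
    leader = min(subset, key=lambda n: n.node_id)

    # Step 3: only the job leader retrieves the application from storage.
    storage_reads = 0
    leader.memory[app_name] = storage[app_name]
    storage_reads += 1

    # Step 4: the job leader broadcasts the application to the rest of the subset.
    for node in subset:
        if node is not leader:
            node.memory[app_name] = leader.memory[app_name]

    return subset, leader, storage_reads
```

In this sketch the number of storage reads stays at one regardless of how many nodes are in the job, which is the property the abstract's leader-then-broadcast ordering is aiming at.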

First claim

Opening claim text (preview).

What is claimed is:

1. A method of collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, the method comprising steps of: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a data processing job; selecting, by the parallel computer control system, one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, an application for executing the data processing job from a computer memory in the parallel computer and loading into a memory of the job leader node; performing, by the subset of compute nodes, a collective broadcast operation, including broadcasting, by the job leader compute node to the subset of compute nodes in the parallel computer, the application for executing the data processing job; and sending, from the parallel computer control system to the subset of compute nodes in the parallel computer, control information about the data processing job, wherein the control information comprises a particular path through the subset of compute nodes that messages should be passed, wherein the particular path is based on a message type.

2. The method of claim 1 further comprising configuring, by each compute node in the subset of compute nodes in the parallel computer, network parameters in dependence upon the control information.

3. The method of claim 1 wherein selecting one of the subset of compute nodes in the parallel computer as a job leader compute node further comprises selecting a root compute node of a collective network comprised of the subset of compute nodes in the parallel computer.

4. The method of claim 1 further comprising receiving, by the job leader compute node, an acknowledgment message from each compute node in the subset of compute nodes in the parallel computer, wherein the acknowledgment message specifies that the application has been received by the sender of the acknowledgment message.

5. The method of claim 1 further comprising loading, by each compute node in the subset of compute nodes in the parallel computer, the application.

6. An apparatus for collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, the apparatus comprising a computer processor, a computer memory operatively coupled to the computer processor, the computer memory having disposed within it computer program instructions that, when executed by the computer processor, cause the apparatus to carry out the steps of: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a data processing job; selecting, by the parallel computer control system, one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, an application for executing the data processing job from a computer memory in the parallel computer and loading into a memory of the job leader node; performing, by the subset of compute nodes, a collective broadcast operation, including broadcasting, by the job leader compute node to the subset of compute nodes in the parallel computer, the application for executing the data processing job; and sending, from the parallel computer control system to the subset of compute nodes in the parallel computer, control information about the data processing job, wherein the control information comprises a particular path through the subset of compute nodes that messages should be passed, wherein the particular path is based on a message type.

7. The apparatus of claim 6 further comprising computer program instructions that, when executed by the computer processor, cause the apparatus to carry out the step of configuring, by each compute node in the subset of compute nodes in the parallel computer, network parameters in dependence upon the control information.

8. The apparatus of claim 6 wherein selecting one of the subset of compute nodes in the parallel computer as a job leader compute node further comprises selecting a root compute node of a collective network comprised of the subset of compute nodes in the parallel computer.

9. The apparatus of claim 6 further comprising computer program instructions that, when executed by the computer processor, cause the apparatus to carry out the step of receiving, by the job leader compute node, an acknowledgment message from each compute node in the subset of compute nodes in the parallel computer, wherein the acknowledgment message specifies that the application has been received by the sender of the acknowledgment message.

10. The apparatus of claim 6 further comprising computer program instructions that, when executed by the computer processor, cause the apparatus to carry out the step of loading, by each compute node in the subset of compute nodes in the parallel computer, the application.

11. A computer program product for loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, the computer program product disposed upon a non-transitory computer readable medium, the computer program product comprising computer program instructions that, when executed, cause a computer to carry out the steps of: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a data processing job; selecting, by the parallel computer control system, one compute node in the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, an application for executing the data processing job from a computer memory in the parallel computer and loading into a memory of the job leader node; performing, by the subset of compute nodes, a collective broadcast operation, including broadcasting, by the job leader compute node to the subset of compute nodes in the parallel computer, the application for executing the data processing job; and sending, from the parallel computer control system to the subset of compute nodes in the parallel computer, control information about the data processing job, wherein the control information comprises a particular path through the subset of compute nodes that messages should be passed, wherein the particular path is based on a message type.

12. The computer program product of claim 11 further comprising computer program instructions that, when executed, cause a computer to carry out the step of configuring, by each compute node in the subset of compute nodes in the parallel computer, network parameters in dependence upon the control information.

13. The computer program product of claim 11 wherein selecting one of the subset of compute nodes in the parallel computer as a job leader compute node further comprises selecting a root compute node of a collective network comprised of the subset of compute nodes in the parallel computer.

14. The computer program product of claim 11 further comprising computer program instructions that, when executed, cause a computer to carry out the step of receiving, by the job leader compute node, an acknowledgment message from each compute node in the subset of compute nodes in the parallel computer, wherein the acknowledgment message specifies that the application has been received by the sender of the acknowledgment message.

15. The computer program product of claim 11 further comprising computer program instructions that, when executed, cause a computer to carry out the step of loading, by each compute node in the subset of compute nodes in the parallel
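
Two ideas in the claims lend themselves to a short illustration: control information that carries a path chosen by message type (claim 1), and the job leader tracking acknowledgments from the other nodes (claim 4). The sketch below is a guess at one possible shape for that data, not the format used in the patent. The dictionary layout, the message type names `"broadcast"` and `"ack"`, the `job_id` value, and the functions `build_control_info`, `next_hop`, and `missing_acks` are all hypothetical. It also assumes the job leader does not acknowledge to itself.

```python
def build_control_info(node_ids):
    """Hypothetical control information: one forwarding path per message type."""
    ordered = sorted(node_ids)
    return {
        "job_id": 1,  # illustrative value only
        "paths": {
            # Assumption: broadcasts flow in ascending node order,
            # acknowledgments flow back in descending order.
            "broadcast": ordered,
            "ack": list(reversed(ordered)),
        },
    }


def next_hop(control_info, message_type, current_node):
    """Return the node after `current_node` on the path for this message type.

    Returns None when `current_node` is the last node on that path.
    """
    path = control_info["paths"][message_type]
    index = path.index(current_node)
    return path[index + 1] if index + 1 < len(path) else None


def missing_acks(leader_id, node_ids, acked):
    """Leader-side check: nodes that have not yet acknowledged the application."""
    return sorted(n for n in node_ids if n != leader_id and n not in acked)
```

With node ids 1, 2, and 3 and node 1 as leader, a broadcast message at node 1 would be forwarded to node 2, while an acknowledgment at node 3 would be forwarded to node 2 on its way back. The leader would treat the load as complete once `missing_acks` returns an empty list.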

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9229782B2 cover?
Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; …
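Who is the assignee on this patent?
This page does not show an assignee. The names it lists (Aho Michael E, Attinella John E, Gooding Thomas M, and 3 more) are individuals, which suggests they are the credited inventors rather than the owner.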
What technology area does this patent fall under?
Primary CPC classification G06F9/5072. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 5, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).