Method and apparatus for a hierarchical synchronization barrier in a multi-node system

US9286067B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9286067-B2
Application number: US-201213614460-A
Country: US
Kind code: B2
Filing date: Sep 13, 2012
Priority date: Jan 10, 2011
Publication date: Mar 15, 2016
Grant date: Mar 15, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A hierarchical barrier synchronization of cores and nodes on a multiprocessor system, in one aspect, may include providing by each of a plurality of threads on a chip, input bit signal to a respective bit in a register, in response to reaching a barrier; determining whether all of the plurality of threads reached the barrier by electrically tying bits of the register together and “AND”ing the input bit signals; determining whether only on-chip synchronization is needed or whether inter-node synchronization is needed; in response to determining that all of the plurality of threads on the chip reached the barrier, notifying the plurality of threads on the chip, if it is determined that only on-chip synchronization is needed; and after all of the plurality of threads on the chip reached the barrier, communicating the synchronization signal to outside of the chip, if it is determined that inter-node synchronization is needed.
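
The decision flow in the abstract can be sketched in software as follows. This is only an illustrative model, not the patented circuit: the hardware ties register bits together electrically, whereas the sketch emulates that wired "AND" with a loop over a list of bits. The names `all_arrived`, `evaluate_barrier`, `BarrierResult`, and the `inter_node_needed` flag are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass
from typing import List


def all_arrived(register_bits: List[int]) -> bool:
    """Emulate the wired AND: true only if every thread's bit is set."""
    return all(bit == 1 for bit in register_bits)


@dataclass
class BarrierResult:
    notify_on_chip: bool   # threads on this chip are notified directly
    signal_off_chip: bool  # synchronization signal is sent outside the chip


def evaluate_barrier(register_bits: List[int],
                     inter_node_needed: bool) -> BarrierResult:
    """Assumed model of the abstract's flow for a single chip."""
    if not all_arrived(register_bits):
        # At least one thread has not reached the barrier yet.
        return BarrierResult(notify_on_chip=False, signal_off_chip=False)
    if inter_node_needed:
        # All local threads arrived; hand the signal to the next level.
        return BarrierResult(notify_on_chip=False, signal_off_chip=True)
    # Only on-chip synchronization is needed: release the local threads.
    return BarrierResult(notify_on_chip=True, signal_off_chip=False)
```

For example, `evaluate_barrier([1, 1, 1, 1], inter_node_needed=False)` reports that the on-chip threads should be notified, while the same register with `inter_node_needed=True` reports that the signal should leave the chip instead.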

First claim

Opening claim text (preview).

We claim:

1. A non-transitory computer readable storage medium storing a program of instructions executable by a machine to perform a method for a hierarchical barrier synchronization of cores and nodes on a multiprocessor system, comprising: providing a mask bit register and a status bit register for each of a plurality of threads on a chip; receiving from a thread on the chip, an input bit signal to the status bit register associated with the thread, in response to reaching a barrier; for each of the plurality of threads, “AND”ing at least the input bit signal with a mask signal stored in the mask bit register associated with the thread to produce a first output, and “OR”ing at least an inverted mask signal associated with the thread with the first output to produce a second output; determining whether all participants in synchronization process reached the barrier by “AND”ing the second outputs associated with the plurality of threads; determining whether only on-chip synchronization is needed or whether inter-node synchronization is needed; in response to determining that all of the participants on the chip reached the barrier, notifying the participants on the chip if it is determined that only on-chip synchronization is needed; and after all of the participants on the chip reached the barrier, communicating the synchronization signal to outside of the chip if it is determined that inter-node synchronization is needed.

2. The non-transitory computer readable storage medium of claim 1, wherein the plurality of threads on the chip are heterogeneous wherein at least one of the plurality of threads is a component of a power efficient core and another of the plurality of threads is a component of a processing core.

3. The non-transitory computer readable storage medium of claim 1, wherein the notifying the plurality of threads includes generating an interrupt, waking up one or more of the plurality of threads, or setting a bit indicating the barrier has been achieved, or combinations thereof.

4. The non-transitory computer readable storage medium of claim 1, further including: in response to determining that the inter-node synchronization is needed, hierarchically integrating the synchronization signal into a system synchronization; and propagating a global synchronization signal back to one or more lower levels of synchronization down to all threads participating in the barrier.

5. The non-transitory computer readable storage medium of claim 4, wherein a plurality of thread chips participate in the system synchronization and the plurality of thread chips are heterogeneous.

6. The non-transitory computer readable storage medium of claim 1, wherein said each of a plurality of threads on a chip is programmed to sleep after providing the input bit signal.

7. The non-transitory computer readable storage medium of claim 6, wherein said notifying the plurality of threads on the chip wakes up said each of a plurality of threads on a chip.

8. An apparatus for a hierarchical barrier synchronization of cores and nodes on a multiprocessor system, comprising: a plurality of cores arranged in an integrated circuit; a status bit register and a mask bit register associated with a core, for each of the plurality of cores; the status bit register operable to store input bit signals received from each of said plurality of cores; a control logic circuit operable to perform a Boolean “AND” function on the input bit signal and a mask signal stored in the mask bit register associated with a core to produce a first output for each of the plurality of cores, and perform a Boolean “OR” function on an inverted mask signal associated with the core and the first output to produce a second output for each of the plurality of cores, the control logic circuit further operable to perform a Boolean “AND” function on the second outputs associated with the plurality of cores to determine whether said plurality of cores participating in synchronization process all reached barrier, the control logic circuit further operable to determine whether only on-chip synchronization is needed or whether inter-node synchronization is needed, and in response to determining that all of the plurality of cores on the integrated circuit participating in the synchronization process reached the barrier, notifying the plurality of cores on the chip if it is determined that only on-chip synchronization is needed, and after all of the plurality of cores on the integrated circuit participating in the synchronization process reached the barrier, communicating the synchronization signal to outside of the integrated circuit if it is determined that inter-node synchronization is needed.

9. The apparatus of claim 8, wherein the plurality of cores on the integrated circuit are heterogeneous.

10. The apparatus of claim 8, wherein the notifying the plurality of cores on the integrated circuit includes transmitting a synchronization signal to said each of a plurality of cores on the integrated circuit.

11. The apparatus of claim 8, further including: in response to determining that the inter-node synchronization is needed, hierarchically integrating the synchronization signal into a system synchronization; and propagating a global synchronization signal back to one or more lower levels of synchronization down to all cores participating in the barrier.

12. The apparatus of claim 11, wherein a plurality of integrated circuits participate in the system synchronization and the plurality of integrated circuits are heterogeneous.

13. The apparatus of claim 8, wherein said each of a plurality of cores in the integrated circuit is programmed to sleep after providing the input bit signal.

14. The apparatus of claim 13, wherein said notifying the plurality of cores in the integrated circuit wakes up said each of a plurality of cores participating in the synchronization process.

15. The apparatus of claim 14, further including instruction set architecture that includes an instruction to set a barrier and an instruction to wake up one or more cores.

16. The apparatus of claim 8, wherein the mask bit register stores an indication of whether the corresponding core is participating in barrier synchronization.
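
The mask and status logic recited in claims 1 and 8 can be illustrated with a short software sketch. It assumes one status bit and one mask bit per thread or core, with a mask of 1 meaning "participating" (consistent with claim 16). The function names are hypothetical. The system-level function models the hierarchical integration of claims 4 and 11 as a plain AND across chips, which is an assumption made for illustration; the claims do not specify that structure.

```python
from typing import List, Tuple


def second_output(status: int, mask: int) -> int:
    """Per-thread logic from claim 1: (status AND mask) OR (NOT mask)."""
    first = status & mask            # "first output"
    inverted_mask = (~mask) & 1      # invert the single mask bit
    return inverted_mask | first     # "second output"


def chip_barrier_reached(status_bits: List[int], mask_bits: List[int]) -> bool:
    """AND the second outputs of all threads on one chip."""
    result = 1
    for status, mask in zip(status_bits, mask_bits):
        result &= second_output(status, mask)
    return result == 1


def system_barrier_reached(chips: List[Tuple[List[int], List[int]]]) -> bool:
    """Assumed model of the next hierarchy level: AND over per-chip results."""
    return all(chip_barrier_reached(status, mask) for status, mask in chips)
```

A masked-out thread (mask bit 0) always contributes 1 to the final "AND", so it cannot hold up the barrier. For instance, `chip_barrier_reached([1, 0, 1], [1, 0, 1])` is true because the middle thread is not participating, whereas `chip_barrier_reached([1, 0, 0], [1, 0, 1])` is false because the third thread participates but has not yet arrived.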

Assignees

Inventors

Classifications

  • from multiple instruction streams, e.g. multistreaming · CPC title

  • Synchronisation or serialisation instructions · CPC title

  • G06F9/522 (Primary)

    Barrier synchronisation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9286067B2 cover?
In one aspect, it covers hierarchical barrier synchronization of cores and nodes on a multiprocessor system. Each thread on a chip sets a bit in a register when it reaches a barrier, and the bits are “AND”ed to determine whether all threads have arrived. If only on-chip synchronization is needed, the threads on the chip are notified; if inter-node synchronization is needed, the synchronization signal is communicated outside the chip.
Who is the assignee on this patent?
Salapura Valentina, Wisniewski Robert W, IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/30087. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 15, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).