System, method, and computer program product for implementing software-based scoreboarding

US9612836B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9612836-B2
Application number: US-201414171671-A
Country: US
Kind code: B2
Filing date: Feb 3, 2014
Priority date: Feb 3, 2014
Publication date: Apr 4, 2017
Grant date: Apr 4, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system, method, and computer program product are provided for implementing a software-based scoreboarding mechanism. The method includes the steps of receiving a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register and, based on a comparison of the immediate value to the value stored in the first register, dispatching a subsequent instruction to at least a first processing unit of two or more processing units.
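The comparison described in the abstract can be illustrated with a small model. This is a minimal sketch, not the patented implementation: the class name `Scoreboard`, its methods, and the counter count of four are hypothetical. The increment-on-dispatch, decrement-on-result-write, and "less than or equal to" behaviors follow dependent claims 5, 7, and 8 on this page.

```python
class Scoreboard:
    """Hypothetical model of the counter registers a scheduler might keep."""

    def __init__(self, num_counters=4):
        # One counter per dependency register; the count of 4 is an assumption.
        self.counters = [0] * num_counters

    def dispatch(self, counter_id):
        # An instruction naming this counter was dispatched: one more result pending.
        self.counters[counter_id] += 1

    def complete(self, counter_id):
        # The instruction wrote its destination register: one fewer result pending.
        self.counters[counter_id] -= 1

    def barrier_passes(self, counter_id, immediate):
        # The subsequent instruction may dispatch once counter <= immediate.
        return self.counters[counter_id] <= immediate


sb = Scoreboard()
sb.dispatch(0)
sb.dispatch(0)
blocked = sb.barrier_passes(0, immediate=1)   # two results pending, threshold is 1
sb.complete(0)
released = sb.barrier_passes(0, immediate=1)  # one result pending, threshold is 1
```

In this model the barrier holds back the subsequent instruction while two results are outstanding and releases it as soon as only one remains, which is why a non-zero immediate can let work proceed earlier than waiting for the counter to reach zero.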

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register; and based on a comparison of the immediate value to the value stored in the first register, dispatching a subsequent instruction to at least a first processing unit of two or more processing units, wherein a compiler is configured to determine the immediate value by: for a particular dependency barrier instruction, determining a minimum distance from the dependency barrier instruction to a node, and assigning the immediate value to be less than or equal to the minimum distance.

2. The method of claim 1, wherein the identifier comprises a bit mask, and each bit of the bit mask corresponds to a particular register in a plurality of registers.

3. The method of claim 1, wherein the identifier comprises an index corresponding to a particular register in a plurality of registers.

4. The method of claim 1, wherein the first register is included in a plurality of registers, and each register in the plurality of registers comprises a counter of N bits.

5. The method of claim 1, further comprising decrementing the value stored in the first register when a result is written to a destination register for an instruction that specified the identifier corresponding to the first register.

6. The method of claim 1, wherein the immediate value comprises an unsigned integer.

7. The method of claim 1, further comprising: receiving an instruction that includes a second identifier that specifies the first register; dispatching the instruction to either the first processing unit or a second processing unit; and incrementing the value stored in the first register.

8. The method of claim 1, wherein the comparison comprises determining that the value stored in the first register is less than or equal to the immediate value.

9. The method of claim 1, wherein determining the minimum distance comprises: if the node is associated with a dependent write, then counting a number of register writes associated with an identifier corresponding to the first register between the dependency barrier instruction and the node; or if the node is associated with a dependent read, then taking the minimum of: a number of register reads associated with an identifier corresponding to the first register between the dependency barrier instruction and the node summed with a number of register writes associated with the identifier corresponding to the first register between the dependency barrier instruction and the node, and a number of register reads associated with the identifier corresponding to the first register between the dependency barrier instruction and the node summed with a number of register writes associated with the identifier corresponding to the first register.

10. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform steps comprising: receiving a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register; and based on a comparison of the immediate value to the value stored in the first register, dispatching a subsequent instruction to at least a first processing unit of two or more processing units, wherein a compiler is configured to determine the immediate value by: determining a minimum distance from the dependency barrier instruction to a node, and assigning the immediate value to be less than or equal to the minimum distance.

11. The non-transitory computer-readable storage medium of claim 10, wherein the identifier comprises a bit mask.

12. The non-transitory computer-readable storage medium of claim 10, wherein the first register is included in a plurality of registers, and each register in the plurality of registers comprises a counter of N bits.

13. The non-transitory computer-readable storage medium of claim 10, wherein the comparison comprises determining that the value stored in the first register is less than or equal to the immediate value.

14. A system comprising: two or more processing units; and a scheduler unit coupled to the two or more processing units and configured to: receive a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register, and based on a comparison of the immediate value to the value stored in the first register, dispatch a subsequent instruction to at least a first processing unit of two or more processing units, wherein a compiler is configured to determine the immediate value by: determining a minimum distance from the dependency barrier instruction to a node, and assigning the immediate value to be less than or equal to the minimum distance.

15. The system of claim 14, wherein the first register is included in a plurality of registers, and each register in the plurality of registers comprises a counter of N bits.

16. The system of claim 14, wherein the immediate value comprises an unsigned integer.

17. The system of claim 14, wherein the comparison comprises determining that the value stored in the first register is less than or equal to the immediate value.

18. The non-transitory computer-readable storage medium of claim 10, wherein determining the minimum distance comprises: if the node is associated with a dependent write, then counting a number of register writes associated with an identifier corresponding to the first register between the dependency barrier instruction and the node; or if the node is associated with a dependent read, then taking the minimum of: a number of register reads associated with an identifier corresponding to the first register between the dependency barrier instruction and the node summed with a number of register writes associated with the identifier corresponding to the first register between the dependency barrier instruction and the node, and a number of register reads associated with the identifier corresponding to the first register between the dependency barrier instruction and the node summed with a number of register writes associated with the identifier corresponding to the first register.

19. The system of claim 14, wherein determining the minimum distance comprises: if the node is associated with a dependent write, then counting a number of register writes associated with an identifier corresponding to the first register between the dependency barrier instruction and the node; or if the node is associated with a dependent read, then taking the minimum of: a number of register reads associated with an identifier corresponding to the first register between the dependency barrier instruction and the node summed with a number of register writes associated with the identifier corresponding to the first register between the dependency barrier instruction and the node, and a number of register reads associated with the identifier corresponding to the first register between the dependency barrier instruction and the node summed with a number of register writes associated with the identifier corresponding to the first register.
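The compiler-side step in claims 1 and 9 (find a minimum distance, then pick an immediate no larger than it) can be sketched for the dependent-write case only. Everything here is an illustrative assumption rather than the patent's algorithm: the program is modeled as a flat list of counter tags, one per register-writing instruction, and the function names `min_distance` and `choose_immediate` are hypothetical. Capping the immediate at the largest value an N-bit counter can hold is also an assumption, loosely motivated by claim 4. The dependent-read case of claim 9 is not modeled.

```python
def min_distance(tags, barrier_idx, node_indices, counter_id):
    """Smallest number of writes tagged with counter_id that lie strictly
    between the barrier and any of the given nodes (dependent-write case)."""
    distances = []
    for node in node_indices:
        lo, hi = sorted((node, barrier_idx))
        between = tags[lo + 1:hi]  # excludes the node and the barrier themselves
        distances.append(sum(1 for tag in between if tag == counter_id))
    return min(distances)


def choose_immediate(distance, counter_bits):
    # Any unsigned value <= distance satisfies the claim language; this sketch
    # picks the largest one that still fits in an N-bit counter (assumption).
    return min(distance, (1 << counter_bits) - 1)


# Each entry is the counter a register-writing instruction is tagged with;
# None marks the position of the dependency barrier itself.
tags = [0, 0, 1, 0, None]

d_single = min_distance(tags, barrier_idx=4, node_indices=[0], counter_id=0)
d_multi = min_distance(tags, barrier_idx=4, node_indices=[0, 1], counter_id=0)
imm = choose_immediate(d_single, counter_bits=2)
```

With a single node at index 0, two other writes tagged with counter 0 sit between the node and the barrier, so the distance is 2. Adding a second node at index 1 lowers the minimum to 1, which forces a smaller immediate.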

Assignees

Nvidia Corp

Inventors

Classifications

  • Instruction analysis, e.g. decoding, instruction word fields · CPC title

  • Barrier synchronisation · CPC title

  • Dependency mechanisms, e.g. register scoreboarding · CPC title

  • Register arrangements · CPC title

  • Synchronisation or serialisation instructions · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9612836B2 cover?
A system, method, and computer program product are provided for implementing a software-based scoreboarding mechanism. The method includes the steps of receiving a dependency barrier instruction that includes an immediate value and an identifier corresponding to a first register and, based on a comparison of the immediate value to the value stored in the first register, dispatching a subsequent…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30145. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 4, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).