Compute work distribution reference counters

US9507638B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9507638-B2
Application number  US-201113291369-A
Country             US
Kind code           B2
Filing date         Nov 8, 2011
Priority date       Nov 8, 2011
Publication date    Nov 29, 2016
Grant date          Nov 29, 2016

Abstract

Official abstract text for this publication.

One embodiment of the present invention sets forth a technique for managing the allocation and release of resources during multi-threaded program execution. Programmable reference counters are initialized to values that limit the amount of resources for allocation to tasks that share the same reference counter. Resource parameters are specified for each task to define the amount of resources allocated for consumption by each array of execution threads that is launched to execute the task. The resource parameters also specify the behavior of the array for acquiring and releasing resources. Finally, during execution of each thread in the array, an exit instruction may be configured to override the release of the resources that were allocated to the array. The resources may then be retained for use by a child task that is generated during execution of a thread.
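
The counter behavior the abstract describes can be sketched in a few lines of Python. This is an illustrative model only, not the patented hardware logic: the names `ReferenceCounter`, `ResourceParams`, `try_launch`, and `on_exit` are hypothetical, and the sketch assumes the counter is decremented by a per-array delta at launch and incremented at exit, which is one plausible reading of a counter that "limits" allocation. The handling of a disabled allocation flag is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class ResourceParams:
    """Per-task resource parameters (field names are hypothetical)."""
    delta: int                   # resources needed by each thread array
    alloc_enable: bool = True    # update the counter when an array launches
    release_enable: bool = True  # update the counter when an array exits


class ReferenceCounter:
    """Programmable counter shared by tasks that draw on the same pool."""

    def __init__(self, initial: int) -> None:
        self.value = initial  # upper limit on resources that can be handed out

    def try_launch(self, params: ResourceParams) -> bool:
        """Launch a thread array only if the pool can cover its delta."""
        if not params.alloc_enable:
            return True  # assumption: the array runs without touching the counter
        if params.delta > self.value:
            return False  # not enough resources; the launch has to wait
        self.value -= params.delta
        return True

    def on_exit(self, params: ResourceParams, override_release: bool) -> None:
        """Return resources at exit unless release is disabled or overridden."""
        if params.release_enable and not override_release:
            self.value += params.delta


counter = ReferenceCounter(initial=4)
params = ResourceParams(delta=3)

first = counter.try_launch(params)    # 3 of 4 units allocated
second = counter.try_launch(params)   # only 1 unit left, so this is refused
counter.on_exit(params, override_release=True)   # resources retained for a child
retained = counter.value
counter.on_exit(params, override_release=False)  # a later release restores the pool
print(first, second, retained, counter.value)    # True False 1 4
```

In this model the exit-time override simply skips the increment, so the retained units stay unavailable to other tasks sharing the counter until something else releases them.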

First claim

Opening claim text (preview).

The invention claimed is:

1. A method of allocating and releasing architectural resources in a multi-threaded system, the method comprising: allocating the architectural resources to a first thread array including a first plurality of threads to execute a parent processing task; determining, by one or more threads included in the first plurality of threads and during execution of the parent processing task, whether a release of the architectural resources is to be overridden when a first thread included in the first plurality of threads has exited based on the existence of a child processing task generated from the parent processing task and associated with at least one thread included in the first plurality of threads; releasing the architectural resources when the first thread has exited and no thread included in the first plurality of threads has determined that the release of the architectural resources is to be overridden; and retaining the architectural resources when the first thread has exited and at least one of the one or more threads included in the first plurality of threads has determined that the release of the architectural resources is to be overridden.

2. The method of claim 1, wherein a second thread included in the first plurality of threads generates the child processing task during execution of the parent processing task and determines that the release of the architectural resources is to be overridden when the first thread has exited.

3. The method of claim 2, further comprising allocating additional architectural resources for the child processing task from a separate pool that does not include the architectural resources allocated to the first thread array.

4. The method of claim 2, further comprising generating a continuation task configured to complete the parent processing task executed by the first thread and release the architectural resources allocated to the first thread array.

5. The method of claim 4, wherein the continuation task consumes at least a portion of the architectural resources allocated to the first thread array.

6. The method of claim 4, wherein execution of the continuation task begins after execution of the child processing task is complete.

7. The method of claim 1, further comprising initializing a reference counter that limits a maximum quantity of architectural resources in a pool that are available for allocation.

8. The method of claim 7, wherein resource parameters are specified by the parent processing task, and the resource parameters include a delta value indicating a quantity of the architectural resources needed for the first thread array.

9. The method of claim 8, further comprising determining, before allocating the architectural resources to the first thread array, that the delta value is not greater than the reference counter.

10. The method of claim 8, wherein the resource parameters include an allocation enable flag, and further comprising updating the reference counter based on the delta value when the allocation enable flag is asserted.

11. The method of claim 8, wherein the resource parameters include an allocation release enable flag, and further comprising not updating the reference counter when the allocation release enable flag is negated.

12. The method of claim 8, wherein the resource parameters include an allocation release enable flag, and further comprising not updating the reference counter when the allocation release enable flag is asserted and the at least one thread included in the first plurality of threads determined that the release of the architectural resources would be overridden.

13. The method of claim 7, wherein the reference counter represents a combination of different architectural resources including a number of barrier counters.

14. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to allocate and release architectural resources in a multi-threaded system, by performing the steps of: allocating the architectural resources to a first thread array including a first plurality of threads to execute a parent processing task; determining, by one or more threads included in the first plurality of threads and during execution of the parent processing task, whether a release of the architectural resources is to be overridden when a first thread included in the first plurality of threads has exited based on the existence of a child processing task generated from the parent processing task and associated with at least one thread included in the first plurality of threads; releasing the architectural resources when the first thread has exited and no thread included in the first plurality of threads has determined that the release of the architectural resources is to be overridden; and retaining the architectural resources when the first thread has exited and at least one of the one or more threads included in the first plurality of threads has determined that the release of the architectural resources is to be overridden.

15. The non-transitory computer-readable storage medium of claim 14, wherein a second thread included in the first plurality of threads generates the child processing task during execution of the parent processing task and determines that the release of the architectural resources is to be overridden when the first thread has exited.

16. The non-transitory computer-readable storage medium of claim 14, further comprising initializing a reference counter that limits a maximum quantity of architectural resources in a pool that are available for allocation.

17. A multi-threaded system configured to allocate and release architectural resources, comprising: a memory configured to store program instructions corresponding to a parent processing task; a general processing cluster configured to process a first thread array including a first plurality of threads to execute the parent processing task, wherein, during execution of the parent processing task, one or more threads included in the first plurality of threads determines whether a release of the architectural resources is to be overridden when a first thread included in the first plurality of threads has exited based on the existence of a child processing task generated from the parent processing task and associated with at least one thread included in the first plurality of threads; a work distribution unit coupled to the general processing cluster and configured to: allocate the architectural resources to the first thread array; release the architectural resources when the first thread has exited and no thread included in the first plurality of threads has determined that the release of the architectural resources is to be overridden; and retain the architectural resources when the first thread has exited and at least one of the one or more threads included in the first plurality of threads has determined that the release of the architectural resources is to be overridden.

18. The multi-threaded system of claim 17, wherein a second thread included in the first thread array generates the child processing task during execution of the parent processing task and determines that the release of the architectural resources is to be overridden when the first thread has exited.

19. The multi-threaded system of claim 17, wherein the work distribution unit is further configured to initialize a reference counter that limits a maximum quantity of architectural resources in a pool that are
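
The parent, child, and continuation lifecycle recited in claims 1 through 6 can be traced with a small Python sketch. It is a simplified, sequential model under stated assumptions, not the claimed implementation: `Pool` and `run_parent` are hypothetical names, the resource amounts are arbitrary, and real thread arrays would execute concurrently on hardware rather than as ordinary function calls.

```python
from typing import List


class Pool:
    """Counts the free units of one resource pool (hypothetical model)."""

    def __init__(self, name: str, free: int) -> None:
        self.name = name
        self.free = free

    def allocate(self, amount: int) -> None:
        if amount > self.free:
            raise RuntimeError(f"{self.name}: not enough resources")
        self.free -= amount

    def release(self, amount: int) -> None:
        self.free += amount


def run_parent(parent_pool: Pool, child_pool: Pool, spawns_child: bool) -> List[str]:
    """Trace one parent thread array through exit, child, and continuation."""
    log = []
    parent_pool.allocate(2)
    log.append("parent: allocated")

    # A thread that generates a child task marks the release as overridden.
    override = spawns_child
    if not override:
        parent_pool.release(2)
        log.append("parent: exited, released")
        return log

    log.append("parent: exited, retained")

    # The child draws from a separate pool (as in claim 3).
    child_pool.allocate(1)
    log.append("child: ran")
    child_pool.release(1)

    # The continuation starts after the child finishes (claim 6), reuses the
    # retained resources (claim 5), and finally releases them (claim 4).
    log.append("continuation: ran")
    parent_pool.release(2)
    log.append("continuation: released")
    return log


parent_pool, child_pool = Pool("parent", 2), Pool("child", 1)
trace = run_parent(parent_pool, child_pool, spawns_child=True)
print(trace)
print(parent_pool.free, child_pool.free)  # 2 1
```

Calling `run_parent` with `spawns_child=False` follows the other branch of claim 1: the resources are released as soon as the parent exits.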

Classifications

  • G06F9/5022 (Primary)

    Mechanisms to release resources (CPC title)

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Philip Alexander Cuadra, Karim M. Abdalla, Jerome F. Duluk Jr., and 5 more
What technology area does this patent fall under?
Primary CPC classification G06F9/5022. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 29, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).