Multiple-thread processing methods and apparatuses

US10296315B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10296315-B2
Application number  US-201514816265-A
Country             US
Kind code           B2
Filing date         Aug 3, 2015
Priority date       Dec 12, 2014
Publication date    May 21, 2019
Grant date          May 21, 2019

Abstract

Official abstract text for this publication.

Multiple-thread processing apparatuses and methods are provided. The multiple-thread processing method may include searching for loops in a plurality of threads, calculating a number of repetitions of each of found loops in respective threads among the plurality of threads, determining one or more threads based on the calculated number of repetitions of each of the found loops, dividing at least one of the one or more determined threads into child threads, and processing the child threads separately from one another in the plurality of threads.
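The selection step described in the abstract can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the function names, the `trip_counts` dictionary (thread id mapped to loop repetition count), and the `top_k` parameter are all hypothetical, and the repetition counts are supplied as plain data rather than found by analyzing shader code.

```python
from typing import Dict, List


def rank_threads_by_trip_count(trip_counts: Dict[int, int]) -> List[int]:
    """Return thread ids ordered by loop repetition count, largest first.

    Ties are broken by ascending thread id (an assumption made here so
    that the ordering is deterministic).
    """
    return sorted(trip_counts, key=lambda tid: (-trip_counts[tid], tid))


def is_non_uniform(trip_counts: Dict[int, int]) -> bool:
    """True when the threads do not all run the loop the same number of times."""
    return len(set(trip_counts.values())) > 1


def select_threads_to_split(trip_counts: Dict[int, int], top_k: int = 1) -> List[int]:
    """Pick the top_k highest-ranked threads as candidates for splitting.

    Nothing is selected when every thread has the same repetition count,
    because there is no imbalance to correct in that case.
    """
    if not is_non_uniform(trip_counts):
        return []
    return rank_threads_by_trip_count(trip_counts)[:top_k]


# Hypothetical per-thread repetition counts for one loop.
counts = {0: 4, 1: 64, 2: 8, 3: 16}
ranking = rank_threads_by_trip_count(counts)
selected = select_threads_to_split(counts, top_k=2)
```

With `top_k=1` only the thread with the most repetitions is chosen; `top_k=2` also picks the second-ranked thread.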

First claim

Opening claim text (preview).

What is claimed is:

1. A multiple-thread processing method comprising:

    executing respective code portions for a computer program, each of the respective code portions corresponding to graphic shader code that enables graphics processing;

    during execution of the respective code portions associated with the graphics processing, searching for loops by finding the loops in the respective code portions of respective threads among a plurality of threads;

    for each respective found loop executed by the respective code portions, determining whether parallel reduction is possible, and calculating a number of repetitions of the respective found loop for which parallel reduction is possible in each of the respective threads;

    determining rankings of the respective threads among the plurality of threads in descending order, according to the calculated number of repetitions of the respective found loop in each of the respective threads;

    selecting one or more threads based on the determined rankings for the respective threads, wherein the selected one or more threads includes at least a top-ranked thread having a largest number of repetitions of the respective found loop among the respective threads;

    for each selected thread of the selected one or more threads, determining whether the calculated number of repetitions of the respective found loop is non-uniform between the respective threads, and dividing the respective code portions of the selected thread into child threads respectively allocated among the plurality of threads, in response to determining that the respective found loop of the selected thread has the non-uniform number of repetitions between the respective threads;

    processing the child threads in parallel, each child thread executing separately from one another in each of the plurality of threads to which the child threads are respectively allocated;

    generating values in parallel for the child threads, one value for each of the separately executed child threads;

    upon completing processing of the plurality of threads including the child threads and generating the values in parallel for the child threads, performing the determined parallel reduction on the processed threads by merging the values of each of the separately executed child threads to generate a final reduction result value;

    outputting the final reduction result value; and

    executing the graphic shader code, utilizing the final reduction result value as input for processing a pixel associated with the graphics processing.

2. The multiple-thread processing method of claim 1, wherein the selected one or more threads further includes a second-ranked thread having a second largest number of repetitions of the respective found loop among the respective threads.

3. The multiple-thread processing method of claim 1, wherein outputting the final reduction result value includes displaying the final reduction result value.

4. A multiple-thread processing apparatus comprising:

    a memory storing computer-readable instructions; and

    at least one processor configured to execute the computer-readable instructions to,

    execute respective code portions for a computer program, each of the respective code portions corresponding to graphic shader code that enables graphics processing;

    during execution of the respective code portions associated with the graphics processing, search for loops by finding loops in respective threads among a plurality of threads;

    for each respective found loop executed by the respective code portions, determine whether parallel reduction is possible, and calculate a number of repetitions of the respective found loop for which parallel reduction is possible in each of the respective threads;

    determine rankings of the respective threads among the plurality of threads in descending order, according to the calculated number of repetitions of the respective found loop in each of the respective threads;

    select one or more threads based on the determined rankings for the respective threads, wherein the selected one or more threads includes at least a top-ranked thread having a largest number of repetitions of the respective found loop among the respective threads;

    for each selected thread of the selected one or more threads, determine whether the calculated number of repetitions of the respective found loop is non-uniform between the respective threads, and divide the respective code portions of the selected thread into child threads respectively allocated among the plurality of threads, in response to determining that the respective found loop of the selected thread has the non-uniform number of repetitions between the respective threads;

    process the child threads in parallel, each child thread executing separately from one another in each of the plurality of threads to which the child threads are respectively allocated;

    generate values in parallel for the child threads, one value for each of the separately executed child threads;

    upon completing processing of the plurality of threads including the child threads and generating the values in parallel for the child threads, perform the determined parallel reduction on the processed threads by merging the values of each of the separately executed child threads to generate a final reduction result value;

    output the final reduction result value; and

    execute the graphic shader code, utilizing the final reduction result value as input for processing a pixel associated with the graphics processing.

5. The multiple-thread processing apparatus of claim 4, wherein the selected one or more threads further includes a second-ranked thread having a second largest number of repetitions of the respective found loop among the respective threads.
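The divide-and-merge portion of the claims (split a long loop into child threads, produce one value per child, then merge those values into a final reduction result) might look something like the sketch below. Several things here are assumptions rather than anything the claims specify: addition is used as the reduction operator, the loop is modeled as a sum over a list, the helper names are hypothetical, and Python's `ThreadPoolExecutor` stands in for GPU threads purely to show the structure (CPython threads do not give true parallel speedup for this kind of work).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple


def split_iterations(total: int, num_children: int) -> List[Tuple[int, int]]:
    """Split range(total) into num_children contiguous [start, stop) chunks.

    Any remainder is spread over the first chunks so sizes differ by at most 1.
    """
    base, extra = divmod(total, num_children)
    chunks, start = [], 0
    for i in range(num_children):
        stop = start + base + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def child_partial_sum(data: Sequence[float], start: int, stop: int) -> float:
    """Work of one child thread: run its slice of the original loop."""
    acc = 0.0
    for i in range(start, stop):
        acc += data[i]
    return acc


def split_and_reduce(data: Sequence[float], num_children: int) -> float:
    """Run the loop as num_children child tasks, then merge their values."""
    chunks = split_iterations(len(data), num_children)
    with ThreadPoolExecutor(max_workers=num_children) as pool:
        # One partial value per child, generated independently of the others.
        partials = list(pool.map(lambda c: child_partial_sum(data, c[0], c[1]), chunks))
    # Merging step: combine the per-child values into the final result.
    return sum(partials)


chunks = split_iterations(10, 4)
total = split_and_reduce(list(range(1, 101)), num_children=4)
```

Splitting is only valid because addition is associative, which is the kind of property the "determining whether parallel reduction is possible" step would need to establish before a loop is divided.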

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • Dependency analysis; Data or control flow analysis · CPC title

  • G06F8/452 (Primary)

    Loops · CPC title

  • Reducing the execution time required by the program code · CPC title

  • Optimisation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10296315B2 cover?
Multiple-thread processing apparatuses and methods are provided. The multiple-thread processing method may include searching for loops in a plurality of threads, calculating a number of repetitions of each of found loops in respective threads among the plurality of threads, determining one or more threads based on the calculated number of repetitions of each of the found loops, dividing at leas…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F8/452. Mapped technology areas include Physics.
When was this patent published?
Publication date May 21, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).