Programmable ray tracing with hardware acceleration on a graphics processor

US10957095B2 · US · B2

Patent metadata

  • Publication number: US-10957095-B2
  • Application number: US-201816056222-A
  • Country: US
  • Kind code: B2
  • Filing date: Aug 6, 2018
  • Priority date: Aug 6, 2018
  • Publication date: Mar 23, 2021
  • Grant date: Mar 23, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Apparatus and method for programmable ray tracing with hardware acceleration on a graphics processor. For example, one embodiment of a graphics processor comprises shader execution circuitry to execute a plurality of programmable ray tracing shaders. The shader execution circuitry includes a plurality of single instruction multiple data (SIMD) execution units. Sorting circuitry regroups data associated with one or more of the programmable ray tracing shaders to increase occupancy for SIMD operations performed by the SIMD execution units; and fixed-function intersection circuitry coupled to the shader execution circuitry detects intersections between rays and bounding volume hierarchies (BVHs) and/or objects contained therein and to provide results indicating the intersections to the sorting circuitry.
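The regrouping idea in the abstract can be illustrated in software. The following is a minimal sketch, not the patented circuitry: it assumes each shading task is a `(shader_id, ray_id)` pair and that a SIMD batch can only run one shader at a time. The names `SIMD_WIDTH`, `batch_in_arrival_order`, `regroup_by_shader`, and `occupancy` are hypothetical and do not come from the patent.

```python
from collections import defaultdict

SIMD_WIDTH = 4  # hypothetical lane count; the abstract does not give one


def batch_in_arrival_order(tasks, simd_width=SIMD_WIDTH):
    """Baseline: start a new batch whenever the shader changes or the batch is full."""
    batches = []
    for shader_id, ray_id in tasks:
        if batches and batches[-1][0] == shader_id and len(batches[-1][1]) < simd_width:
            batches[-1][1].append(ray_id)
        else:
            batches.append((shader_id, [ray_id]))
    return batches


def regroup_by_shader(tasks, simd_width=SIMD_WIDTH):
    """Software stand-in for the sorting step: gather tasks that share a shader."""
    buckets = defaultdict(list)
    for shader_id, ray_id in tasks:
        buckets[shader_id].append(ray_id)
    batches = []
    for shader_id, ray_ids in buckets.items():
        # Split each bucket into chunks of at most simd_width tasks.
        for start in range(0, len(ray_ids), simd_width):
            batches.append((shader_id, ray_ids[start:start + simd_width]))
    return batches


def occupancy(batches, simd_width=SIMD_WIDTH):
    """Fraction of SIMD lanes that carry a task, averaged over all batches."""
    if not batches:
        return 0.0
    used = sum(len(ray_ids) for _, ray_ids in batches)
    return used / (len(batches) * simd_width)


# Four rays that alternate between two hit shaders.
tasks = [("hit_a", 0), ("hit_b", 1), ("hit_a", 2), ("hit_b", 3)]
print(occupancy(batch_in_arrival_order(tasks)))  # 0.25
print(occupancy(regroup_by_shader(tasks)))       # 0.5
```

In this toy case, regrouping halves the number of batches, so the same work fills twice as many lanes per batch. Real hardware would do this incrementally as intersection results arrive rather than over a complete task list.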

First claim

Opening claim text (preview).

What is claimed is:

1. A graphics processing apparatus comprising: shader execution circuitry to execute a plurality of programmable shaders, the shader execution circuitry including a plurality of single instruction multiple data (SIMD) execution units; fixed function circuitry comprising: sorting circuitry to regroup data associated with one or more of the plurality of programmable shaders to increase occupancy for SIMD operations performed by the plurality of SIMD execution units; and fixed function intersection circuitry to detect intersections between rays and bounding volume hierarchies (BVHs), and to provide results indicating the intersections to the sorting circuitry, wherein the BVHs include top level and bottom level BVHs, a leaf node of a top level BVH comprises one or more traversal shader records, and wherein traversal inside of each BVH is performed using the fixed function intersection circuitry and traversal between the top and bottom level BVHs are implemented in the plurality of programmable shaders to be executed by the plurality of SIMD execution units, wherein a stack for traversal inside of each BVH is stored in the fixed function circuitry, and wherein the stack for the traversal inside of each BVH is truncated prior to the traversal between the top and bottom level BVHs.

2. The graphics processing apparatus of claim 1 wherein the plurality of programmable shaders comprise programmable ray tracing shaders.

3. The graphics processing apparatus of claim 2 wherein the programmable ray tracing shaders include one or more of: a primary shader associated with at least one primary ray, a hit shader to perform bidirectional reflectance distribution function (BRDF) sampling and to launch one or more secondary rays, a traversal shader to traverse one or more rays through the BVHs, and an intersection shader to determine intersections between one or more rays and objects in a scene.

4. The graphics processing apparatus of claim 1 wherein the sorting circuitry further comprises: a content addressable memory to store a plurality of entries, each entry identified by a particular shader record pointer.

5. The graphics processing apparatus of claim 4 further comprising: group dispatch circuitry to dispatch a set of shading tasks grouped in an entry to be simultaneously executed on the plurality of SIMD execution units.

6. The graphics processing apparatus of claim 5 wherein each shading task in the entry is to be mapped to a specific SIMD lane for execution by the plurality of SIMD execution units.

7. The graphics processing apparatus of claim 1 wherein the sorting circuitry further comprises: grouping circuitry to group shading tasks having common shader record pointers within each entry to create SIMD batches for execution on the plurality of SIMD execution units.

8. The graphics processing apparatus of claim 1 wherein each executing instance of a shader is associated with a call stack comprising a storage for arguments passed between a parent shader and one or more child shaders.

9. The graphics processing apparatus of claim 1 further comprising: messaging circuitry coupled to the shader execution circuitry, sorting circuitry, and fixed-function intersection circuitry, the messaging circuitry to send ray-BVH intersection operations spawned from the shader execution circuitry to the fixed-function intersection circuitry and to send callable shaders to the sorting circuitry.

10. A method comprising: executing a plurality of programmable shaders on a plurality of single instruction multiple data (SIMD) execution units; regrouping, by fixed function circuitry, data and program code associated with one or more of the plurality of programmable shaders to increase occupancy for SIMD operations performed by the plurality of SIMD execution units; and detecting intersections between rays and bounding volume hierarchies (BVHs), by the fixed function circuitry, and providing results indicating the intersections for the regrouping, wherein the BVHs include top level and bottom level BVHs, wherein a leaf node of a top level BVH comprises one or more traversal shader records, and wherein traversal inside of each BVH is performed using the fixed function circuitry and traversal between the top and bottom level BVHs are implemented in the plurality of programmable shaders to be executed by the plurality of SIMD execution units, wherein a stack for traversal inside of each BVH is stored in the fixed function circuitry, and wherein the stack for the traversal inside of each BVH is truncated prior to the traversal between the top and bottom level BVHs.

11. The method of claim 10 wherein the plurality of programmable shaders comprise programmable ray tracing shaders.

12. The method of claim 11 wherein the programmable ray tracing shaders include one or more of: a primary shader associated with at least one primary ray, a hit shader to perform bidirectional reflectance distribution function (BRDF) sampling and to launch one or more secondary rays, a traversal shader to traverse one or more rays through the BVHs, and an intersection shader to determine intersections between one or more rays and objects in a scene.

13. The method of claim 10 wherein regrouping comprises sorting the data and program code within a content addressable memory having a plurality of entries, each entry identified by a particular shader record pointer.

14. The method of claim 10 wherein regrouping data and program code further comprises grouping shader tasks having common shader record pointers within each entry to create SIMD batches for execution on the plurality of SIMD execution units.

15. The method of claim 14 further comprising: dispatching a set of shading tasks grouped in an entry to be simultaneously executed on the plurality of SIMD execution units.

16. The method of claim 15 wherein each shading task in the entry is to be mapped to a specific SIMD lane for execution by the plurality of SIMD execution units.

17. The method of claim 10 wherein each executing instance of a shader is associated with a call stack comprising a storage for arguments passed between a parent shader and one or more child shaders.

18. The method of claim 10 further comprising: sending ray-BVH intersection operations spawned from the plurality of SIMD execution units to the fixed function circuitry.

19. A non-transitory machine-readable medium having program code stored thereon which, when executed by a machine, causes the machine to perform the operations of: executing a plurality of programmable shaders on a plurality of single instruction multiple data (SIMD) execution units; regrouping, by fixed function circuitry, data and program code associated with one or more of the plurality of programmable shaders to increase occupancy for SIMD operations performed by the plurality of SIMD execution units; and detecting intersections between rays and bounding volume hierarchies (BVHs), by the fixed function circuitry, and providing results indicating the intersections for the regrouping, wherein the BVHs include top level and bottom level BVHs, wherein a leaf node of a top level BVH comprises one or more traversal shader records, and wherein traversal inside of each BVH is performed using the fixed function circuitry and traversal between the top and bottom level BVHs are implemented in the plurality of programmable shaders to be executed by the plurality of SIMD execution units, wherein a stack for traversal inside of each BVH is stored in the fixed function circuitry,
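The split that the independent claims describe, with traversal inside each BVH handled by one mechanism and the hop from a top level BVH to a bottom level BVH handled by programmable code, can be pictured with a small software model. This is a rough sketch under stated assumptions rather than the claimed hardware: `Node`, `hits_box`, `traverse_inside`, and `trace` are hypothetical names, a traversal shader record is modeled as a plain Python callable that returns a bottom level root, leaf results are box overlaps rather than exact primitive intersections, and the stack truncation step in the claims is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                                   # minimum corner of the bounding box
    hi: Vec3                                   # maximum corner of the bounding box
    children: List["Node"] = field(default_factory=list)
    payload: object = None                     # meaningful on leaf nodes only


def hits_box(origin, direction, lo, hi):
    """Slab test: does the ray origin + t * direction (t >= 0) touch the box?"""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            # Ray is parallel to this slab; it must start between the planes.
            if o < a or o > b:
                return False
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def traverse_inside(root, origin, direction):
    """Stand-in for traversal inside one BVH, driven by an explicit stack."""
    stack, leaves = [root], []
    while stack:
        node = stack.pop()
        if not hits_box(origin, direction, node.lo, node.hi):
            continue
        if node.children:
            stack.extend(node.children)
        else:
            leaves.append(node.payload)
    return leaves


def trace(top_root, origin, direction):
    """Walk the top level BVH, then let each leaf's record pick a bottom level BVH."""
    hits = []
    for record in traverse_inside(top_root, origin, direction):
        bottom_root = record(origin, direction)  # programmable step between levels
        if bottom_root is not None:
            hits.extend(traverse_inside(bottom_root, origin, direction))
    return hits
```

A usage example: build a bottom level tree with two leaf boxes, wrap it in a top level leaf whose record always selects that tree, and call `trace(top, (0.5, 0.5, -1.0), (0.0, 0.0, 1.0))`. Only the leaf whose box lies along the ray is reported. Keeping the level transition in a callable is what would let application code choose, for example, a different bottom level tree per ray.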

Assignees

Intel Corp

Inventors

Classifications

  • G06T15/005 (Primary)

    General purpose rendering architectures · CPC title

  • Memory management · CPC title

  • Ray-tracing · CPC title

  • Graphics controllers · CPC title

  • Collision detection, intersection · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10957095B2 cover?
Apparatus and method for programmable ray tracing with hardware acceleration on a graphics processor. For example, one embodiment of a graphics processor comprises shader execution circuitry to execute a plurality of programmable ray tracing shaders. The shader execution circuitry includes a plurality of single instruction multiple data (SIMD) execution units. Sorting circuitry regroups data as…

Who is the assignee on this patent?
Intel Corp

What technology area does this patent fall under?
Primary CPC classification G06T15/005. Mapped technology areas include Physics.

When was this patent published?
Publication date Mar 23, 2021 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).