Unified architecture for BVH construction based on hardware pre-sorting and a parallel, reconfigurable clustering array

US10832371B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10832371-B2
Application numberUS-201816236305-A
CountryUS
Kind codeB2
Filing dateDec 28, 2018
Priority dateDec 28, 2018
Publication dateNov 10, 2020
Grant dateNov 10, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus comprising a sorting unit to sort primitives of a graphics image, the primitives to be grouped, each group to form a first level node of a hierarchical acceleration structure; a parallel reconfigurable clustering array to construct the hierarchical acceleration structure, the parallel reconfigurable clustering array comprising a plurality of processing clusters, each cluster comprising: parallel efficiency analysis circuitry to evaluate different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings; and node merge circuitry to merge the first level nodes based on the efficiency values to form second level nodes.

First claim

Opening claim text (preview).

What is claimed is: 1. An apparatus comprising: sorting circuitry to sort primitives of a graphics image, the primitives to be grouped, each group to form a first level node of a hierarchical acceleration structure; a parallel reconfigurable clustering array to construct the hierarchical acceleration structure, the parallel reconfigurable clustering array comprising a plurality of processing clusters, each cluster comprising: parallel efficiency analysis circuitry to evaluate, based on a cost or distance function, different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings, the cost or distance function comprising a surface area heuristic (SAH) and/or a Euclidean distance function, wherein the SAH function is used to calculate an expected cost of tracing a non-terminating random ray through the graphic image under a particular of grouping of the first level nodes; and node merge circuitry to merge a subset of the first level nodes based on the efficiency values of the different groupings to form second level nodes. 2. The apparatus of claim 1 further comprising: node injection circuitry to inject the second level nodes back to the parallel reconfigurable clustering array, wherein the parallel efficiency analysis circuitry is to evaluate different groupings of the second level nodes to determine efficiency values for the different groupings, and wherein the node merge circuitry is to merge the second level nodes based on the efficiency values to form third level nodes for the next level of the hierarchical acceleration structure. 3. The apparatus of claim 2 wherein the sorting circuitry further comprises: splitting circuitry to split the primitives based on spatial locality in the graphics image; and leaf node generation circuitry to form the first level nodes based on the splitting. 4. The apparatus of claim 1 further comprising: node injection circuitry to store the first level nodes to a memory for access by the parallel reconfigurable clustering array; and a plurality of fetch units, each fetch units associated with one of the plurality of parallel processing clusters, wherein each fetch unit is to fetch one or more of the first level nodes to be processed by its cluster's parallel efficiency analysis circuitry and the node merge circuitry. 5. The apparatus of claim 4 , wherein each fetch unit is further to determine an order in which the fetched one or more first level nodes are submitted to its cluster's parallel efficiency analysis circuitry to be processed. 6. The apparatus of claim 1 wherein the hierarchical acceleration structure comprises a bounding volume hierarchy (BVH). 7. The apparatus of claim 1 wherein the processing clusters of the parallel reconfigurable clustering array each comprise: one or more reconfigurable cells, a reconfigurable cell including a plurality of reconfigurable functional units and a reconfigurable communication fabric interconnecting the reconfigurable functional units. 8. The apparatus of claim 7 wherein a reconfigurable cell further comprises: a scratchpad memory to store data related to the first level nodes and/or second level nodes; and an arithmetic logic unit (ALU) coupled to the scratchpad memory over the reconfigurable fabric. 9. The apparatus of claim 8 , wherein the data stored in the scratchpad memory comprises a best efficiency value evaluated so far and/or a grouping of the first level nodes associated with the best efficiency value. 10. The apparatus of claim 8 , wherein the parallel efficiency analysis circuitry and the node merge circuitry are each implemented via a respective reconfigurable cell, the node merge circuitry to include a larger scratchpad memory and more comparison operators than the parallel efficiency analysis circuitry. 11. A method comprising: sorting primitives of a graphics image; grouping the primitives to form first level nodes of a hierarchical acceleration structure; evaluating, based on a cost or distance function, different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings, the cost or distance function comprising a surface area heuristic (SAH) and/or a Euclidean distance function, wherein the SAH function is used to calculate an expected cost of tracing a non-terminating random ray through the graphic image under a particular of grouping of the first level nodes; and merging a subset of the first level nodes based on the efficiency values of different groupings to form second level nodes of the hierarchical acceleration structure, wherein the operations of evaluating and merging of the first level nodes are performed in parallel on a plurality of parallel processing clusters of a parallel reconfigurable clustering array. 12. The method of claim 11 further comprising: evaluating different groupings of the second level nodes to determine efficiency values for the different groupings, and merging the second level nodes based on the efficiency values to form third level nodes for the next level of the hierarchical acceleration structure, wherein the operations of evaluating and merging of the second level nodes are performed in parallel on the plurality of parallel processing clusters. 13. The method of claim 12 wherein sorting further comprises splitting the primitives based on spatial locality in the graphics image; and wherein the grouping the primitives is performed based on the splitting. 14. The method of claim 11 further comprising: storing the first level nodes to a memory for access by the parallel reconfigurable clustering array; and fetching by a fetch unit of each parallel processing cluster one or more of the first level nodes to be processed to form the second level nodes. 15. The method of claim 11 wherein the hierarchical acceleration structure comprises a bounding volume hierarchy (BVH). 16. The method of claim 11 wherein the processing clusters of the parallel reconfigurable clustering array each comprise: one or more reconfigurable cells, a reconfigurable cell including a plurality of reconfigurable functional units and a reconfigurable communication fabric interconnecting the reconfigurable functional units. 17. The method of claim 16 wherein a reconfigurable cell further comprises: a scratchpad memory to store data related to the first level nodes and/or second level nodes; and an arithmetic logic unit (ALU) coupled to the scratchpad memory over the reconfigurable fabric. 18. A non-transitory machine-readable medium having program code stored thereon which, when executed by a machine, causes the machine to perform the operations of: sorting primitives of a graphics image; grouping the primitives to form first level nodes of a hierarchical acceleration structure; evaluating, based on a cost or distance function, different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings, the cost or distance function comprising a surface area heuristic (SAH) and/or a Euclidean distance function, wherein the SAH function is used to calculate an expected cost of tracing a non-terminating random ray through the graphic image under a particular of grouping of the first level nodes; and merging a subset of the first level nodes based on the efficiency values of the different groupings to form second level nodes of the hie

Assignees

Inventors

Classifications

  • General purpose rendering architectures · CPC title

  • Trees · CPC title

  • using a secondary processor, e.g. coprocessor (peripheral processor G06F13/12) · CPC title

  • G06T15/06Primary

    Ray-tracing · CPC title

  • G06T1/20Primary

    Processor architectures; Processor configuration, e.g. pipelining · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10832371B2 cover?
An apparatus comprising a sorting unit to sort primitives of a graphics image, the primitives to be grouped, each group to form a first level node of a hierarchical acceleration structure; a parallel reconfigurable clustering array to construct the hierarchical acceleration structure, the parallel reconfigurable clustering array comprising a plurality of processing clusters, each cluster compri…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T15/06. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Nov 10 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).