Agglomerative treelet restructuring for bounding volume hierarchies
US-9817919-B2 · Nov 14, 2017 · US
US10832371B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10832371-B2 |
| Application number | US-201816236305-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 28, 2018 |
| Priority date | Dec 28, 2018 |
| Publication date | Nov 10, 2020 |
| Grant date | Nov 10, 2020 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
An apparatus comprising a sorting unit to sort primitives of a graphics image, the primitives to be grouped, each group to form a first level node of a hierarchical acceleration structure; a parallel reconfigurable clustering array to construct the hierarchical acceleration structure, the parallel reconfigurable clustering array comprising a plurality of processing clusters, each cluster comprising: parallel efficiency analysis circuitry to evaluate different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings; and node merge circuitry to merge the first level nodes based on the efficiency values to form second level nodes.
Opening claim text (preview).
What is claimed is: 1. An apparatus comprising: sorting circuitry to sort primitives of a graphics image, the primitives to be grouped, each group to form a first level node of a hierarchical acceleration structure; a parallel reconfigurable clustering array to construct the hierarchical acceleration structure, the parallel reconfigurable clustering array comprising a plurality of processing clusters, each cluster comprising: parallel efficiency analysis circuitry to evaluate, based on a cost or distance function, different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings, the cost or distance function comprising a surface area heuristic (SAH) and/or a Euclidean distance function, wherein the SAH function is used to calculate an expected cost of tracing a non-terminating random ray through the graphic image under a particular of grouping of the first level nodes; and node merge circuitry to merge a subset of the first level nodes based on the efficiency values of the different groupings to form second level nodes. 2. The apparatus of claim 1 further comprising: node injection circuitry to inject the second level nodes back to the parallel reconfigurable clustering array, wherein the parallel efficiency analysis circuitry is to evaluate different groupings of the second level nodes to determine efficiency values for the different groupings, and wherein the node merge circuitry is to merge the second level nodes based on the efficiency values to form third level nodes for the next level of the hierarchical acceleration structure. 3. The apparatus of claim 2 wherein the sorting circuitry further comprises: splitting circuitry to split the primitives based on spatial locality in the graphics image; and leaf node generation circuitry to form the first level nodes based on the splitting. 4. The apparatus of claim 1 further comprising: node injection circuitry to store the first level nodes to a memory for access by the parallel reconfigurable clustering array; and a plurality of fetch units, each fetch units associated with one of the plurality of parallel processing clusters, wherein each fetch unit is to fetch one or more of the first level nodes to be processed by its cluster's parallel efficiency analysis circuitry and the node merge circuitry. 5. The apparatus of claim 4 , wherein each fetch unit is further to determine an order in which the fetched one or more first level nodes are submitted to its cluster's parallel efficiency analysis circuitry to be processed. 6. The apparatus of claim 1 wherein the hierarchical acceleration structure comprises a bounding volume hierarchy (BVH). 7. The apparatus of claim 1 wherein the processing clusters of the parallel reconfigurable clustering array each comprise: one or more reconfigurable cells, a reconfigurable cell including a plurality of reconfigurable functional units and a reconfigurable communication fabric interconnecting the reconfigurable functional units. 8. The apparatus of claim 7 wherein a reconfigurable cell further comprises: a scratchpad memory to store data related to the first level nodes and/or second level nodes; and an arithmetic logic unit (ALU) coupled to the scratchpad memory over the reconfigurable fabric. 9. The apparatus of claim 8 , wherein the data stored in the scratchpad memory comprises a best efficiency value evaluated so far and/or a grouping of the first level nodes associated with the best efficiency value. 10. The apparatus of claim 8 , wherein the parallel efficiency analysis circuitry and the node merge circuitry are each implemented via a respective reconfigurable cell, the node merge circuitry to include a larger scratchpad memory and more comparison operators than the parallel efficiency analysis circuitry. 11. A method comprising: sorting primitives of a graphics image; grouping the primitives to form first level nodes of a hierarchical acceleration structure; evaluating, based on a cost or distance function, different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings, the cost or distance function comprising a surface area heuristic (SAH) and/or a Euclidean distance function, wherein the SAH function is used to calculate an expected cost of tracing a non-terminating random ray through the graphic image under a particular of grouping of the first level nodes; and merging a subset of the first level nodes based on the efficiency values of different groupings to form second level nodes of the hierarchical acceleration structure, wherein the operations of evaluating and merging of the first level nodes are performed in parallel on a plurality of parallel processing clusters of a parallel reconfigurable clustering array. 12. The method of claim 11 further comprising: evaluating different groupings of the second level nodes to determine efficiency values for the different groupings, and merging the second level nodes based on the efficiency values to form third level nodes for the next level of the hierarchical acceleration structure, wherein the operations of evaluating and merging of the second level nodes are performed in parallel on the plurality of parallel processing clusters. 13. The method of claim 12 wherein sorting further comprises splitting the primitives based on spatial locality in the graphics image; and wherein the grouping the primitives is performed based on the splitting. 14. The method of claim 11 further comprising: storing the first level nodes to a memory for access by the parallel reconfigurable clustering array; and fetching by a fetch unit of each parallel processing cluster one or more of the first level nodes to be processed to form the second level nodes. 15. The method of claim 11 wherein the hierarchical acceleration structure comprises a bounding volume hierarchy (BVH). 16. The method of claim 11 wherein the processing clusters of the parallel reconfigurable clustering array each comprise: one or more reconfigurable cells, a reconfigurable cell including a plurality of reconfigurable functional units and a reconfigurable communication fabric interconnecting the reconfigurable functional units. 17. The method of claim 16 wherein a reconfigurable cell further comprises: a scratchpad memory to store data related to the first level nodes and/or second level nodes; and an arithmetic logic unit (ALU) coupled to the scratchpad memory over the reconfigurable fabric. 18. A non-transitory machine-readable medium having program code stored thereon which, when executed by a machine, causes the machine to perform the operations of: sorting primitives of a graphics image; grouping the primitives to form first level nodes of a hierarchical acceleration structure; evaluating, based on a cost or distance function, different groupings of the first level nodes for a next level of the hierarchical acceleration structure to determine efficiency values for the different groupings, the cost or distance function comprising a surface area heuristic (SAH) and/or a Euclidean distance function, wherein the SAH function is used to calculate an expected cost of tracing a non-terminating random ray through the graphic image under a particular of grouping of the first level nodes; and merging a subset of the first level nodes based on the efficiency values of the different groupings to form second level nodes of the hie
Related publications grouped by family.
Answers are generated from the same data shown on this page.