Distributed address translation in a multi-node interconnect fabric
US-2020159669-A1 · May 21, 2020 · US
US12229870B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12229870-B2 |
| Application number | US-202217982766-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 8, 2022 |
| Priority date | Dec 28, 2018 |
| Publication date | Feb 18, 2025 |
| Grant date | Feb 18, 2025 |
Official abstract text for this publication.
Apparatus and method for acceleration data structure refit. For example, one embodiment of an apparatus comprises: a ray generator to generate a plurality of rays in a first graphics scene; a hierarchical acceleration data structure generator to construct an acceleration data structure comprising a plurality of hierarchically arranged nodes including inner nodes and leaf nodes stored in a memory in a depth-first search (DFS) order; traversal hardware logic to traverse one or more of the rays through the acceleration data structure; intersection hardware logic to determine intersections between the one or more rays and one or more primitives within the hierarchical acceleration data structure; a node refit unit comprising circuitry and/or logic to read consecutively through at least the inner nodes in the memory in reverse DFS order to perform a bottom-up refit operation on the hierarchical acceleration data structure.
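The reverse-DFS refit described in the abstract can be illustrated with a minimal Python sketch. It assumes the nodes sit in one array in depth-first pre-order, so every child has a higher index than its parent and a single backward pass visits children before their parents. The `Node` layout, the axis-aligned box representation, and the names `merge`, `bounds_of`, and `refit_reverse_dfs` are hypothetical choices for illustration, not the patent's hardware design.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
AABB = Tuple[Vec3, Vec3]  # (min corner, max corner); assumed box format


def merge(a: AABB, b: AABB) -> AABB:
    """Smallest axis-aligned box enclosing both a and b."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return (lo, hi)


def bounds_of(points: List[Vec3]) -> AABB:
    """Axis-aligned box enclosing a list of vertices."""
    lo = tuple(min(p[i] for p in points) for i in range(3))
    hi = tuple(max(p[i] for p in points) for i in range(3))
    return (lo, hi)


@dataclass
class Node:
    box: AABB
    children: List[int] = field(default_factory=list)  # node indices; empty for leaves
    prims: List[int] = field(default_factory=list)     # primitive ids; empty for inner nodes


def refit_reverse_dfs(nodes: List[Node], prim_points: Dict[int, List[Vec3]]) -> None:
    """Bottom-up refit: read the DFS-ordered array from last node to first."""
    for i in range(len(nodes) - 1, -1, -1):
        node = nodes[i]
        if node.children:
            # Inner node: children were already refit because they have higher indices.
            box = nodes[node.children[0]].box
            for c in node.children[1:]:
                box = merge(box, nodes[c].box)
            node.box = box
        else:
            # Leaf node: recompute the box from the current primitive vertices.
            pts = [p for pid in node.prims for p in prim_points[pid]]
            node.box = bounds_of(pts)


# Tiny scene: three triangles given by their (already moved) vertices.
prim_points = {
    0: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    1: [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (2.0, 1.0, 0.0)],
    2: [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0), (5.0, 6.0, 5.0)],
}
stale = ((0.0, 0.0, 0.0), (0.0, 0.0, 0.0))  # placeholder for out-of-date boxes
nodes = [
    Node(stale, children=[1, 4]),  # 0: root
    Node(stale, children=[2, 3]),  # 1: inner
    Node(stale, prims=[0]),        # 2: leaf
    Node(stale, prims=[1]),        # 3: leaf
    Node(stale, prims=[2]),        # 4: leaf
]
refit_reverse_dfs(nodes, prim_points)
```

Because the pass reads memory consecutively from the end of the array, it needs no parent pointers or recursion, which is one plausible reason to store the nodes in DFS order.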
Opening claim text (preview).
What is claimed is:
1. A processor comprising: a central processing unit (CPU) comprising a first plurality cores; a graphics processor comprising a second plurality of cores to execute program code to render images; a memory controller to couple the first and second plurality of cores to a system memory device, the system memory device to be accessible by both of the first plurality of cores of the CPU and the second plurality of cores of the graphics processor using a same virtual address space; and execution circuitry associated with at least one core of the second plurality of cores to execute at least a portion of the graphics program code to perform: constructing an acceleration data structure based on a plurality of primitives located within a three dimensional (3D) space, the acceleration data structure comprising nodes arranged in a hierarchy, each node associated with a bounding volume within the 3D space, the nodes including: a plurality of leaf nodes at a bottom of the hierarchy, each leaf node bounding one or more of the primitives; and one or more inner nodes, each inner node bounding one or more leaf nodes, traversing one or more rays through the acceleration data structure, identifying intersections between the one or more rays and one or more of the primitives, and performing a refit operation to adjust nodes of the acceleration data structure in response to detecting movement of one or more of the primitives to new locations in the 3D space, the refit operation performed entirely on the graphics processor, comprising: adjusting one or more of the leaf nodes based on the new locations of the one or more primitives, wherein the adjusting comprises moving bounding volumes of the leaf nodes to reflect the new locations of the one or more primitives; and adjusting an inner node if a leaf node bounded by the inner node was adjusted.
2. The processor of claim 1, wherein the processor further comprises: an interconnect coupled to the CPU and the graphics processor.
3. The processor of claim 1, wherein the first plurality cores includes a first set of cores and a second set of cores that have a lower power consumption than the first set.
4. The processor of claim 1, wherein the operation of adjusting an inner node comprises merging bounding volumes of the leaf nodes bounded by the inner node.
5. The processor of claim 1, wherein the refit operation comprises adjusting leaf nodes and inner nodes in a reverse depth-first search (DFS) order.
6. The processor of claim 1, wherein identifying the intersections between the one or more rays and one or more of the primitives comprises generation of intersection results comprising hit data usable to launch one or more secondary rays.
7. The processor of claim 1, wherein the acceleration data structure comprises a bounding volume hierarchy.
8. The processor of claim 7, wherein the leaf nodes and inner nodes comprise 3D volumes within the hierarchy.
9. A processor comprising: a graphics processor comprising a plurality of cores to execute program code to render images; and a system memory device to couple the graphics processor and a central processing unit (CPU), the system memory device to be accessed by the plurality of cores of the graphics processor and the CPU using a same virtual address space, wherein execution circuitry associated with at least one core of the plurality of cores is to execute at least a portion of the graphics program code to perform: constructing an acceleration data structure based on a plurality of primitives located within a three dimensional (3D) space, the acceleration data structure comprising nodes arranged in a hierarchy, each node associated with a bounding volume within the 3D space, the nodes including: a plurality of leaf nodes at a bottom of the hierarchy, each leaf node bounding one or more of the primitives; and one or more inner nodes, each inner node bounding one or more leaf nodes, traversing one or more rays through the acceleration data structure, identifying intersections between the one or more rays and one or more of the primitives, and performing a refit operation to adjust nodes of the acceleration data structure in response to detecting movement of one or more of the primitives to new locations in the 3D space, the refit operation performed entirely on the graphics processor, comprising: adjusting one or more of the leaf nodes based on the new locations of the one or more primitives, wherein the adjusting comprises moving bounding volumes of the leaf nodes to reflect the new locations of the one or more primitives; and adjusting an inner node if a leaf node bounded by the inner node was adjusted.
10. The processor of claim 9, wherein the processor is coupled to the CPU through an interconnect.
11. The processor of claim 10, wherein the CPU comprises a first set of cores and a second set of cores that have a lower power consumption than the first set.
12. The processor of claim 9, wherein the operation of adjusting an inner node comprises merging bounding volumes of the leaf nodes bounded by the inner node.
13. The processor of claim 9, wherein the refit operation comprises adjusting leaf nodes and inner nodes in a reverse depth-first search (DFS) order.
14. The processor of claim 9, wherein identifying the intersections between the one or more rays and one or more of the primitives comprises generation of intersection results comprising hit data usable to launch one or more secondary rays.
15. The processor of claim 9, wherein the acceleration data structure comprises a bounding volume hierarchy.
16. The processor of claim 15, wherein the leaf nodes and inner nodes comprise 3D volumes within the hierarchy.
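The conditional step in the independent claims (adjust an inner node only if a node it bounds was adjusted) together with the merge in claims 4 and 12 can be sketched as follows. This is a sketch under stated assumptions: leaf movement is modelled as a per-leaf displacement vector, boxes are axis-aligned, and nodes are indexed in DFS pre-order. The names `union`, `translate`, and `selective_refit` are hypothetical, and the sketch also propagates through adjusted inner children, which goes slightly beyond the literal claim wording.

```python
from typing import Dict, List, Set, Tuple

Vec = Tuple[float, float, float]
Box = Tuple[Vec, Vec]  # (min corner, max corner); assumed box format


def union(a: Box, b: Box) -> Box:
    """Merge two bounding volumes into the smallest box containing both."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return (lo, hi)


def translate(box: Box, delta: Vec) -> Box:
    """Move a bounding volume by a displacement vector."""
    lo = tuple(c + d for c, d in zip(box[0], delta))
    hi = tuple(c + d for c, d in zip(box[1], delta))
    return (lo, hi)


def selective_refit(
    boxes: List[Box],
    children: Dict[int, List[int]],  # inner node id -> child ids (hypothetical layout)
    moved: Dict[int, Vec],           # leaf id -> displacement of its primitives
) -> Set[int]:
    """Refit in reverse DFS order, touching only nodes affected by movement."""
    adjusted: Set[int] = set()
    for i in range(len(boxes) - 1, -1, -1):
        kids = children.get(i)
        if kids is None:
            # Leaf: move its box only if its primitives moved.
            if i in moved:
                boxes[i] = translate(boxes[i], moved[i])
                adjusted.add(i)
        elif any(k in adjusted for k in kids):
            # Inner node: re-merge child boxes only if some child changed.
            box = boxes[kids[0]]
            for k in kids[1:]:
                box = union(box, boxes[k])
            boxes[i] = box
            adjusted.add(i)
    return adjusted


# Node ids in DFS pre-order: 0 root, 1 inner, 2 and 3 leaves under 1, 4 leaf under 0.
boxes = [
    ((0.0, 0.0, 0.0), (6.0, 6.0, 6.0)),  # 0: root
    ((0.0, 0.0, 0.0), (3.0, 1.0, 1.0)),  # 1: inner
    ((0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),  # 2: leaf
    ((2.0, 0.0, 0.0), (3.0, 1.0, 1.0)),  # 3: leaf
    ((5.0, 5.0, 5.0), (6.0, 6.0, 6.0)),  # 4: leaf
]
children = {0: [1, 4], 1: [2, 3]}
touched = selective_refit(boxes, children, moved={4: (1.0, 0.0, 0.0)})
```

In this example only leaf 4 moves, so the pass updates leaf 4 and the root and leaves inner node 1 and its leaves untouched.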
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title
Denoising; Smoothing · CPC title
Memory management · CPC title
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Neural networks · CPC title