Ray tracing processor

US12045928B2 · US · B2

Patent metadata
Publication number: US-12045928-B2
Application number: US-202217665341-A
Country: US
Kind code: B2
Filing date: Feb 4, 2022
Priority date: Feb 4, 2022
Publication date: Jul 23, 2024
Grant date: Jul 23, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and techniques are provided for enhancing operations of a ray tracing processor. For instance, a process can include obtaining one or more nodes of an acceleration data structure. Each node of the one or more nodes includes the same number of bytes. The node(s) can be stored in a cache associated with a ray tracing processor. Each of the stored node(s) are cache line-aligned with the cache associated with the ray tracing processor. A first stored node of the stored node(s) can be provided to the ray tracing processor and processed by the ray tracing processor during a first clock cycle of the ray tracing processor. A second stored node of the stored node(s) can be provided to the ray tracing processor and processed by the ray tracing processor during a second clock cycle of the ray tracing processor.
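The alignment property in the abstract can be illustrated with a little address arithmetic. The sketch below is not taken from the patent: it assumes a 64-byte cache line (the node size named in claim 13), and the names `NODE_BYTES`, `CACHE_LINE_BYTES`, `node_offset`, `cache_lines_touched`, and `pack_node` are hypothetical helpers introduced only for illustration.

```python
# Assumption: 64-byte cache lines, matching the 64-byte node size in claim 13.
CACHE_LINE_BYTES = 64
NODE_BYTES = 64


def node_offset(index: int) -> int:
    """Byte offset of node `index` in a buffer that starts on a cache line boundary."""
    return index * NODE_BYTES


def cache_lines_touched(offset: int, size: int = NODE_BYTES) -> range:
    """Indices of the cache lines covered by `size` bytes starting at `offset`."""
    first = offset // CACHE_LINE_BYTES
    last = (offset + size - 1) // CACHE_LINE_BYTES
    return range(first, last + 1)


def pack_node(payload: bytes) -> bytes:
    """Pad a node payload with zero bytes so every node has the same size."""
    if len(payload) > NODE_BYTES:
        raise ValueError("payload does not fit in a constant-sized node")
    return payload.ljust(NODE_BYTES, b"\x00")


# Build a small buffer of constant-sized nodes from payloads of varying length.
buffer = b"".join(pack_node(p) for p in (b"leaf", b"internal", b"x" * 64))
```

Because the node size equals the line size in this sketch, `cache_lines_touched(node_offset(i))` covers exactly one line for every `i`, so a node can be fetched with a single cache line read. A node placed at a misaligned offset such as 32 would straddle two lines.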

First claim

Opening claim text (preview).

What is claimed is:

1. A method of ray tracing, the method comprising: obtaining one or more nodes of an acceleration data structure, wherein each node of the one or more nodes is a constant-sized node including a constant number of bytes; storing the one or more nodes in a cache associated with a ray tracing processor, wherein each of the one or more stored nodes are cache line-aligned with the cache associated with the ray tracing processor; providing a first stored node of the one or more stored nodes for processing by a ray-node intersection logic unit of the ray tracing processor, wherein the ray-node intersection logic unit includes a shared floating point arithmetic logic unit (ALU), and wherein the ray-node intersection logic unit uses a first configuration of the shared floating point ALU to determine two or more ray-triangle intersections corresponding to the first stored node within a first clock cycle of the ray tracing processor; and providing a second stored node of the one or more stored nodes for processing by the ray-node intersection logic unit of the ray tracing processor, wherein the ray-node intersection logic unit uses a second configuration of the shared floating point ALU to determine four or more ray-bounding volume intersections corresponding to the second stored node within a second clock cycle of the ray tracing processor, wherein the first clock cycle and the second clock cycle are consecutive clock cycles.

2. The method of claim 1, wherein: the ray-node intersection logic unit of the ray tracing processor is configured to determine the two or more ray-triangle intersections based on two or more triangles included in the first stored node; and the ray-node intersection logic unit of the ray tracing processor is configured to determine the four or more ray-bounding volume intersections based on four or more bounding volumes included in the second stored node.

3. The method of claim 2, wherein: the two or more ray-triangle intersections based on the first stored node are determined during the first clock cycle of the ray tracing processor; and the four or more ray-bounding volume intersections based on the second stored node are determined during the second clock cycle of the ray tracing processor.

4. The method of claim 1, wherein: the first stored node comprises a leaf node of the acceleration data structure and includes a first quantity of geometric primitives of the acceleration data structure; the second stored node comprises an internal node of the acceleration data structure and includes a second quantity of bounding volumes of the acceleration data structure; and the first stored node is cache line-aligned with a first cache line of the cache associated with the ray tracing processor and the second stored node is cache line-aligned with a second cache line of the cache associated with the ray tracing processor.

5. The method of claim 4, wherein the second quantity is twice as large as the first quantity.

6. The method of claim 1, wherein the ray-node intersection logic unit of the ray tracing processor further includes two or more ray-triangle logic units, and wherein the ray-node intersection logic unit uses the two or more ray-triangle logic units and the first configuration of the shared floating point ALU to determine the two or more ray-triangle intersections.

7. The method of claim 1, wherein the ray-node intersection logic unit of the ray tracing processor further includes four or more ray-bounding volume logic units, and wherein the ray-node intersection logic unit uses the four or more ray-bounding volume logic units and the second configuration of the shared floating point ALU to determine the four or more ray-bounding volume intersections.

8. The method of claim 1, wherein: the first stored node is a bounding volume hierarchy (BVH) node associated with two or more triangles corresponding to the two or more ray-triangle intersections; and the two or more triangles are stored in the BVH node.

9. The method of claim 8, wherein the BVH node stores the two or more triangles as respective sets of coordinates associated with vertices of the two or more triangles.

10. The method of claim 1, wherein the cache associated with the ray tracing processor is a graphics processing unit (GPU) cache.

11. The method of claim 1, wherein the cache associated with the ray tracing processor is a level 0 (L0) cache of the ray tracing processor.

12. The method of claim 1, wherein: a number of bytes included in each node of the one or more nodes is equal to the constant number of bytes; and each node of the one or more nodes is cache line-aligned with the cache associated with ray tracing processor based on the constant number of bytes being equal to a number of bytes included in a cache line of the cache associated with the ray tracing processor.

13. The method of claim 12, wherein each node of the one or more nodes is 64 bytes.

14. The method of claim 1, wherein the ray tracing processor is a ray tracing unit (RTU).

15. An apparatus for ray tracing, comprising: a memory; and one or more processors coupled to the memory, the one or more processors configured to: obtain one or more nodes of an acceleration data structure wherein each node of the one or more nodes is a constant-sized node including a constant number of bytes; store the one or more nodes in a cache associated with a ray tracing processor, wherein each of the one or more stored nodes are cache line-aligned with the cache associated with the ray tracing processor; provide a first stored node of the one or more stored nodes for processing by a ray-node intersection logic unit of the ray tracing processor, wherein the ray-node intersection logic unit includes a shared floating point arithmetic logic unit (ALU), and wherein the ray-node intersection logic unit uses a first configuration of the shared floating point ALU to determine two or more ray-triangle intersections corresponding to the first stored node within a first clock cycle of the ray tracing processor; and provide a second stored node of the one or more stored nodes for processing by the ray-node intersection logic unit of the ray tracing processor, wherein the ray-node intersection logic unit uses a second configuration of the shared floating point ALU to determine four or more ray-bounding volume intersections corresponding to the second stored node within a second clock cycle of the ray tracing processor, wherein the first clock cycle and the second clock cycle are consecutive clock cycles.

16. The apparatus of claim 15, wherein the one or more processors are configured to: use the ray-node intersection logic unit to determine the two or more ray-triangle intersections based on two or more triangles included in the first stored node; and use the ray-node intersection logic unit to determine the four or more ray-bounding volume intersections based on four or more bounding volumes included in the second stored node.

17. The apparatus of claim 16, wherein: the two or more ray-triangle intersections based on the first stored node are determined during the first clock cycle of the ray tracing processor; and the four or more ray-bounding volume intersections based on the second stored node are determined during the second clock cycle of the ray tracing processor.

18. The apparatus of claim 15, wherein: the first stored node comprises a leaf node of the acceleration data structure and includes a first quantity of geometric primitives of the acceleration data structure;
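Claim 1 describes one intersection unit that handles leaf nodes (two or more ray-triangle tests) and internal nodes (four or more ray-bounding volume tests). The claims do not name the intersection algorithms, so the following software sketch swaps in two standard textbook tests, Möller-Trumbore for triangles and the slab test for axis-aligned boxes, purely to show the two modes side by side. It models neither the shared floating point ALU nor per-cycle timing, and the function names and the `(kind, items)` node tuple are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: distance t along the ray to the hit, or None on a miss."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:
        return None  # ray is parallel to the triangle plane
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


def ray_box(origin, direction, lo, hi):
    """Slab test: True if the ray hits the axis-aligned box [lo, hi]."""
    t_near, t_far = 0.0, float("inf")
    for o, d, a, b in zip(origin, direction, lo, hi):
        if d == 0.0:
            if o < a or o > b:
                return False  # parallel to this slab and outside it
            continue
        t0, t1 = (a - o) / d, (b - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True


def intersect_node(origin, direction, node):
    """Dispatch on node kind: triangle tests for a leaf, box tests for an internal node."""
    kind, items = node  # hypothetical representation: ("leaf" | "internal", list)
    if kind == "leaf":
        return [ray_triangle(origin, direction, *tri) for tri in items]
    return [ray_box(origin, direction, lo, hi) for lo, hi in items]
```

With this representation, a leaf would be passed as `("leaf", [(v0, v1, v2), (v0, v1, v2)])` and an internal node as `("internal", [(lo, hi), (lo, hi), (lo, hi), (lo, hi)])`, mirroring the two-triangle and four-box quantities in claims 1 and 5.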

Assignees

Qualcomm Inc

Inventors

Classifications

  • Processor architectures; Processor configuration, e.g. pipelining

  • General purpose rendering architectures

  • G06T15/06 (Primary): Ray-tracing

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12045928B2 cover?
Systems and techniques are provided for enhancing operations of a ray tracing processor. For instance, a process can include obtaining one or more nodes of an acceleration data structure. Each node of the one or more nodes includes the same number of bytes. The node(s) can be stored in a cache associated with a ray tracing processor. Each of the stored node(s) are cache line-aligned with the ca…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G06T15/06. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 23, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).