US12547413B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12547413-B2 |
| Application number | US-202418604201-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 13, 2024 |
| Priority date | Mar 13, 2024 |
| Publication date | Feb 10, 2026 |
| Grant date | Feb 10, 2026 |
Official abstract text for this publication.
A first set of threads having a same address corresponding to the shared memory is identified from a group of active threads associated with an instruction to update a shared memory. A first thread of the first set of threads is selected. The instruction is executed for the first thread using the same address to access the shared memory. Attempts to execute the instruction for remaining threads of the first set of threads are delayed until after the first thread is executed and until at least one of the remaining threads of the first set of threads is not guaranteed to fail execution of the instruction.
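The grouping and selection steps in the abstract can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the patented hardware mechanism: the function names `group_by_address` and `schedule_round` are hypothetical, threads are modelled as integer lane ids, and the "first thread" is assumed to be the lowest lane id in each group (the abstract does not say how the first thread is chosen).

```python
from collections import defaultdict


def group_by_address(active):
    """Map each target address to the sorted lane ids that reference it.

    `active` is a dict of lane id -> shared-memory address (hypothetical model).
    """
    groups = defaultdict(list)
    for lane in sorted(active):
        groups[active[lane]].append(lane)
    return dict(groups)


def schedule_round(active):
    """Pick one thread per address to execute now; defer the rest of each group."""
    leaders, deferred = {}, {}
    for addr, lanes in group_by_address(active).items():
        leaders[addr] = lanes[0]    # assumption: lowest lane id is selected first
        deferred[addr] = lanes[1:]  # these attempts are delayed to a later round
    return leaders, deferred


# Four active threads; lanes 0, 2 and 3 collide on address 0x10.
active = {0: 0x10, 1: 0x20, 2: 0x10, 3: 0x10}
leaders, deferred = schedule_round(active)
print(leaders)
print(deferred)
```

In this model only lane 0 touches address 0x10 in the first round, while lanes 2 and 3 wait, which mirrors the abstract's point that attempts bound to fail are not issued at all.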
Opening claim text (preview).
What is claimed is:
1. A system comprising: a shared memory; and one or more processing units coupled with the shared memory, wherein the one or more processing units are to: identify, from a group of active threads associated with an instruction to update the shared memory, a first set of threads having a same address corresponding to the shared memory; select a first thread of the first set of threads; execute the instruction for the first thread using the same address to access the shared memory; and store a Boolean value in one or more predicate registers corresponding to remaining threads of the first set of threads to prevent executing the instruction for the remaining threads until after the first thread is executed, wherein the Boolean value indicates that the remaining threads failed to execute the instruction.
2. The system of claim 1, wherein the one or more processing units are further to: responsive to the execution of the instruction for the first thread, store a Boolean value in a predicate register corresponding to the first thread, wherein the Boolean value indicates whether the first thread successfully executed the instruction.
3. The system of claim 1, wherein the instruction is a compare-and-store (CAST) instruction, and wherein to execute the compare and store instruction, the one or more processing units are to: compare a first value stored at the same memory address of the shared memory with an expected value; and responsive to a determination that the first value matches the expected value, write a second value to the shared memory at the same memory address.
4. The system of claim 3, wherein the one or more processing units are further to: write the second value to one or more private registers corresponding to the first set of threads.
5. The system of claim 4, wherein the one or more processing units are further to: subsequent to the execution of the instruction for the first thread, execute the CAST instruction for a second thread of the remaining threads of the first set of threads using the second value stored in a respective private register of the private registers.
6. The system of claim 1, wherein the shared memory comprises a plurality of logical units, and wherein the one or more processing units are further to: serially execute the instruction for threads from the group of active threads with different addresses corresponding to a same logical unit of the plurality of logical units of the shared memory.
7. The system of claim 1, wherein the system is comprised in at least one of: a control system for an autonomous or semi-autonomous machine; a perception system for an autonomous or semi-autonomous machine; a system for performing simulation operations; a system for performing digital twin operations; a system for performing light transport simulation; a system for performing collaborative content creation for 3D assets; a system for performing deep learning operations; a system for generating or presenting at least one of augmented reality content, virtual reality content, or mixed reality content; a system for hosting one or more real-time streaming applications; a system implemented using an edge device; a system implemented using a robot; a system for performing conversational AI operations; a system for generating synthetic data; a system incorporating one or more virtual machines (VMs); a system implemented at least partially in a data center; or a system implemented at least partially using cloud computing resources.
8. A method comprising: identifying, from a group of active threads associated with an instruction to update a shared memory, a first set of threads having a same address corresponding to the shared memory; selecting a first thread of the first set of threads; executing the instruction for the first thread using the same address to access the shared memory; and storing a Boolean value in one or more predicate registers corresponding to remaining threads of the first set of threads to prevent executing the instruction for the remaining threads until after the first thread is executed and until at least one of the remaining threads of the first set of threads is not guaranteed to fail execution of the instruction, wherein the Boolean value indicates that the remaining threads failed to execute the instruction.
9. The method of claim 8, further comprising: responsive to executing the instruction for the first thread, storing a Boolean value in a predicate register corresponding to the first thread, wherein the Boolean value indicates whether the first thread successfully executed the instruction.
10. The method of claim 8, wherein the instruction is a compare-and-store (CAST) instruction, and wherein executing the compare and store instruction comprises: comparing a first value stored at the same memory address of the shared memory with an expected value; and responsive to determining that the first value matches the expected value, writing a second value to the shared memory at the same memory address.
11. The method of claim 10, further comprising: writing the second value to one or more private registers corresponding to the first set of threads.
12. The method of claim 11, further comprising: subsequent to executing the instruction for the first thread, executing the CAST instruction for a second thread of the remaining threads of the first set of threads using the second value stored in a respective private register of the one or more private registers.
13. The method of claim 8, wherein the shared memory comprises a plurality of logical units, and wherein the method further comprises: serially executing the instruction for threads from the group of active threads with different addresses corresponding to a same logical unit of the plurality of logical units of the shared memory.
14. A parallel processing unit (PPU) comprising one or more execution units and a shared memory, wherein the PPU is to: identify, from a group of active threads associated with an instruction to update the shared memory, a first set of threads having a same address corresponding to the shared memory; select a first thread of the first set of threads; execute the instruction for the first thread on the one or more execution units using the same address to access the shared memory; and store a Boolean value in one or more predicate registers corresponding to remaining threads of the first set of threads to prevent executing the instruction for the remaining threads until after the first thread is executed, wherein the Boolean value indicates that the remaining threads failed to execute the instruction.
15. The PPU of claim 14, wherein the PPU is further to: responsive to the execution of the instruction for the first thread, store a Boolean value in a predicate register corresponding to the first thread, wherein the Boolean value indicates whether the first thread successfully executed the instruction.
16. The PPU of claim 14, wherein the instruction is a compare-and-store (CAST) instruction, and wherein to execute the compare and store instruction, the PPU is to: compare a first value stored at the same memory address of the shared memory with an expected value; and responsive to a determination that the first value matches the expected value, write a second value to the shared memory at the same memory address.
17. The PPU of claim 16, wherein the PPU is further to: write the second value to one or mor
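The compare-and-store behaviour and the predicate handling recited in the claims can be sketched in software as follows. This is an illustrative model, not the claimed hardware: `cast`, `run_group`, and `make_new` are hypothetical names, shared memory is a plain dict, and the sketch assumes one particular reading of the private-register claims, namely that the value written by the successful thread is forwarded to the remaining threads and used as their expected value on the next attempt. The claim text does not spell out that usage.

```python
def cast(mem, addr, expected, new):
    """Compare-and-store: write `new` to mem[addr] only if it holds `expected`."""
    if mem[addr] == expected:
        mem[addr] = new
        return True
    return False


def run_group(mem, addr, lanes, expected, make_new):
    """Serialize same-address threads, one executing thread per round.

    lanes    -- lane ids that all target `addr`
    expected -- dict lane id -> expected value (models a private register)
    make_new -- hypothetical update rule: expected value -> value to store
    Returns the number of compare-and-store attempts actually issued.
    """
    pending = list(lanes)
    attempts = 0
    while pending:
        attempts += 1
        first, rest = pending[0], pending[1:]
        # Remaining threads are marked as failed without being executed.
        pred = {lane: False for lane in rest}
        new = make_new(expected[first])
        pred[first] = cast(mem, addr, expected[first], new)
        if pred[first]:
            for lane in rest:
                expected[lane] = new  # assumption: forward the stored value
            pending = rest
        else:
            expected[first] = mem[addr]  # reload the current value and retry
    return attempts


# Four threads each try to increment the same shared counter.
mem = {0x10: 5}
expected = {lane: 5 for lane in range(4)}
attempts = run_group(mem, 0x10, [0, 1, 2, 3], expected, lambda v: v + 1)
print(mem[0x10], attempts)
```

Under these assumptions each thread succeeds on its first issued attempt, so four increments cost four compare-and-store operations instead of the larger number a naive retry loop would issue when every thread races on stale data.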
CPC classification titles:
- LOAD or STORE instructions; Clear instruction
- Divergence aspects
- Iterative single instructions for multiple data lanes [SIMD]
- Compare instructions, e.g. Greater-Than, Equal-To, MINMAX
- Maintaining memory consistency