US8959277B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-8959277-B2 |
| Application number | US-33431608-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 12, 2008 |
| Priority date | Dec 12, 2008 |
| Publication date | Feb 17, 2015 |
| Grant date | Feb 17, 2015 |
Official abstract text for this publication.
One embodiment of the present invention provides a system that facilitates precise exception semantics for a virtual machine. During operation, the system executes a program in the virtual machine using a processor that includes a gated store buffer that stores values to be written to a memory. This gated store buffer is configured to delay a store to the memory until after a speculatively-optimized region of the program commits. The processor signals an exception when it detects that a load following the store is attempting to access the same memory region being written by the store prior to the commitment of the speculatively-optimized region.
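The behavior described in the abstract can be sketched as a small software model. This is an illustrative sketch only, not the patented hardware: the class name `GatedStoreBuffer`, the exception `SpeculationFault`, and the dictionary standing in for memory are all hypothetical names chosen for this example, and the model matches loads to stores by exact address only.

```python
class SpeculationFault(Exception):
    """Hypothetical signal: a load aliased a store that has not committed yet."""


class GatedStoreBuffer:
    """Toy model: stores are held back until the speculative region commits."""

    def __init__(self, memory):
        self.memory = memory   # dict of address -> value, stands in for memory
        self.pending = {}      # uncommitted stores: address -> value

    def store(self, address, value):
        # Delay the write: the value is buffered and not yet visible in memory.
        self.pending[address] = value

    def load(self, address):
        # A load that hits an uncommitted store cannot be served from memory,
        # so this model signals an exception instead of returning stale data.
        if address in self.pending:
            raise SpeculationFault(f"load of {address:#x} aliases a pending store")
        return self.memory.get(address, 0)

    def commit(self):
        # The speculatively-optimized region committed: release the stores.
        self.memory.update(self.pending)
        self.pending.clear()

    def rollback(self):
        # The region was abandoned: discard buffered stores, memory is untouched.
        self.pending.clear()


memory = {0x100: 1}
buf = GatedStoreBuffer(memory)
buf.store(0x100, 42)          # buffered, memory still holds the old value
try:
    buf.load(0x100)           # aliases the pending store
except SpeculationFault:
    buf.rollback()            # discard speculative state before re-executing
```

In this simplified model a commit writes straight to memory, so the case of committed but not yet written stores (claim 6) does not arise.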
Opening claim text (preview).
What is claimed is:

1. A computing device that facilitates providing precise exception semantics for a virtual machine, wherein the computing device comprises a processor configured to: receive native instructions for a program, wherein the native instructions correspond to a native instruction set architecture (ISA) for a processor and correspond to virtual instructions for a virtual ISA that is different from the native ISA; receive a store instruction in the native instructions that writes a value to a memory; delay writing the value to the memory until after a speculatively-optimized region of the program commits by writing the value to a gated store buffer; determine whether a load instruction in the native instructions that follows the store instruction attempts to access, prior to the speculatively-optimized region committing, a memory address that is aligned with a byte-width of the value, the memory address in a region of the memory accessed by the store instruction; when the load instruction attempts to access a memory address that is not aligned with the byte-width, roll back execution of the program to a preceding point in the program by signaling an exception, wherein, while rolling back the execution, the processor is configured to use a virtual machine to re-execute the speculatively-optimized region by using the virtual machine to execute instructions in the virtual instructions that correspond to native instructions in the speculatively-optimized region; and otherwise, when the load instruction attempts to access a memory address that is aligned with the byte-width, use a bypass mechanism for the gated store buffer to provide the value to the load instruction instead of performing the rolling back.

2. The computing device of claim 1, wherein, upon determining that the gated store buffer has detected that a load following the store is attempting to access a same memory region being written by the store prior to the commitment of the speculatively-optimized region, the processor is configured to flush contents of the gated store buffer.

3. The computing device of claim 1, wherein rolling back program execution to the preceding point in the program involves: restoring a virtual state associated with a preceding safepoint; and restoring a state associated with a preceding checkpoint.

4. The computing device of claim 1, wherein signaling the exception facilitates avoiding deadlock without needing to include bypass hardware in the processor that retrieves one or more values from the gated store buffer for the load.

5. The computing device of claim 2, wherein after signaling the exception, the processor, an operating system for the processor, an optimizing compiler, and/or the virtual machine are further configured to: add an additional safepoint or checkpoint after the store but previous to the load to ensure that a value associated with the store is written to memory prior to the load.

6. The computing device of claim 1, wherein the gated store buffer includes values for both uncommitted stores and committed stores that have not yet been written to the memory; and wherein the processor is further configured to not raise the exception when the load accesses a value associated with a committed but unwritten store.

7. The computing device of claim 1, wherein the gated store buffer is configured to perform a conservative and/or an alternative comparison between a memory region accessed by the load and a memory region accesses by the store to determine whether the two operations access a same memory region.

8. The computing device of claim 7, wherein the conservative and/or alternative comparison involves one or more of the following: comparing a subset of the physical address bits for the memory region accessed by the load and the memory region accessed by the store; and using an alternative alias-detection mechanism to determine whether the gated store buffer may contain a value for the memory region being accessed by the load.

9. A method that facilitates providing precise exception semantics for a virtual machine, the method comprising: receiving native instructions for a program, wherein the native instructions correspond to a native instruction set architecture (ISA) for a processor and correspond to virtual instructions for a virtual ISA that is different from the native ISA; receiving a store instruction in the native instructions that writes a value to a memory; delaying writing the value to the memory until after a speculatively-optimized region of the program commits by writing the value to a gated store buffer; determining whether a load instruction in the native instructions that follows the store instruction attempts to access, prior to the speculatively-optimized region committing, a memory address that is aligned with a byte-width of the value, the memory address in a region of the memory accessed by the store instruction; when the load instruction attempts to access a memory address that is not aligned with the byte-width, rolling back execution of the program to a preceding point in the program by signaling an exception, wherein the rolling back comprises using a virtual machine to re-execute the speculatively-optimized region by using the virtual machine to execute instructions in the virtual instructions that correspond to native instructions in the speculatively-optimized region; and otherwise, when the load instruction attempts to access a memory address that is aligned with the byte-width, using a bypass mechanism for the gated store buffer to provide the value to the load instruction instead of performing the rolling back.

10. The method of claim 9, wherein, upon determining that the gated store buffer has detected that a load following the store is attempting to access a same memory region being written by the store prior to the commitment of the speculatively-optimized region, the method further comprises flushing contents of the gated store buffer.

11. The method of claim 10, wherein, after signaling the exception, the method further comprises: adding an additional safepoint or checkpoint after the store but previous to the load to ensure that a value associated with the store is written to memory prior to the load.

12. The method of claim 9, wherein rolling back program execution to the preceding point in the program involves one or more of the following: restoring a virtual state associated with a preceding safepoint; and restoring a state associated with a preceding checkpoint.

13. The method of claim 9, wherein signaling the exception facilitates avoiding deadlock without needing to include bypass hardware in a processor to retrieve one or more values from the gated store buffer for the load.

14. The method of claim 9, wherein the method further involves performing a conservative and/or an alternative comparison between a memory region accessed by the load and a memory region accessed by the store to determine whether the two operations access a same memory region.

15. The method of claim 14, wherein the conservative and/or alternative comparison involves one or more of the following: comparing a subset of the physical address bits for the memory region accessed by the load and the memory region accessed by the store; and using an alternative alias-detection mechanism to determine whether the gated store buffer may contain a value for the memory region being accessed by the load.

16. The method of claim 9, wherein the determining that the load attempts to access the region involves performing an operation, and wherei
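
The load-handling decision in claims 1 and 9 (bypass for an aligned load, rollback for an unaligned one) and the conservative comparison in claims 8 and 15 can be sketched as follows. This is one possible reading, not the claimed implementation. It assumes that "aligned" means the load address is a multiple of the stored value's byte-width and that stores are naturally aligned, so an aligned load inside the store's region starts at the store's address. The names `PendingStore`, `RollbackRequired`, `resolve_load`, `granule_tags`, and `may_alias`, and the 8-byte granule and 8-bit tag sizes, are hypothetical.

```python
from dataclasses import dataclass


class RollbackRequired(Exception):
    """Hypothetical signal: re-execute the region in the virtual machine."""


@dataclass
class PendingStore:
    address: int   # start address of the buffered store (assumed aligned)
    width: int     # byte-width of the stored value
    value: int


def resolve_load(pending, memory, load_address):
    """Serve a load issued before the speculative region commits."""
    for store in reversed(pending):            # youngest buffered store first
        in_region = store.address <= load_address < store.address + store.width
        if not in_region:
            continue
        if load_address % store.width == 0:
            return store.value                 # aligned: bypass from the buffer
        raise RollbackRequired(hex(load_address))  # unaligned: roll back
    return memory.get(load_address, 0)         # no alias: read memory as usual


def granule_tags(address, width, granule_shift=3, tag_bits=8):
    """Partial tags (a subset of address bits) for each granule an access touches."""
    mask = (1 << tag_bits) - 1
    first = address >> granule_shift
    last = (address + width - 1) >> granule_shift
    return {granule & mask for granule in range(first, last + 1)}


def may_alias(store, load_address, load_width):
    """Conservative check: never misses a real overlap, may report false ones."""
    store_tags = granule_tags(store.address, store.width)
    return bool(store_tags & granule_tags(load_address, load_width))


pending = [PendingStore(address=0x100, width=4, value=7)]
memory = {0x200: 9}
print(resolve_load(pending, memory, 0x100))    # aligned load served by bypass
print(may_alias(pending[0], 0x900, 4))         # partial tags collide: false positive
```

Comparing only a few address bits keeps the comparison cheap at the cost of occasional false positives, which in this model would only trigger an unnecessary exception or flush rather than an incorrect result.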
Instruction completion, e.g. retiring, committing or graduating · CPC title
Result writeback, i.e. updating the architectural state or memory · CPC title
Register renaming · CPC title
using multiple copies of the architectural state, e.g. shadow registers · CPC title
Synchronisation or serialisation instructions · CPC title