US12288068B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12288068-B2 |
| Application number | US-202318465189-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 12, 2023 |
| Priority date | Dec 29, 2020 |
| Publication date | Apr 29, 2025 |
| Grant date | Apr 29, 2025 |
Official abstract text for this publication.
An instruction simulation device and a method thereof are provided. The instruction simulation device includes a processor. The processor includes an instruction decoder which generates format information of a ready-for-execution instruction. The processor determines whether the ready-for-execution instruction currently executed by the processor is a compatible instruction or an extended instruction based on the format information of the ready-for-execution instruction. If the ready-for-execution instruction is an extended instruction under the new instruction set or the extended instruction set, the processor converts the ready-for-execution instruction into a simulation program corresponding to the extended instruction, and simulates an execution result of the ready-for-execution instruction by executing the simulation program. The simulation program is composed of at least one compatible instruction of the processor. If the ready-for-execution instruction is a compatible instruction, the processor executes the ready-for-execution instruction.
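The flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy register machine, the opcode names (`ADD`, `SUB`, `SHL`, `DBLADD`), the scratch register `t0`, and the helpers `decode`, `execute_native`, and `step` are all hypothetical names introduced here. The sketch only shows the decision the abstract describes: a compatible instruction runs directly, while an extended instruction is converted into a simulation program made up solely of compatible instructions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical native ("compatible") instruction set of a toy processor.
NATIVE_OPS: Dict[str, Callable[[int, int], int]] = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "SHL": lambda a, b: a << b,
}


@dataclass(frozen=True)
class Instruction:
    opcode: str
    dst: str
    src1: str
    src2: str


# Hypothetical simulation programs: each extended opcode expands into a
# list of compatible instructions. DBLADD computes dst = 2 * src1 + src2.
SIMULATION_PROGRAMS: Dict[str, Callable[[Instruction], List[Instruction]]] = {
    "DBLADD": lambda i: [
        Instruction("ADD", "t0", i.src1, i.src1),  # t0 = src1 + src1
        Instruction("ADD", i.dst, "t0", i.src2),   # dst = t0 + src2
    ],
}


def decode(ins: Instruction) -> str:
    """Stand-in for the decoder's format information (assumed two-valued)."""
    return "compatible" if ins.opcode in NATIVE_OPS else "extended"


def execute_native(ins: Instruction, regs: Dict[str, int]) -> None:
    """Execute one compatible instruction against the register file."""
    regs[ins.dst] = NATIVE_OPS[ins.opcode](regs[ins.src1], regs[ins.src2])


def step(ins: Instruction, regs: Dict[str, int]) -> None:
    """Run a compatible instruction directly, or simulate an extended one."""
    if decode(ins) == "compatible":
        execute_native(ins, regs)
        return
    # Extended instruction: convert it, then run only compatible instructions.
    for native in SIMULATION_PROGRAMS[ins.opcode](ins):
        execute_native(native, regs)


regs = {"r1": 5, "r2": 7}
step(Instruction("DBLADD", "r3", "r1", "r2"), regs)
print(regs["r3"])  # 17
```

This sketch has no failure path: an extended opcode without a registered simulation program raises a `KeyError`. The claims handle that case explicitly through a failure result.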
Opening claim text (preview).
What is claimed is:
1. An instruction simulation device, comprising a processor, the processor comprising: an instruction decoder, configured to generate format information of a ready-for-execution instruction; and the processor is configured to determine whether the ready-for-execution instruction currently executed by the processor is a compatible instruction or an extended instruction based on the format information of the ready-for-execution instruction, wherein the compatible instruction is an instruction under a current instruction set of the processor, and the extended instruction is not an instruction under the current instruction set of the processor but is an instruction under a new instruction set or an extended instruction set, wherein the new instruction set and the extended instruction set are instruction sets that do not belong to a native instruction set of the processor; wherein if the ready-for-execution instruction is an extended instruction under the new instruction set or the extended instruction set, the processor converts the ready-for-execution instruction into a simulation program corresponding to the extended instruction, and simulates an execution result of the ready-for-execution instruction by executing the simulation program; and if the ready-for-execution instruction is a compatible instruction, the processor executes the ready-for-execution instruction; wherein the simulation program is composed of at least one compatible instruction of the processor.
2. The instruction simulation device according to claim 1, wherein a computer system embodied with the processor comprises a system memory, the system memory comprising: a processor-current-state store region, configured to store a current-context state of the processor; a conversion-information store region, configured to store temporary information in a process of translating the ready-for-execution instruction into the corresponding simulation program; and an execution-result store region, configured to store the execution result after executing the simulation program.
3. The instruction simulation device according to claim 1, wherein when the ready-for-execution instruction is the extended instruction, the processor asserts an emulation flag to obtain the corresponding simulation program by means of an interrupt service program.
4. The instruction simulation device according to claim 1, wherein the processor comprises: a plurality of registers, comprising a register configured to indicate an address of a storage space caching a current state of the processor, a register configured to indicate an address of a storage space caching a conversion intermediate result when calling the corresponding simulation program, a register configured to indicate an address of a storage space caching a simulation execution result, a simulation register configured to map a target register indicated by the extended instruction, and a register configured to cache an address indicative of a real-time conversion mode state store region.
5. The instruction simulation device according to claim 1, wherein an interrupt service program calls a simulation module to query if there is a simulation program corresponding to the extended instruction, wherein if the simulation program corresponding to the extended instruction is found, the simulation module executes the simulation program and obtains a simulation execution result for simulating the execution result of the ready-for-execution instruction.
6. The instruction simulation device according to claim 5, wherein the simulation execution result is reserved after terminating the calling to the simulation module, and when the simulation program corresponding to the extended instruction is not found, a failure result is returned to notify the processor and the calling to the simulation module is terminated in response to the failure result.
7. The instruction simulation device according to claim 6, wherein after terminating the calling to the simulation module, the processor reads the reserved simulation execution result or receives the failure result, wherein the processor notifies an application program arising the ready-for-execution instruction of the failure result.
8. The instruction simulation device according to claim 6, wherein if a subsequent ready-for-execution instruction is an extended instruction and the subsequent ready-for-execution instruction is converted into a simulation program corresponding to the subsequent ready-for-execution instruction, the reserved simulation execution result serves as a reference when executing the simulation program corresponding to the subsequent ready-for-execution instruction.
9. The instruction simulation device according to claim 5, wherein the simulation module is embodied in a processor driver, in a kernel of an operating system running on the processor, or stored in a basic input/output system of a computer system embodied with the processor.
10. The instruction simulation device according to claim 1, wherein the compatible instruction and the extended instruction are both instructions under an x86 instruction set architecture or a reduced instruction set computer (RISC) architecture.
11. An instruction simulation method, performed by a processor including an instruction decoder, the instruction simulation method comprising: using the instruction decoder of the processor to generate format information of a ready-for-execution instruction; determining by the processor whether the ready-for-execution instruction currently executed by the processor is a compatible instruction or an extended instruction based on the format information of the ready-for-execution instruction, wherein the compatible instruction is an instruction under a current instruction set of the processor, and the extended instruction is not an instruction under the current instruction set of the processor, but is an instruction under a new instruction set or an extended instruction set, wherein the new instruction set and the extended instruction set are instruction sets that do not belong to a native instruction set of the processor; translating the ready-for-execution instruction into a simulation program corresponding to the extended instruction wherein an execution result of the ready-for-execution instruction is generated by means of a simulation execution result generated by the simulation program if the ready-for-execution instruction is an extended instruction under the new instruction set or the extended instruction set; and executing the ready-for-execution instruction by the processor if the ready-for-execution instruction is a compatible instruction; wherein the simulation program is composed of at least one compatible instruction of the processor.
12. The instruction simulation method according to claim 11, wherein a computer system embodied with the processor comprises a system memory, the system memory comprising: a processor-current-state store region, configured to store a current context state of the processor; a conversion-information store region, configured to store temporary information in a process of translating the ready-for-execution instruction into the corresponding simulation program; and an execution-result store region, configured to store the simulation execution result after executing the simulation program.
13. The instruction simulation method according to claim 11, wherein when the ready-for-execution instruction is the extended instruction, the processor asserts an emulation flag to obtain the corresponding simulation program by means of an interrupt service program.
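The lookup-and-failure behavior recited in claims 2 and 5 to 7 can be sketched as follows. This is an illustrative model under stated assumptions, not the claimed hardware or driver: `SimulationModule`, `interrupt_service`, `UnsupportedInstruction`, the dictionary-backed store regions, and the `DBLADD` opcode are hypothetical names, and a plain Python callable stands in for a simulation program that a real system would build from compatible instructions.

```python
from typing import Callable, Dict


class UnsupportedInstruction(Exception):
    """Signals the failure result to the application (hypothetical name)."""


class SimulationModule:
    """Toy model of the simulation module and its three store regions."""

    def __init__(self) -> None:
        self._programs: Dict[str, Callable[[int, int], int]] = {}
        self.state_region: Dict[str, int] = {}       # processor current state
        self.conversion_region: Dict[str, str] = {}  # temporary conversion info
        self.result_region: Dict[str, int] = {}      # reserved execution result

    def register(self, opcode: str, program: Callable[[int, int], int]) -> None:
        """Associate an extended opcode with its simulation program."""
        self._programs[opcode] = program

    def call(self, opcode: str, a: int, b: int, cpu_state: Dict[str, int]) -> bool:
        """Return True on success, False as the failure result."""
        self.state_region = dict(cpu_state)  # save the current context
        program = self._programs.get(opcode)
        if program is None:
            return False  # no simulation program found
        self.conversion_region["opcode"] = opcode
        # The result stays in result_region after the call returns.
        self.result_region["last"] = program(a, b)
        return True


def interrupt_service(module: SimulationModule, opcode: str, a: int, b: int,
                      cpu_state: Dict[str, int]) -> int:
    """Model of the interrupt service program invoking the simulation module."""
    if not module.call(opcode, a, b, cpu_state):
        # The processor notifies the application of the failure result.
        raise UnsupportedInstruction(opcode)
    return module.result_region["last"]


module = SimulationModule()
module.register("DBLADD", lambda a, b: (a + a) + b)
print(interrupt_service(module, "DBLADD", 5, 7, {"pc": 64}))  # 17
```

Keeping the result in `result_region` after the call mirrors the "reserved" simulation execution result of claim 6, which claim 8 allows a later simulation program to use as a reference.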
Instruction completion, e.g. retiring, committing or graduating · CPC title
Result writeback, i.e. updating the architectural state or memory · CPC title
Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines · CPC title
for non-native instruction set, e.g. Javabyte, legacy code · CPC title
Runtime code conversion or optimisation · CPC title