Instruction fusion
US-2017123808-A1 · May 4, 2017 · US
US10324724B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10324724-B2 |
| Application number | US-201514971904-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 16, 2015 |
| Priority date | Dec 16, 2015 |
| Publication date | Jun 18, 2019 |
| Grant date | Jun 18, 2019 |
Official abstract text for this publication.
Methods and apparatuses relating to a fusion manager to fuse instructions are described. In one embodiment, a hardware processor includes a hardware binary translator to translate an instruction stream into a translated instruction stream, a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction, and a hardware execution unit to execute the decoded, single fused instruction.
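The four stages named in the abstract (translate, fuse, decode, execute) can be pictured as a simple software pipeline. The sketch below is only a toy model and not the patented hardware. The `Instr` type, the stage functions, and the placeholder pairing rule inside `fuse` are hypothetical names and assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: an opcode plus its constituent parts."""
    op: str
    parts: Tuple[str, ...] = ()


def translate(stream: List[str]) -> List[Instr]:
    # Stand-in for the binary translator: one translated instruction per input.
    return [Instr(op=name) for name in stream]


def fuse(translated: List[Instr]) -> List[Instr]:
    # Stand-in for the fusion manager. Assumption: fuse each adjacent
    # ("load", "add") pair; the actual criteria are those recited in the claims.
    out: List[Instr] = []
    i = 0
    while i < len(translated):
        cur = translated[i]
        nxt = translated[i + 1] if i + 1 < len(translated) else None
        if cur.op == "load" and nxt is not None and nxt.op == "add":
            out.append(Instr(op="fused", parts=(cur.op, nxt.op)))
            i += 2  # both source instructions are consumed by the fused one
        else:
            out.append(cur)
            i += 1
    return out


def decode(instr: Instr) -> Instr:
    # Stand-in for the decode unit: tags the instruction as decoded.
    return Instr(op="decoded:" + instr.op, parts=instr.parts)


def execute(instr: Instr) -> str:
    # Stand-in for the execution unit: reports what was executed.
    return instr.op


def run(stream: List[str]) -> List[str]:
    # Stage order follows the abstract: translate, fuse, decode, execute.
    return [execute(decode(i)) for i in fuse(translate(stream))]
```

In this model, `run(["load", "add", "store"])` executes two instructions instead of three, because the first two are decoded and executed as one fused instruction.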
Opening claim text (preview).
What is claimed is:

1. A hardware processor comprising: a hardware binary translator to translate an instruction stream into a translated instruction stream; a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, the hardware fusion manager to: detect a load instruction and an instruction that is to read a result of the load instruction in the translated instruction stream, fuse the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is determined to be a zero extending load instruction, and not fuse the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is determined to not be a zero extending load instruction; a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction; and a hardware execution unit to execute the decoded, single fused instruction.

2. The hardware processor of claim 1, wherein the hardware fusion manager is to not fuse the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction into the single fused instruction unless a later instruction that is to overwrite the result of the zero extending load instruction is detected.

3. The hardware processor of claim 1, wherein the hardware fusion manager is to not fuse the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction if the hardware fusion manager detects a control flow instruction therebetween.

4. The hardware processor of claim 1, wherein the hardware fusion manager is to: detect, in the translated instruction stream, an instruction that is to produce a result and a store instruction that is to read the result, and fuse the instruction that is to produce the result and the store instruction that is to read the result into the single fused instruction.

5. The hardware processor of claim 4, wherein the hardware fusion manager is to not fuse the instruction that is to produce the result and the store instruction that is to read the result if the hardware fusion manager detects any instruction of the translated instruction stream between the instruction that is to produce the result and the store instruction that is to read the result that is also to read the result.

6. The hardware processor of claim 4, wherein the hardware fusion manager is to not fuse the instruction that is to produce the result and the store instruction that is to read the result if the hardware fusion manager detects: any instruction of the translated instruction stream that is also to read the result between the instruction that is to produce the result and the store instruction that is to read the result, and the single fused instruction is to overwrite the result.

7. The hardware processor of claim 1, wherein the instruction stream is a stream of macro-instructions.

8. The hardware processor of claim 4, wherein the hardware fusion manager is to fuse the instruction that is to produce the result and the store instruction that is to read the result if: the hardware fusion manager detects any instruction of the translated instruction stream therebetween that is also to read the result, and the hardware fusion manager relocates the any instruction to be after the store instruction in the translated instruction stream.

9. A method comprising: translating an instruction stream into a translated instruction stream with a binary translator; fusing multiple instructions of the translated instruction stream into a single fused instruction with a fusion manager, the fusing comprising: detecting a load instruction and an instruction that is to read a result of the load instruction in the translated instruction stream, fusing the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is a zero extending load instruction, and not fusing the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is determined to not be a zero extending load instruction; decoding the single fused instruction into a decoded, single fused instruction with a hardware decode unit of a hardware processor; and executing the decoded, single fused instruction with a hardware execution unit of the hardware processor.

10. The method of claim 9, further comprising not fusing the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction into the single fused instruction unless a later instruction that is to overwrite the result of the zero extending load instruction is detected.

11. The method of claim 9, further comprising not fusing the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction if the fusion manager detects a control flow instruction therebetween.

12. The method of claim 9, wherein the fusing comprises: detecting, in the translated instruction stream, an instruction that is to produce a result and a store instruction that is to read the result, and fusing the instruction that is to produce the result and the store instruction that is to read the result into the single fused instruction.

13. The method of claim 12, further comprising not fusing the instruction that is to produce the result and the store instruction that is to read the result if the fusion manager detects any instruction of the translated instruction stream between the instruction that is to produce the result and the store instruction that is to read the result that is also to read the result.

14. The method of claim 12, further comprising not fusing the instruction that is to produce the result and the store instruction that is to read the result if the fusion manager detects: any instruction of the translated instruction stream that is also to read the result between the instruction that is to produce the result and the store instruction that is to read the result, and the single fused instruction is to overwrite the result.

15. The method of claim 9, wherein the instruction stream is a stream of macro-instructions.

16. The method of claim 12, wherein the fusing comprises fusing the instruction that is to produce the result and the store instruction that is to read the result if: the fusion manager detects any instruction of the translated instruction stream therebetween that is also to read the result, and the fusion manager relocates the any instruction to be after the store instruction in the translated instruction stream.

17. A non-transitory machine readable medium that stores code that when executed by a machine causes the machine to perform a method comprising: translating an instruction stream into a translated instruction stream with a binary translator; fusing multiple instructions of the translated instruction stream into a single fused instruction with a fusion manager, the fusing comprising: detecting a load instruction and an instruction that is to read a result of the load instruction in the translated instruction stream, fusing the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is a zero extending load inst
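The fusion conditions recited in claims 1 to 5 can be read as two yes/no checks, one for a load and its consumer and one for a producer and a store. The sketch below is one possible software reading of that claim language, not the patented hardware logic. The `Instr` fields, the register names, and the function names are hypothetical. Treating "a later instruction" in claim 2 as any instruction after the consumer is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical translated instruction (field names are assumptions)."""
    kind: str                      # "load", "alu", "store", or "branch"
    dst: Optional[str] = None      # register written, if any
    srcs: Tuple[str, ...] = ()     # registers read
    zero_extending: bool = False   # meaningful only for loads


def can_fuse_load(stream: List[Instr], load_idx: int, use_idx: int) -> bool:
    """Load + consumer check modeled on claims 1 to 3."""
    load, use = stream[load_idx], stream[use_idx]
    if load.kind != "load" or load.dst not in use.srcs:
        return False
    if not load.zero_extending:                        # claim 1
        return False
    between = stream[load_idx + 1:use_idx]
    if any(i.kind == "branch" for i in between):       # claim 3
        return False
    # Claim 2: require a later instruction that overwrites the loaded register.
    return any(i.dst == load.dst for i in stream[use_idx + 1:])


def can_fuse_store(stream: List[Instr], prod_idx: int, store_idx: int) -> bool:
    """Producer + store check modeled on claims 4 and 5."""
    prod, store = stream[prod_idx], stream[store_idx]
    if store.kind != "store" or prod.dst is None or prod.dst not in store.srcs:
        return False
    between = stream[prod_idx + 1:store_idx]
    # Claim 5: an intervening reader of the result blocks fusion.
    return not any(prod.dst in i.srcs for i in between)


# Example: zero-extending load into r1, a consumer of r1, then an overwrite of r1.
example = [
    Instr("load", "r1", ("r9",), True),
    Instr("alu", "r2", ("r1",)),
    Instr("alu", "r1", ("r3",)),
]
load_ok = can_fuse_load(example, 0, 1)
```

The relocation alternative of claim 8, where an intervening reader is moved after the store so that fusion can proceed, is left out of this sketch.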
Arrangements for executing machine instructions, e.g. instruction decode (for executing microinstructions G06F9/22) · CPC title
using decoder, e.g. decoder per instruction set, adaptable or programmable decoders · CPC title
Decoding the operand specifier, e.g. specifier format · CPC title
Runtime instruction translation, e.g. macros · CPC title
Instruction operation extension or modification · CPC title