Hardware apparatuses and methods to fuse instructions

US10324724B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10324724-B2
Application number: US-201514971904-A
Country: US
Kind code: B2
Filing date: Dec 16, 2015
Priority date: Dec 16, 2015
Publication date: Jun 18, 2019
Grant date: Jun 18, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatuses relating to a fusion manager to fuse instructions are described. In one embodiment, a hardware processor includes a hardware binary translator to translate an instruction stream into a translated instruction stream, a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction, and a hardware execution unit to execute the decoded, single fused instruction.
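The four stages named in the abstract (translate, fuse, decode, execute) can be pictured as a simple pipeline. The following is a minimal software sketch only, not the patented hardware: instructions are modeled as plain strings, and every function name here (`translate`, `fuse`, `decode`, `execute`, `run_pipeline`) and the `can_fuse` predicate are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical toy model: an instruction is just a string label.
Stream = List[str]


def translate(stream: Stream) -> Stream:
    # Stand-in for the binary translator: mark each instruction as translated.
    return [f"t:{ins}" for ins in stream]


def fuse(stream: Stream, can_fuse: Callable[[str, str], bool]) -> Stream:
    # Stand-in for the fusion manager: merge each adjacent pair that the
    # caller-supplied predicate accepts into one fused instruction.
    out: Stream = []
    i = 0
    while i < len(stream):
        if i + 1 < len(stream) and can_fuse(stream[i], stream[i + 1]):
            out.append(f"fused({stream[i]}+{stream[i + 1]})")
            i += 2  # both source instructions are consumed
        else:
            out.append(stream[i])
            i += 1
    return out


def decode(stream: Stream) -> List[Tuple[str, str]]:
    # Stand-in for the decode unit: wrap each instruction in a decoded form.
    return [("decoded", ins) for ins in stream]


def execute(decoded: List[Tuple[str, str]]) -> Stream:
    # Stand-in for the execution unit: report what was executed, in order.
    return [ins for _, ins in decoded]


def run_pipeline(stream: Stream, can_fuse: Callable[[str, str], bool]) -> Stream:
    return execute(decode(fuse(translate(stream), can_fuse)))
```

With a predicate that accepts only a translated load followed by a translated add, a three-instruction stream reaches the execute stage as two instructions, which is the point of fusing before decode.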

First claim

Opening claim text (preview).

What is claimed is:

1. A hardware processor comprising: a hardware binary translator to translate an instruction stream into a translated instruction stream; a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, the hardware fusion manager to: detect a load instruction and an instruction that is to read a result of the load instruction in the translated instruction stream, fuse the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is determined to be a zero extending load instruction, and not fuse the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is determined to not be a zero extending load instruction; a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction; and a hardware execution unit to execute the decoded, single fused instruction.

2. The hardware processor of claim 1, wherein the hardware fusion manager is to not fuse the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction into the single fused instruction unless a later instruction that is to overwrite the result of the zero extending load instruction is detected.

3. The hardware processor of claim 1, wherein the hardware fusion manager is to not fuse the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction if the hardware fusion manager detects a control flow instruction therebetween.

4. The hardware processor of claim 1, wherein the hardware fusion manager is to: detect, in the translated instruction stream, an instruction that is to produce a result and a store instruction that is to read the result, and fuse the instruction that is to produce the result and the store instruction that is to read the result into the single fused instruction.

5. The hardware processor of claim 4, wherein the hardware fusion manager is to not fuse the instruction that is to produce the result and the store instruction that is to read the result if the hardware fusion manager detects any instruction of the translated instruction stream between the instruction that is to produce the result and the store instruction that is to read the result that is also to read the result.

6. The hardware processor of claim 4, wherein the hardware fusion manager is to not fuse the instruction that is to produce the result and the store instruction that is to read the result if the hardware fusion manager detects: any instruction of the translated instruction stream that is also to read the result between the instruction that is to produce the result and the store instruction that is to read the result, and the single fused instruction is to overwrite the result.

7. The hardware processor of claim 1, wherein the instruction stream is a stream of macro-instructions.

8. The hardware processor of claim 4, wherein the hardware fusion manager is to fuse the instruction that is to produce the result and the store instruction that is to read the result if: the hardware fusion manager detects any instruction of the translated instruction stream therebetween that is also to read the result, and the hardware fusion manager relocates the any instruction to be after the store instruction in the translated instruction stream.

9. A method comprising: translating an instruction stream into a translated instruction stream with a binary translator; fusing multiple instructions of the translated instruction stream into a single fused instruction with a fusion manager, the fusing comprising: detecting a load instruction and an instruction that is to read a result of the load instruction in the translated instruction stream, fusing the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is a zero extending load instruction, and not fusing the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is determined to not be a zero extending load instruction; decoding the single fused instruction into a decoded, single fused instruction with a hardware decode unit of a hardware processor; and executing the decoded, single fused instruction with a hardware execution unit of the hardware processor.

10. The method of claim 9, further comprising not fusing the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction into the single fused instruction unless a later instruction that is to overwrite the result of the zero extending load instruction is detected.

11. The method of claim 9, further comprising not fusing the zero extending load instruction and the instruction that is to read the result of the zero extending load instruction if the fusion manager detects a control flow instruction therebetween.

12. The method of claim 9, wherein the fusing comprises: detecting, in the translated instruction stream, an instruction that is to produce a result and a store instruction that is to read the result, and fusing the instruction that is to produce the result and the store instruction that is to read the result into the single fused instruction.

13. The method of claim 12, further comprising not fusing the instruction that is to produce the result and the store instruction that is to read the result if the fusion manager detects any instruction of the translated instruction stream between the instruction that is to produce the result and the store instruction that is to read the result that is also to read the result.

14. The method of claim 12, further comprising not fusing the instruction that is to produce the result and the store instruction that is to read the result if the fusion manager detects: any instruction of the translated instruction stream that is also to read the result between the instruction that is to produce the result and the store instruction that is to read the result, and the single fused instruction is to overwrite the result.

15. The method of claim 9, wherein the instruction stream is a stream of macro-instructions.

16. The method of claim 12, wherein the fusing comprises fusing the instruction that is to produce the result and the store instruction that is to read the result if: the fusion manager detects any instruction of the translated instruction stream therebetween that is also to read the result, and the fusion manager relocates the any instruction to be after the store instruction in the translated instruction stream.

17. A non-transitory machine readable medium that stores code that when executed by a machine causes the machine to perform a method comprising: translating an instruction stream into a translated instruction stream with a binary translator; fusing multiple instructions of the translated instruction stream into a single fused instruction with a fusion manager, the fusing comprising: detecting a load instruction and an instruction that is to read a result of the load instruction in the translated instruction stream, fusing the load instruction and the instruction that is to read the result of the load instruction into the single fused instruction when the load instruction is a zero extending load inst
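The fusion conditions recited in the claims can be read as a small set of checks over the translated stream. The sketch below is one hedged interpretation in software, assuming a toy `Instr` record with register names; the type, its fields, and the functions `can_fuse_load` and `fuse_producer_store` are all hypothetical and are not taken from the patent. It covers the zero-extending-load condition (claims 1 and 9), the control-flow condition (claims 3 and 11), and the two alternative treatments of an intervening reader before a store (claims 5 and 13 versus claims 8 and 16). It ignores every other hazard a real fusion manager would have to consider.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    # Hypothetical toy instruction record (not from the patent).
    op: str                        # e.g. "load", "add", "store", "branch"
    dst: Optional[str] = None      # register written, if any
    srcs: Tuple[str, ...] = ()     # registers read
    zero_extending: bool = False   # only meaningful for loads
    control_flow: bool = False     # True for branches, jumps, etc.


def can_fuse_load(stream: List[Instr], i: int, j: int) -> bool:
    """Check whether the load at index i may fuse with its reader at index j."""
    load, reader = stream[i], stream[j]
    if load.op != "load" or load.dst is None or load.dst not in reader.srcs:
        return False
    if not load.zero_extending:
        return False  # claims 1 and 9: only zero extending loads are fused
    if any(ins.control_flow for ins in stream[i + 1:j]):
        return False  # claims 3 and 11: a control flow instruction lies between
    return True


def fuse_producer_store(stream: List[Instr], i: int, j: int,
                        relocate: bool = False) -> Optional[List[Instr]]:
    """Fuse the producer at i with the store at j; return the new stream or None."""
    producer, store = stream[i], stream[j]
    if store.op != "store" or producer.dst is None or producer.dst not in store.srcs:
        return None
    between = stream[i + 1:j]
    readers = [ins for ins in between if producer.dst in ins.srcs]
    if readers and not relocate:
        return None  # claims 5 and 13: an intervening reader blocks fusion
    others = [ins for ins in between if producer.dst not in ins.srcs]
    fused = Instr(
        op=f"{producer.op}+{store.op}",
        dst=producer.dst,
        srcs=producer.srcs + tuple(s for s in store.srcs if s != producer.dst),
    )
    # Claims 8 and 16: intervening readers are moved after the store.
    # Assumption of this sketch: the fused instruction sits at the store's position.
    return stream[:i] + others + [fused] + readers + stream[j + 1:]
```

For example, with an `add` producing `r1`, a `sub` reading `r1`, and a `store` reading `r1`, the call returns `None` by default and returns a two-instruction stream (`add+store`, then `sub`) when `relocate=True`.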

Assignees

Intel Corp

Inventors

Classifications

  • Arrangements for executing machine instructions, e.g. instruction decode (for executing microinstructions G06F9/22) · CPC title

  • using decoder, e.g. decoder per instruction set, adaptable or programmable decoders · CPC title

  • Decoding the operand specifier, e.g. specifier format · CPC title

  • G06F9/3017 · Primary

    Runtime instruction translation, e.g. macros · CPC title

  • Instruction operation extension or modification · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10324724B2 cover?
Methods and apparatuses relating to a fusion manager to fuse instructions are described. In one embodiment, a hardware processor includes a hardware binary translator to translate an instruction stream into a translated instruction stream, a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, a hardware decode unit to decod…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/3017. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 18, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).