Instruction-optimizing processor with branch-count table in hardware

US10241810B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10241810-B2
Application number: US-201213475755-A
Country: US
Kind code: B2
Filing date: May 18, 2012
Priority date: May 18, 2012
Publication date: Mar 26, 2019
Grant date: Mar 26, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processing system comprising a microprocessor core and a translator. Within the microprocessor core is arranged a hardware decoder configured to selectively decode instructions for execution in the microprocessor core, and, a logic structure configured to track usage of the hardware decoder. The translator is operatively coupled to the logic structure and configured to selectively translate the instructions for execution in the microprocessor core, based on the usage of the hardware decoder as determined by the logic structure.
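
The mechanism the abstract describes can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `ProcessingSystem`, the `threshold` parameter, and the string return values are all hypothetical, and a Python dictionary stands in for the in-core logic structure that tracks decoder usage.

```python
class ProcessingSystem:
    """Toy model: a decoder path, a usage tracker, and a translator path."""

    def __init__(self, threshold):
        # Assumed parameter: number of decodes before a block gets translated.
        self.threshold = threshold
        # Stands in for the logic structure that tracks hardware-decoder usage.
        self.decode_counts = {}
        # Blocks the (hypothetical) translator has already handled.
        self.translated = {}

    def run_block(self, block_id):
        """Execute one code block and report which path served it."""
        if block_id in self.translated:
            # Translated code runs without going through the decoder again.
            return "translated"
        # Otherwise the hardware decoder handles the block; tally the use.
        count = self.decode_counts.get(block_id, 0) + 1
        self.decode_counts[block_id] = count
        if count >= self.threshold:
            # Usage is high enough: hand the block to the translator.
            self.translated[block_id] = f"optimized({block_id})"
        return "decoded"


system = ProcessingSystem(threshold=3)
paths = [system.run_block("loop") for _ in range(5)]
print(paths)
```

In this model a frequently executed block is decoded the first few times and served from its translated form afterwards, which is the selective behavior the abstract attributes to the translator.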

First claim

Opening claim text (preview).

The invention claimed is:

1. A processing system comprising: a microprocessor core; a hardware decoder arranged within the microprocessor core and configured to selectively decode instructions for execution in the microprocessor core; a logic structure arranged within the microprocessor core and configured to track usage of the hardware decoder, wherein: the logic structure comprises a table comprising a plurality of registers, each respective register pre-loaded with an initial count value decremented each time a respective non-native code block has been uniquely decoded by the hardware decoder, the pre-loaded initial count value dependent on the respective non-native code block; and a first register in the plurality of registers is pre-loaded with a first initial count value and a second register in the plurality of registers is pre-loaded with a second initial count value, where the first initial count value and the second initial count value are different; and a translator operatively coupled to the logic structure and configured to selectively translate the respective non-native code block for execution in the microprocessor core when the respective register is decremented to zero.

2. The processing system of claim 1 wherein the translator comprises a dynamic binary translator.

3. The processing system of claim 1 wherein the translator is further configured to enact one or more of: selectively optimizing the instructions for speed of execution; selectively renaming a register of the microprocessor core; and selectively resequencing the instructions.

4. The processing system of claim 1 further comprising an execution unit configured to execute the instructions as translated by the translator and to execute the instructions as decoded by the hardware decoder, and wherein the instructions, when translated by the translator, are executed in the execution unit without further processing by the hardware decoder.

5. The processing system of claim 1 wherein each of the respective registers comprises an n-bit binary counter configured to decrement a value of the respective register upon being read.

6. The processing system of claim 1 wherein a condition for invoking the translator includes underflow, overflow, or nulling of the value in the register.

7. The processing system of claim 1 wherein the instructions comprise a code block starting at a branch-target address, and wherein each of the respective registers is addressable through one or more hashed forms of the branch-target address.

8. The processing system of claim 7 wherein each of the respective registers is addressable for reading and writing through different hashed forms of the branch-target address.

9. The processing system of claim 1 wherein each of the respective registers is addressable through hashed forms of branch-target addresses of a corresponding plurality of code blocks.

10. The processing system of claim 9 wherein the microprocessor core comprises a control register to specify whether the logic structure is able to invoke the translator.

11. The processing system of claim 9 wherein the microprocessor core comprises a control register to specify whether all registers in the logic structure are to be invalidated and restored to their respective initial count values.

12. In a processing system having a microprocessor core, a hardware decoder arranged within the microprocessor core, and a translator, a method comprising: with the hardware decoder, decoding instructions for execution in the microprocessor core; in a logic structure arranged within the microprocessor core, tallying how many times the hardware decoder has decoded the instructions, wherein: the logic structure comprises a table comprising a plurality of registers, each respective register pre-loaded with an initial count value decremented each time a respective non-native code block has been uniquely decoded by the hardware decoder, the pre-loaded initial count value dependent on the respective non-native code block; and a first register in the plurality of registers is pre-loaded with a first initial count value and a second register in the plurality of registers is pre-loaded with a second initial count value, where the first initial count value and the second initial count value are different; translating and optimizing the respective non-native code block for execution in the microprocessor core when the respective register is decremented to zero; and storing the instructions as translated in a trace cache for execution by the processing system.

13. The method of claim 12 further comprising executing the instructions as translated without further processing by the hardware decoder.

14. The method of claim 12 wherein each of the non-native code blocks begin at a respective branch-target address and wherein each of the respective registers is addressable through one or more hashed forms of the respective branch-target address.

15. The method of claim 14 further comprising hashing the respective branch-target address to obtain an address for writing to the respective register.

16. The method of claim 14 further comprising decrementing a value of the respective register upon reading the respective register.

17. The method of claim 14 further comprising hashing the respective branch-target address to obtain an address for reading the respective register.

18. In a processing system having a microprocessor core, a hardware decoder arranged within the microprocessor core, and a translator, a method comprising: with the hardware decoder, decoding a non-native block of instruction code for execution in the microprocessor core, the non-native block of instruction code associated with a respective branch-target address; in a logic structure arranged within the microprocessor core, tallying a number of times that the hardware decoder decodes the non-native block of instruction code, the logic structure comprising a table comprising a plurality of registers addressable through one or more hashed forms of the branch-target address, wherein: each of the registers pre-loaded with an initial count value decremented each time the hardware decoder uniquely decodes a respective non-native block of instruction code, the pre-loaded initial count value dependent on the non-native code block of instruction code; and a first register in the plurality of registers is pre-loaded with a first initial count value and a second register in the plurality of registers is pre-loaded with a second initial count value, where the first initial count value and the second initial count value are different; raising an interrupt in the microprocessor when any of the plurality of registers underflows or holds a zero; in response to the interrupt being raised, translating and optimizing a respective non-native block of instruction code for execution via the translator; and storing the respective non-native block of instruction code, as translated, in a trace cache for subsequent execution by the processing system.
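
To make the claimed table concrete, the following software model sketches a branch-count table whose registers are addressed by a hash of the branch-target address, pre-loaded with per-block initial counts, decremented on each decode, and used to trigger translation into a trace cache when a register reaches zero. Everything here is an illustrative assumption rather than the patent's implementation: the names `BranchCountTable`, `preload`, `on_decode`, and `invalidate_all`, the XOR-fold hash, the table size, and the choice to restore a register to its initial count after translation are all hypothetical.

```python
class BranchCountTable:
    """Software stand-in for a hardware table of count-down registers."""

    def __init__(self, num_registers=16, default_count=4):
        self.num_registers = num_registers
        # Initial count per register; blocks may be given different values.
        self.initial = [default_count] * num_registers
        self.counters = list(self.initial)
        # Models a control-register bit that lets the table invoke the translator.
        self.translate_enabled = True
        # Translated blocks, keyed by branch-target address.
        self.trace_cache = {}

    def index(self, branch_target):
        # Hypothetical hash: fold higher address bits into the low bits.
        return (branch_target ^ (branch_target >> 4)) % self.num_registers

    def preload(self, branch_target, initial_count):
        """Pre-load the register for a block with a block-specific count."""
        i = self.index(branch_target)
        self.initial[i] = initial_count
        self.counters[i] = initial_count

    def invalidate_all(self):
        """Restore every register to its initial count value."""
        self.counters = list(self.initial)

    def on_decode(self, branch_target):
        """Record one decode of the block starting at branch_target."""
        if branch_target in self.trace_cache:
            return "trace-cache"          # translated code bypasses the decoder
        i = self.index(branch_target)
        self.counters[i] = max(self.counters[i] - 1, 0)
        if self.counters[i] == 0 and self.translate_enabled:
            # Stands in for raising an interrupt and running the translator.
            self.trace_cache[branch_target] = f"translated@{branch_target:#x}"
            # Assumption: the register is re-armed for later blocks that alias to it.
            self.counters[i] = self.initial[i]
            return "decoded+translated"
        return "decoded"


table = BranchCountTable()
table.preload(0x1000, 2)   # first register: small initial count
table.preload(0x2004, 5)   # second register: different initial count
for _ in range(3):
    print(table.on_decode(0x1000))
```

Because several branch-target addresses can hash to the same register, a model like this can translate a block earlier than its own execution count alone would suggest; the sketch does not attempt to resolve such aliasing.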

Assignees

Inventors

Classifications

  • Runtime code conversion or optimisation · CPC title

  • for non-native instruction set, e.g. Javabyte, legacy code · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10241810B2 cover?
A processing system comprising a microprocessor core and a translator. Within the microprocessor core is arranged a hardware decoder configured to selectively decode instructions for execution in the microprocessor core, and, a logic structure configured to track usage of the hardware decoder. The translator is operatively coupled to the logic structure and configured to selectively translate t…
Who is the assignee on this patent?
Brauch Rupert, Swarna Madhu, Segelken Ross, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06F9/45516. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 26, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).