Start virtual execution instruction for dispatching multiple threads in a computer
US-9223574-B2 · Dec 29, 2015 · US
US10409763B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10409763-B2 |
| Application number | US-201414319265-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 30, 2014 |
| Priority date | Jun 30, 2014 |
| Publication date | Sep 10, 2019 |
| Grant date | Sep 10, 2019 |
Official abstract text for this publication.
Various different embodiments of the invention are described including: (1) a method and apparatus for intelligently allocating threads within a binary translation system; (2) data cache way prediction guided by binary translation code morphing software; (3) fast interpreter hardware support on the data-side; (4) out-of-order retirement; (5) decoupled load retirement in an atomic OOO processor; (6) handling transactional and atomic memory in an out-of-order binary translation based processor; and (7) speculative memory management in a binary translation based out of order processor.
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a processor having a plurality of cores to execute a binary translation system comprising a plurality of components; thread scheduling logic to schedule the plurality of components to ensure that the plurality of components execute together on a single simultaneous multi-threaded (SMT) core, wherein the thread scheduling logic is to schedule a first code fragment for binary translation substantially concurrently with the scheduling of a second code fragment for execution through a translation-execution thread, wherein the second code fragment was translated through a binary translator thread or interpreted through an interpreter thread, wherein the first code segment and the translated or interpreted second code fragment are stored in a common cache slice, wherein each of the binary translator thread, the interpreter thread, and the translation-execution thread is executed on a separate logical core of the single SMT core, and the execution of the plurality of components together in the single SMT core is performed based on setting a processor affinity field of the binary translator thread, the interpreter thread, and the translation-execution thread, and wherein one or more translation lookaside buffer (TLB) entries or translation T-bit-agent (XTBA) entries are shared by two or more of the binary translator thread, the interpreter thread, and the translation-execution thread.
2. The apparatus as in claim 1 wherein the binary translator thread, the interpreter thread, and the translation-execution thread are to be scheduled on the single SMT core and wherein common source code streams associated with successive binary translator, interpreter, and translation-execution phases of a binary translation system index to a common set of physical memory pages.
3. The apparatus as in claim 1 wherein the binary translator thread, the interpreter thread, and the translation-execution thread are to be preempted and/or context-switched according to thread scheduling activity implemented by the thread scheduling logic.
4. The apparatus as in claim 1 wherein the binary translation system comprises a code morphing system (CMS).
5. The apparatus as in claim 1, wherein the thread scheduling logic is further to schedule the plurality of components to ensure that the plurality of components execute together on multiple cores sharing a common mid-level cache (MLC).
6. The apparatus as in claim 1, wherein the first code fragment and the second code fragment are consecutive fragments of instructions.
7. The apparatus as in claim 1, wherein the binary translator thread and the interpreter thread run on physical addresses without using virtual and physical address translation.
8. The apparatus as in claim 1, wherein the binary translator thread, the interpreter thread, and the translation-execution thread share a single pre-fetcher.
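Claim 1 places the binary translator thread, the interpreter thread, and the translation-execution thread on separate logical cores of a single SMT core by setting a processor affinity field for each thread. The following is a minimal software sketch of that placement decision, not the claimed hardware logic. The function name `plan_affinity`, the component names, and the example topology are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical labels for the three threads named in claim 1.
COMPONENTS = ["binary_translator", "interpreter", "translation_execution"]


def plan_affinity(
    siblings_by_core: Dict[int, List[int]],
    components: List[str] = COMPONENTS,
) -> Dict[str, int]:
    """Choose one physical core and give each component its own logical CPU.

    siblings_by_core maps a physical core id to the logical CPU ids
    (hardware threads) that belong to it. This mapping is an assumed
    input; a real system would read it from the platform.
    """
    for core_id in sorted(siblings_by_core):
        logical_cpus = sorted(siblings_by_core[core_id])
        # The core qualifies only if every component gets a distinct logical CPU.
        if len(logical_cpus) >= len(components):
            return dict(zip(components, logical_cpus))
    raise ValueError("no single SMT core has enough logical CPUs")


# Assumed topology: core 0 has two hardware threads, core 1 has four.
topology = {0: [0, 4], 1: [1, 5, 9, 13]}
plan = plan_affinity(topology)
print(plan)
```

With this assumed topology, core 0 is skipped because it exposes only two logical CPUs, and all three components land on core 1. On Linux, the resulting logical CPU numbers could then be handed to an affinity call such as `os.sched_setaffinity`; that step is left out here because it is platform-specific.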
for non-native instruction set, e.g. Javabyte, legacy code · CPC title
Value prediction for operands; operand history buffers · CPC title
Involving translation to a different instruction set architecture, e.g. just-in-time translation in a JVM · CPC title
Multiprogramming arrangements · CPC title
Operand accessing · CPC title