US9690591B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9690591-B2 |
| Application number | US-29039508-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 30, 2008 |
| Priority date | Oct 30, 2008 |
| Publication date | Jun 27, 2017 |
| Grant date | Jun 27, 2017 |
Official abstract text for this publication.
A technique to enable efficient instruction fusion within a computer system is disclosed. In one embodiment, processor logic delays the processing of a first instruction for a threshold amount of time if the first instruction within an instruction queue is fusible with a second instruction.
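The delay decision summarized in the abstract can be pictured as a small predicate. This is a minimal sketch only, not the patented logic: the function name `should_hold`, its parameters, and the default threshold of two cycles (the value recited in claim 15) are assumptions chosen for illustration.

```python
def should_hold(first_is_fusible: bool, partner_in_queue: bool,
                cycles_waited: int, threshold: int = 2) -> bool:
    """Return True if the first instruction should be held for one more cycle.

    Hypothetical helper: hold only while the instruction could still be fused,
    its partner has not yet arrived, and the wait budget is not used up.
    """
    if not first_is_fusible:
        return False      # nothing to gain by waiting
    if partner_in_queue:
        return False      # partner already present: no further delay needed
    return cycles_waited < threshold
```

The point of the threshold is to bound the cost: a fusible instruction with no partner in sight is delayed by at most `threshold` cycles before it is processed on its own.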
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: an instruction queue (IQ); and logic to delay processing of a first fusible instruction for a threshold amount of time, such that a second fusible instruction, fusible with the first fusible instruction, may be fused with the first fusible instruction if the second fusible instruction is stored within the IQ within the threshold amount of time, wherein the logic includes a counter to be incremented once for each cycle after the first fusible instruction is stored in the IQ and is the last instruction in the IQ until a threshold number of cycles corresponding to the threshold amount of time is reached.
2. The apparatus of claim 1, wherein the first fusible instruction and the second fusible instruction are stored across a fetch boundary prior to being stored in the IQ.
3. The apparatus of claim 1, wherein the logic is to delay processing of the first fusible instruction only if the first fusible instruction is the last instruction stored in the IQ.
4. A method comprising: determining whether a currently accessed instruction within an instruction queue (IQ) is fusible with any subsequent instruction to be stored in the IQ; accessing a next instruction from the IQ and resetting a delay counter if it is determined that said currently accessed instruction is not fusible with any subsequent instruction to be stored in the IQ; and incrementing the delay counter if it is determined that said currently accessed instruction is fusible and if said currently accessed instruction is the last instruction in the IQ.
5. The method of claim 4, further comprising fusing the currently accessed instruction with a given subsequent instruction if the currently accessed instruction and said given subsequent instruction are fusible and the delay counter has not reached a threshold value.
6. The method of claim 5, further comprising processing said currently accessed instruction separately from said given subsequent instruction if said currently accessed instruction and said given subsequent instruction are not fusible.
7. The method of claim 5, further comprising processing said currently accessed instruction separately from said given subsequent instruction if the delay counter has reached the threshold value.
8. A system comprising: a storage to store a first and second fusible instruction within a first and second access boundary, respectively; a processor having fetch logic to fetch the first and second fusible instructions into an instruction queue (IQ); a delay logic circuit to delay reading of the first fusible instruction from the IQ for a threshold amount of cycles; and an instruction fusion logic circuit to fuse the first and second fusible instructions if the second fusible instruction is stored in the IQ after the first fusible instruction and before the threshold amount of cycles has been reached, wherein the first fusible instruction is a compare or test (CMP/TEST) instruction and the second fusible instruction is a conditional jump (JCC) instruction.
9. A system comprising: a storage to store a first and second fusible instruction within a first and second access boundary, respectively; a processor having fetch logic to fetch the first and second fusible instructions into an instruction queue (IQ); a counter to delay reading of the first fusible instruction from the IQ for a threshold amount of cycles, the counter to increment if the first fusible instruction is the only instruction in the IQ and to stop counting when the threshold amount of cycles has been reached; and an instruction fusion logic circuit to fuse the first and second fusible instructions if the second fusible instruction is stored in the IQ after the first fusible instruction and before the threshold amount of cycles has been reached.
10. The system of claim 9, wherein the counter is to reset if the second fusible instruction is stored in the IQ before the threshold amount of cycles has been reached.
11. The system of claim 9, wherein the fusion logic circuit is to fuse the first and second fusible instructions before the threshold amount of cycles has been reached.
12. The system of claim 9, wherein the storage includes an instruction cache and the first and second boundary sizes are each 64 bytes.
13. The system of claim 9, wherein the storage includes a dynamic random-access memory and the first and second boundary sizes are each 4096 bytes.
14. The system of claim 9, wherein the first fusible instruction is a compare or test (CMP/TEST) instruction and the second fusible instruction is a conditional jump (JCC) instruction.
15. The system of claim 14, wherein the threshold amount of cycles is two.
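To make the method of claims 4 to 7 concrete, the following is a rough cycle-by-cycle simulation of an instruction queue with a delay counter. It is a sketch under stated assumptions, not the claimed hardware: the `simulate` function, the `arrivals` input format, the one-read-per-cycle model, and the order in which the checks are made (partner present first, counter second) are all hypothetical choices for the example. The CMP/TEST and JCC pairing follows claims 8 and 14.

```python
from collections import deque

FUSIBLE_FIRST = {"CMP", "TEST"}   # first half of a fusible pair (claims 8, 14)
FUSIBLE_SECOND = "JCC"            # second half: conditional jump

def simulate(arrivals, threshold=2):
    """Simulate reads from an instruction queue (IQ), one read per cycle.

    arrivals: dict mapping cycle number -> list of opcodes stored in the IQ
              during that cycle (hypothetical input format).
    Returns a list of (cycle, opcodes) tuples; a 2-tuple of opcodes is a fused pair.
    """
    iq = deque()
    issued = []
    counter = 0                                   # the delay counter
    last_arrival = max(arrivals, default=-1)
    cycle = 0
    while cycle <= last_arrival or iq:
        iq.extend(arrivals.get(cycle, []))
        if iq:
            head = iq[0]
            if head not in FUSIBLE_FIRST:
                # Not fusible with anything that follows: read it, reset counter.
                issued.append((cycle, (iq.popleft(),)))
                counter = 0
            elif len(iq) >= 2:
                if iq[1] == FUSIBLE_SECOND:
                    # Partner arrived in time: fuse the pair.
                    first, second = iq.popleft(), iq.popleft()
                    issued.append((cycle, (first, second)))
                else:
                    # Next instruction is not a partner: process separately.
                    issued.append((cycle, (iq.popleft(),)))
                counter = 0
            elif counter >= threshold:
                # Waited long enough with no partner: process separately.
                issued.append((cycle, (iq.popleft(),)))
                counter = 0
            else:
                # Fusible and last in the IQ: hold it and count the cycle.
                counter += 1
        cycle += 1
    return issued
```

With a CMP stored in cycle 0 and its JCC stored one cycle later (for example because the pair straddles a fetch boundary), `simulate({0: ["CMP"], 1: ["JCC"]}, threshold=2)` returns `[(1, ("CMP", "JCC"))]`, a fused pair. With `threshold=0` the same input returns `[(0, ("CMP",)), (1, ("JCC",))]`: the CMP is read immediately and the fusion opportunity is lost.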
with dedicated cache, e.g. instruction or stack · CPC title
of compound instructions · CPC title
Details of cache specific to multiprocessor cache arrangements · CPC title
Instruction code · CPC title
Decoding the operand specifier, e.g. specifier format · CPC title