Method and apparatus for providing hardware support for self-modifying code
US-2015324213-A1 · Nov 12, 2015 · US
US9891915B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9891915-B2 |
| Application number | US-201414281663-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 19, 2014 |
| Priority date | Mar 15, 2013 |
| Publication date | Feb 13, 2018 |
| Grant date | Feb 13, 2018 |
Official abstract text for this publication.
A microprocessor implemented method for resolving dependencies for a load instruction in a load store queue (LSQ) is disclosed. The method comprises initiating a computation of a virtual address corresponding to the load instruction in a first clock cycle. It also comprises transmitting early calculated lower address bits of the virtual address to a load store queue (LSQ) in the same cycle as the initiating. Finally, it comprises performing a partial match in the LSQ responsive to and using the lower address bits to find a prior aliasing store, wherein the prior aliasing store stores to a same address as the load instruction.
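The partial match described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the 12-bit width, the base-plus-offset address form, and the names `StoreEntry`, `early_low_bits`, and `partial_match` are all hypothetical. It relies on one arithmetic fact: the low bits of a sum depend only on the low bits of the operands, so they can be ready before the upper bits.

```python
from dataclasses import dataclass
from typing import List

LOW_BITS = 12  # assumed width of the early lower address bits; the abstract gives no number


@dataclass
class StoreEntry:
    age: int      # program order, smaller means older (hypothetical field)
    address: int  # full virtual address written by the store
    data: int


def early_low_bits(base: int, offset: int, width: int = LOW_BITS) -> int:
    """Lower bits of base + offset, computed from the lower bits of the operands only."""
    mask = (1 << width) - 1
    return ((base & mask) + (offset & mask)) & mask


def partial_match(lsq: List[StoreEntry], load_age: int, load_low: int,
                  width: int = LOW_BITS) -> List[StoreEntry]:
    """Return older stores whose lower address bits equal the load's lower bits."""
    mask = (1 << width) - 1
    return [s for s in lsq if s.age < load_age and (s.address & mask) == load_low]


# Usage: a load of age 3 to base 0x1000 + offset 0x234.
lsq = [
    StoreEntry(age=0, address=0x1234, data=1),
    StoreEntry(age=1, address=0x5234, data=2),  # same low bits, different upper bits
    StoreEntry(age=2, address=0x1238, data=3),  # different low bits
    StoreEntry(age=5, address=0x1234, data=4),  # younger than the load, ignored
]
low = early_low_bits(0x1000, 0x234)
candidates = partial_match(lsq, load_age=3, load_low=low)
```

In this model the store to `0x5234` is a false positive: it matches on the lower bits but does not alias the load, which is why a partial match alone cannot settle the dependency.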
Opening claim text (preview).
What is claimed is:
1. A microprocessor implemented method for resolving dependencies for a load instruction in a load store queue (LSQ), said method comprising: initiating a computation of a virtual address corresponding to said load instruction in a first clock cycle; transmitting early calculated lower address bits of said virtual address to a load store queue (LSQ) in said first clock cycle, wherein said early calculated lower address bits are computed earlier and faster than upper bits of said virtual address to allow for earlier access times in a pipeline of said microprocessor; performing a partial match in said LSQ responsive to and using said lower address bits to find a prior aliasing store, wherein said prior aliasing store stores to a same address as said load instruction; and responsive to a determination that there is a partial match in said LSQ, storing a set of partially matched entries from said LSQ in a memory and performing a look-up on said set of partially matched entries in a second clock cycle.
2. The method of claim 1 further comprising: performing a prediction that said load instruction has a prior aliasing store in said LSQ; and responsive to a determination that there is no partial match in said LSQ and no prediction is available, retrieving data corresponding to said load instruction from a data cache memory in said second clock cycle.
3. The method of claim 2, wherein said prediction is based on prior instances of store-to-load forwarding for said load instruction.
4. The method of claim 2, further comprising: responsive to said determination that there is a partial match in said LSQ, waiting for said virtual address to fully compute to perform said look-up on said set of partially matched entries in said second clock cycle.
5. The method of claim 1, further comprising: routing said lower address bits to said LSQ using a higher metal route relative to other bits of said virtual address.
6. The method of claim 1, wherein said performing and said initiating are performed in a same cycle.
7. A processor unit configured to perform operations for resolving dependencies for a load instruction in a load store queue (LSQ), said operations comprising: initiating a computation of a virtual address corresponding to said load instruction in a first clock cycle; transmitting early calculated lower address bits of said virtual address to a load store queue (LSQ) in said first clock cycle, wherein said early calculated lower address bits are computed earlier and faster than upper bits of said virtual address to allow for earlier access times in a pipeline of said processor; performing a partial match in said LSQ responsive to and using said lower address bits to find a prior aliasing store, wherein said prior aliasing store stores to a same address as said load instruction; and responsive to a determination that there is a partial match in said LSQ, storing a set of partially matched entries from said LSQ in a memory and performing a look-up on said set of partially matched entries in a second clock cycle.
8. The processor unit of claim 7, wherein said operations further comprise: performing a prediction that said load instruction has a prior aliasing store in said LSQ; and responsive to a determination that there is no partial match in said LSQ and no prediction is available, retrieving data corresponding to said load instruction from a data cache memory in said second clock cycle.
9. The processor unit of claim 8, wherein said prediction is based on prior instances of store-to-load forwarding for said load instruction.
10. The processor unit of claim 8, wherein said operations further comprise: responsive to said determination that there is a partial match in said LSQ, waiting for said virtual address to fully compute to perform said look-up on said set of partially matched entries in said second clock cycle.
11. The processor unit of claim 7, wherein said operations further comprise: routing said lower address bits to said LSQ using a higher metal route relative to other bits of said virtual address.
12. The processor unit of claim 7, wherein said performing and said initiating are performed in a same cycle.
13. An apparatus configured to resolve dependencies for a load instruction in a load store queue (LSQ), said apparatus comprising: a memory; a processor communicatively coupled to said memory, wherein said processor is configured to process instructions out of order, and further wherein said processor is configured to perform operations comprising: initiating a computation of a virtual address corresponding to said load instruction in a first clock cycle; transmitting early calculated lower address bits of said virtual address to a load store queue (LSQ) in said first clock cycle, wherein said early calculated lower address bits are computed earlier and faster than upper bits of said virtual address to allow for earlier access times in a pipeline of said processor; performing a partial match in said LSQ responsive to and using said lower address bits to find a prior aliasing store, wherein said prior aliasing store stores to a same address as said load instruction; and responsive to a determination that there is a partial match in said LSQ, storing a set of partially matched entries from said LSQ in a storage and performing a look-up on said set of partially matched entries in a second clock cycle.
14. The apparatus of claim 13, wherein said operations further comprise: performing a prediction that said load instruction has a prior aliasing store in said LSQ; and responsive to a determination that there is no partial match in said LSQ and no prediction is available, retrieving data corresponding to said load instruction from a data cache memory in said second clock cycle.
15. The apparatus of claim 14, wherein said prediction is based on prior instances of store-to-load forwarding for said load instruction.
16. The apparatus of claim 14, wherein said operations further comprise: responsive to said determination that there is a partial match in said LSQ, waiting for said virtual address to fully compute to perform said look-up on said set of partially matched entries in said second clock cycle.
17. The apparatus of claim 13, wherein said operations further comprise: routing said lower address bits to said LSQ using a higher metal route relative to other bits of said virtual address.
18. The apparatus of claim 13, wherein said performing and said initiating are performed in a same cycle.
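The two-cycle flow recited in claims 1, 2, and 4 can be sketched as a behavioral model. This is an illustration under stated assumptions, not the claimed hardware: the 12-bit width, the base-plus-offset address form, the dictionary standing in for the data cache, the choice of the youngest matching store, and the names `Store` and `resolve_load` are all hypothetical. Two branches that the claims do not spell out are marked as assumptions in the comments.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

LOW_BITS = 12  # assumed width of the early lower address bits


@dataclass
class Store:
    age: int      # program order, smaller means older (hypothetical field)
    address: int  # full virtual address written by the store
    data: int


def resolve_load(lsq: List[Store], dcache: Dict[int, int], load_age: int,
                 base: int, offset: int,
                 prediction_available: bool = False) -> Tuple[str, Optional[int]]:
    mask = (1 << LOW_BITS) - 1

    # First cycle: address computation starts and the lower bits reach the LSQ.
    load_low = ((base & mask) + (offset & mask)) & mask
    # The partially matched entries are saved for the next cycle.
    candidates = [s for s in lsq
                  if s.age < load_age and (s.address & mask) == load_low]

    # Second cycle: the full virtual address is now available.
    full_address = base + offset

    if candidates:
        # Look-up restricted to the partially matched entries.
        hits = [s for s in candidates if s.address == full_address]
        if hits:
            # Assumption: forward from the youngest older aliasing store.
            youngest = max(hits, key=lambda s: s.age)
            return ("forward", youngest.data)
        # Assumption: a false partial match falls back to the data cache.
        return ("dcache", dcache.get(full_address))

    if not prediction_available:
        # No partial match and no prediction: read the data cache.
        return ("dcache", dcache.get(full_address))

    # No partial match but a prediction exists: the claims quoted here do not
    # say what happens, so the sketch leaves this case unresolved.
    return ("unspecified", None)


lsq = [Store(0, 0x1234, 11), Store(1, 0x5234, 22), Store(2, 0x1234, 33)]
dcache = {0x1234: 99, 0x9000: 7}
forwarded = resolve_load(lsq, dcache, load_age=3, base=0x1000, offset=0x234)
from_cache = resolve_load(lsq, dcache, load_age=3, base=0x9000, offset=0x0)
```

Here `forwarded` takes its data from the store of age 2, and `from_cache` reads the data cache because no store in the queue shares the load's lower bits.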
LOAD or STORE instructions; Clear instruction · CPC title
the data cache being concurrently physically addressed · CPC title