Technology for dynamically tuning processor features
US-10915421-B1 · Feb 9, 2021 · US
US11656971B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11656971-B2 |
| Application number | US-202217582051-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 24, 2022 |
| Priority date | Sep 19, 2019 |
| Publication date | May 23, 2023 |
| Grant date | May 23, 2023 |
Official abstract text for this publication.
A processor comprises a microarchitectural feature and dynamic tuning unit (DTU) circuitry. The processor executes a program for first and second execution windows with the microarchitectural feature disabled and enabled, respectively. The DTU circuitry automatically determines whether the processor achieved worse performance in the second execution window. In response to determining that the processor achieved worse performance in the second execution window, the DTU circuitry updates a usefulness state for a selected address of the program to denote worse performance. In response to multiple consecutive determinations that the processor achieved worse performance with the microarchitectural feature enabled, the DTU circuitry automatically updates the usefulness state to denote a confirmed bad state. In response to the usefulness state denoting the confirmed bad state, the DTU circuitry automatically disables the microarchitectural feature for the selected address for execution windows after the second execution window. Other embodiments are described and claimed.
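The update rule in the abstract can be sketched as a small state machine. This is a minimal illustration, not the patented circuit: the state names, the threshold `CONFIRM_AFTER = 2` (the abstract says only "multiple consecutive determinations"), and the choice to fall back to a neutral state after a window pair that was not worse are all assumptions made for the example.

```python
from enum import Enum


class Usefulness(Enum):
    NEUTRAL = 0        # no evidence against the feature (assumed starting point)
    WORSE = 1          # last window pair showed worse performance
    CONFIRMED_BAD = 2  # enough consecutive worse results; feature stays off


# Assumption: the abstract says "multiple"; 2 is chosen only for illustration.
CONFIRM_AFTER = 2


def update(state, streak, worse):
    """Return (new_state, new_streak) after one disabled/enabled window pair.

    `worse` is True when the window with the feature enabled performed
    worse than the window with it disabled.
    """
    if state is Usefulness.CONFIRMED_BAD:
        return state, streak               # confirmed bad is sticky in this sketch
    if not worse:
        return Usefulness.NEUTRAL, 0       # assumption: a non-worse result breaks the streak
    streak += 1
    if streak >= CONFIRM_AFTER:
        return Usefulness.CONFIRMED_BAD, streak
    return Usefulness.WORSE, streak


def feature_enabled(state):
    """The feature is disabled for the address only once the state is confirmed bad."""
    return state is not Usefulness.CONFIRMED_BAD


# One selected address observed over four window pairs.
state, streak = Usefulness.NEUTRAL, 0
for worse in (True, False, True, True):
    state, streak = update(state, streak, worse)
```

In this run the isolated first worse result does not disable the feature; only the two consecutive worse results at the end move the address to the confirmed bad state.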
Opening claim text (preview).
What is claimed is:
1. A processor comprising: a first cache; a second cache coupled to the first cache; an arithmetic logic unit (ALU) to perform arithmetic operations; a circuit coupled to the ALU, wherein after the processor has executed a workload for a first execution window with a microarchitectural feature disabled and for a second execution window with the microarchitectural feature enabled, the circuit is to: determine whether the processor achieved worse performance in the second execution window, relative to the first execution window; and in response to a determination that the processor achieved the worse performance in the second execution window, update a state for an address associated with an instruction towards a bad final state, wherein when the state for the address reaches the bad final state, the processor is to disable the microarchitectural feature for the address associated with the instruction.
2. The processor of claim 1, wherein the circuit, in response to at least two consecutive determinations that the processor achieved the worse performance with the microarchitectural feature enabled, is to update the state for the address associated with the instruction to the bad final state.
3. The processor of claim 1, wherein the circuit comprises a finite state machine.
4. The processor of claim 1, further comprising memory to store a table, the table comprising a plurality of entries, wherein each of the plurality of entries comprises: an address field to store address information of an address of an instruction; a state field to store a state for the address of the instruction; and a counter field to store an involved count associated with the instruction.
5. The processor of claim 4, wherein the circuit is to reset at least the state field of the plurality of entries of the table periodically.
6. The processor of claim 4, wherein the circuit is to initialize the state for the address of the instruction to a neutral state.
7. The processor of claim 1, wherein the circuit is to control the state for the address associated with the instruction to be one of the following states: a good final state, a good state, a neutral state, a bad state, and the bad final state.
8. The processor of claim 1, further comprising a retired instruction counter to maintain a count of retired instructions.
9. The processor of claim 8, further comprising: a current cycle counter to maintain a first count of instructions of the first execution window; and a previous cycle counter to maintain a second count of instructions of the second execution window, wherein the circuit is to determine the worse performance using the first count of instructions and the second count of instructions.
10. The processor of claim 1, wherein the microarchitectural feature comprises branch predication.
11. A method comprising: determining, in a circuit of a processor, whether the processor achieved worse performance in a second execution window, relative to a first execution window, wherein a microarchitectural feature of the processor is disabled for the first execution window and is enabled for the second execution window; in response to a determination that the processor achieved the worse performance in the second execution window, updating a state for an address associated with an instruction towards a bad final state; and when the state for the address reaches the bad final state, disabling the microarchitectural feature for the address associated with the instruction.
12. The method of claim 11, further comprising in response to at least two consecutive determinations that the processor achieved the worse performance with the microarchitectural feature enabled, updating the state for the address associated with the instruction to the bad final state.
13. The method of claim 11, further comprising storing a table in a memory, the table comprising a plurality of entries, wherein each of the plurality of entries comprises: an address field to store address information of an address of an instruction; a state field to store a state for the address of the instruction; and a counter field to store an involved count associated with the instruction.
14. The method of claim 13, further comprising resetting at least the state field of the plurality of entries of the table periodically.
15. The method of claim 13, further comprising initializing the state for the address of the instruction to a neutral state.
16. A non-transitory machine-readable medium comprising at least one instruction, which when executed by a processor, causes the processor to: determine whether the processor achieved worse performance in a second execution window, relative to a first execution window, wherein a microarchitectural feature of the processor is disabled for the first execution window and is enabled for the second execution window; in response to a determination that the processor achieved the worse performance in the second execution window, update a state for an address associated with an instruction towards a bad final state; and when the state for the address reaches the bad final state, disable the microarchitectural feature for the address associated with the instruction.
17. The non-transitory storage machine-readable medium of claim 16, wherein the at least one instruction, when executed by the processor, causes the processor, in response to at least two consecutive determinations that the processor achieved the worse performance with the microarchitectural feature enabled, to update the state for the address associated with the instruction to the bad final state.
18. The non-transitory storage machine-readable medium of claim 16, wherein the at least one instruction, when executed by the processor, causes the processor to store a table in a memory, the table comprising a plurality of entries, wherein each of the plurality of entries comprises: an address field to store address information of an address of an instruction; a state field to store a state for the address of the instruction; and a counter field to store an involved count associated with the instruction.
19. The non-transitory storage machine-readable medium of claim 18, wherein the at least one instruction, when executed by the processor, causes the processor to reset at least the state field of the plurality of entries of the table periodically.
20. The non-transitory storage machine-readable medium of claim 16, wherein the at least one instruction, when executed by the processor, causes the processor to disable the microarchitectural feature comprising branch predication.
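The table of claims 4 to 7 (and 13 to 15, 18 and 19) can be pictured with a short software model. This is a hedged sketch rather than the claimed hardware: `TuningTable`, `Entry`, and `record` are hypothetical names, the counts are assumed to be cycles taken to complete equal-sized windows (so a larger second count means worse performance), the involved count is assumed to tally window pairs that involved the address, and the step towards the good states on a better result is an assumption the claims do not spell out.

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    # The five states listed in claim 7, ordered from worst to best.
    BAD_FINAL = -2
    BAD = -1
    NEUTRAL = 0
    GOOD = 1
    GOOD_FINAL = 2


@dataclass
class Entry:
    address: int                   # address field
    state: State = State.NEUTRAL   # state field, initialized to neutral (claim 6)
    involved_count: int = 0        # counter field; its exact meaning is assumed here


class TuningTable:
    def __init__(self):
        self.entries = {}

    def lookup(self, address):
        """Return the entry for an address, creating a neutral one if absent."""
        return self.entries.setdefault(address, Entry(address))

    def record(self, address, first_count, second_count):
        """Update one entry after a disabled (first) and enabled (second) window.

        Assumption: both counts are cycle counts for equal amounts of work.
        """
        entry = self.lookup(address)
        entry.involved_count += 1
        if entry.state in (State.BAD_FINAL, State.GOOD_FINAL):
            return entry                              # final states are sticky in this sketch
        if second_count > first_count:
            entry.state = State(entry.state - 1)      # step towards the bad final state
        elif second_count < first_count:
            entry.state = State(entry.state + 1)      # assumed step towards the good states
        return entry

    def feature_enabled(self, address):
        """The feature is disabled for an address only in the bad final state."""
        return self.lookup(address).state is not State.BAD_FINAL

    def reset_states(self):
        """Periodic reset of at least the state field of every entry (claim 5)."""
        for entry in self.entries.values():
            entry.state = State.NEUTRAL


table = TuningTable()
table.record(0x40, first_count=100, second_count=120)  # worse: NEUTRAL -> BAD
table.record(0x40, first_count=100, second_count=130)  # worse again: BAD -> BAD_FINAL
```

With single-step transitions from a neutral start, two consecutive worse determinations reach the bad final state, which is consistent with the "at least two consecutive determinations" wording of claims 2, 12, and 17. The periodic reset gives an address another chance if program behavior changes later.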
Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available (error or fault processing without redundancy G06F11/0703; error detection or correction by redundancy in data representation G06F11/08; error detection or correction of the data by redundancy in operations G06F11/14; error detection or correction by redundancy in hardware G06F11/16) · CPC title
Monitoring specific for caches · CPC title
Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title
for multiple contexts · CPC title
where the computing system component is a central processing unit [CPU] · CPC title