US10452401B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10452401-B2 |
| Application number | US-201715463511-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 20, 2017 |
| Priority date | Mar 20, 2017 |
| Publication date | Oct 22, 2019 |
| Grant date | Oct 22, 2019 |
Abstract
Techniques are disclosed relating to selecting store instructions for dispatch to a shared pipeline. In some embodiments, the shared pipeline processes instructions for different target clients with different data rate capabilities. Therefore, in some embodiments, the pipeline is configured to generate state information that is based on a determined amount of work in the pipeline that targets at least one slower target. In some embodiments, the state information indicates whether the amount of work is above a threshold for the particular target. In some embodiments, scheduling circuitry is configured to select instructions for dispatch to the pipeline based on the state information. For example, the scheduling circuitry may refrain from selecting instructions with a slower target when the slower target is above its threshold amount of work in the pipeline. In some embodiments, the shared pipeline is a store pipeline configured to execute store instructions that target memories with different data rate capabilities.
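The gating behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names `Store`, `StorePipeline`, `slow_blocked`, and `select`, the "fast"/"slow" labels, and the use of a per-store cycle count as the measure of work are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Store:
    target: str  # "fast" or "slow" (hypothetical labels for the two targets)
    cycles: int  # assumed amount of work, in bus cycles, for this store


class StorePipeline:
    """Toy model of a shared pipeline that tracks in-flight work per target."""

    def __init__(self, slow_threshold: int):
        self.slow_threshold = slow_threshold  # assumed tuning parameter
        self.in_flight = []

    def slow_work(self) -> int:
        # Total work currently in the pipeline that targets the slower target.
        return sum(s.cycles for s in self.in_flight if s.target == "slow")

    def slow_blocked(self) -> bool:
        # State information: True when slow-target work is above the threshold.
        return self.slow_work() > self.slow_threshold

    def dispatch(self, store: Store) -> None:
        self.in_flight.append(store)


def select(queue, pipeline):
    """Remove and return the first eligible store, or None if none is eligible.

    Stores with a slow target are skipped while the pipeline reports that
    slow-target work is above its threshold.
    """
    blocked = pipeline.slow_blocked()
    for i, store in enumerate(queue):
        if store.target == "slow" and blocked:
            continue
        del queue[i]
        return store
    return None
```

In this model, a scheduler that calls `select` each cycle keeps issuing fast-target stores while the slow target is saturated, so the slower target does not stall the shared pipeline.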
Claims
What is claimed is:
1. An apparatus, comprising: scheduling circuitry; first and second target storage elements, wherein the first target storage element is configured to receive data at a first data rate that is greater than a second data rate at which the second target storage element is configured to receive data; and store pipeline circuitry configured to: use the same hardware store pipeline to process store instructions to provide data to the first target storage element and store instructions to provide data to the second target storage element as specified by the processed store instructions; and generate state information that is based on a determined amount of work to be performed for store instructions in the hardware store pipeline that target the second target storage element; wherein the scheduling circuitry is configured to select store instructions for dispatch to the store pipeline circuitry based on the state information.
2. The apparatus of claim 1, wherein the store pipeline circuitry is configured to determine the amount of work as a number of memory bus cycles to be used to perform store instructions in the hardware store pipeline that target the second target storage element.
3. The apparatus of claim 2, wherein the store pipeline circuitry is configured to predict the number of memory bus cycles based on numbers of memory bus cycles used for one or more prior instructions that targeted the second target storage element.
4. The apparatus of claim 2, wherein the determined amount of work is a number of memory bus cycles to be used for instructions in the hardware store pipeline that target the second target storage element divided by memory bus cycles to be used for instructions in the hardware store pipeline.
5. The apparatus of claim 1, wherein the state information includes a bit that indicates, based on the determined amount of work, whether store instructions that target the second target storage element should be dispatched.
6. The apparatus of claim 1, wherein the state information indicates whether the determined amount of work exceeds a threshold.
7. The apparatus of claim 6, wherein the threshold is based on a ratio between the second data rate and the first data rate.
8. The apparatus of claim 1, wherein the first target storage element is a local memory and the second target storage element is a cache in a global memory hierarchy.
9. The apparatus of claim 1, wherein the scheduling circuitry includes first and second queues for store instructions that target the respective first and second target storage elements and wherein the scheduling circuitry is configured to select a queue to provide one or more store instructions for dispatch in a given cycle based on the state information.
10. A method, comprising: executing, by the same hardware store pipeline of store pipeline circuitry, store instructions that respectively target a first storage element and a second storage element, wherein the first storage element is configured to receive data at a first data rate that is greater than a second data rate at which the second storage elements is configured to receive data; generating, by the store pipeline circuitry, state information that is based on a determined amount of work to be performed for store instructions in the hardware store pipeline that target the second storage element; selecting, by scheduling circuitry, store instructions for dispatch to the store pipeline circuitry based on the state information.
11. The method of claim 10, wherein the amount of work is determined as a number of memory bus cycles for store instructions in the hardware store pipeline that target the second storage element.
12. The method of claim 10, further comprising predicting the amount of work based on prior work involved in execution of completed store instructions that targeted the second storage element.
13. The method of claim 12, further comprising: determining, by the store pipeline, respective numbers of bus cycles needed to perform ones of the store instructions.
14. The method of claim 10, wherein the amount of work is a percentage of total work in the hardware store pipeline.
15. The method of claim 10, wherein the state information includes a bit that indicates whether store instructions that target the second storage element should be dispatched.
16. A non-transitory computer readable storage medium having stored thereon design information that specifies a design of at least a portion of a hardware integrated circuit in a format recognized by a semiconductor fabrication system that is configured to use the design information to produce the circuit according to the design, including: scheduling circuitry; first and second storage elements, wherein the first storage element is configured to receive data at a first data rate that is greater than a second data rate of the second storage element; and store pipeline circuitry configured to: use the same hardware store pipeline to process store instructions to provide data to the first storage element and store instructions to provide data to the second storage element; and generate state information that is based on a determined amount of work to be performed for store instructions in the hardware store pipeline that target the second storage element; wherein the design information specifies that the scheduling circuitry is configured to select store instructions for dispatch to the store pipeline circuitry based on the state information.
17. The non-transitory computer readable storage medium of claim 16, wherein the design information further specifies that the store pipeline circuitry is configured to determine the amount of work as a number of memory bus cycles to be used to perform store instructions in the hardware store pipeline that target the second storage element.
18. The non-transitory computer readable storage medium of claim 17, the store pipeline circuitry is configured to predict the number of memory bus cycles based on numbers of memory bus cycles used for one or more completed instructions that targeted the second storage element.
19. The non-transitory computer readable storage medium of claim 17, wherein the number of memory bus cycles is a predicted number.
20. The non-transitory computer readable storage medium of claim 16, wherein the first storage element is configured to store pixel data and the second storage element is configured to store vertex data.
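Claims 2 through 7 describe measuring work in memory bus cycles, predicting it from prior instructions, expressing it as a fraction of total pipeline work, and comparing it with a threshold based on the ratio of the two data rates. The following sketch shows one way these quantities could fit together. The claims do not specify a predictor or an exact threshold formula, so the mean-of-prior-observations predictor, the threshold equal to `slow_rate / fast_rate`, and all function and parameter names here are assumptions made for illustration.

```python
def predict_cycles(prior_cycles, default=1):
    """Predict bus cycles for the next slow-target store.

    Assumption: the prediction is the mean of cycle counts observed for
    prior stores to the same target, with a fallback when none exist.
    """
    if not prior_cycles:
        return default
    return sum(prior_cycles) / len(prior_cycles)


def slow_fraction(slow_cycles, total_cycles):
    """Slow-target bus cycles divided by all bus cycles in the pipeline."""
    if total_cycles == 0:
        return 0.0
    return slow_cycles / total_cycles


def dispatch_bit(slow_cycles, total_cycles, slow_rate, fast_rate):
    """Return True when slow-target stores should still be dispatched.

    Assumption: the threshold is exactly the ratio of the slower data rate
    to the faster one; the claims only say it is "based on" that ratio.
    """
    threshold = slow_rate / fast_rate
    return slow_fraction(slow_cycles, total_cycles) <= threshold
```

For example, with a slow target that accepts data at one quarter of the fast target's rate, `dispatch_bit(3, 12, 1, 4)` allows dispatch because slow-target work is 25% of the pipeline, while `dispatch_bit(6, 12, 1, 4)` does not.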
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title
Operand accessing · CPC title
using instruction pipelines · CPC title
LOAD or STORE instructions; Clear instruction · CPC title