Very-long instruction word (VLIW) processor and compiler for executing instructions in parallel
US-9697004-B2 · Jul 4, 2017 · US
US11900124B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11900124-B2 |
| Application number | US-202318092712-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 3, 2023 |
| Priority date | May 24, 2013 |
| Publication date | Feb 13, 2024 |
| Grant date | Feb 13, 2024 |
Official abstract text for this publication.
Various embodiments are disclosed of a multiprocessor system with processing elements optimized for high performance and low power dissipation and an associated method of programming the processing elements. Each processing element may comprise a fetch unit and a plurality of address generator units and a plurality of pipelined datapaths. The fetch unit may be configured to receive a multi-part instruction, wherein the multi-part instruction includes a plurality of fields. First and second address generator units may generate, based on different fields of the multi-part instruction, addresses from which to retrieve first and second data for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction. The execution units may perform operations using a single pipeline or multiple pipelines based on third and fourth fields of the multi-part instruction.
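The routing described in the abstract, where different fields of one multi-part instruction drive different hardware blocks, can be pictured with a minimal sketch. Everything concrete here is an assumption made for illustration: the 8-bit field width, the packing order, the base-plus-offset address rule, and the names `Fields`, `decode`, `agu`, and `issue` do not come from the patent text.

```python
from typing import Dict, List, NamedTuple

FIELD_BITS = 8                       # assumed width; the source gives no encoding
FIELD_MASK = (1 << FIELD_BITS) - 1


class Fields(NamedTuple):
    agu0: int       # first field: drives the first address generator unit
    agu1: int       # second field: drives the second address generator unit
    op_single: int  # third field: operation performed on a single pipeline
    op_multi: int   # fourth field: operation spread over two or more pipelines


def decode(word: int) -> Fields:
    """Split a packed 32-bit word into four fields, lowest bits first."""
    return Fields(*((word >> (i * FIELD_BITS)) & FIELD_MASK for i in range(4)))


def agu(base: int, field: int) -> int:
    """Hypothetical address generator: a base value plus the field as offset."""
    return base + field


def issue(word: int, memory: List[int], base0: int, base1: int) -> Dict[str, int]:
    """Fetch-side view: both AGUs work from the same instruction word."""
    f = decode(word)
    # Each AGU reads its own field, so the two loads are independent.
    r0 = memory[agu(base0, f.agu0)]
    r1 = memory[agu(base1, f.agu1)]
    # The operation fields are only passed along; execution is not modeled here.
    return {"r0": r0, "r1": r1, "op_single": f.op_single, "op_multi": f.op_multi}
```

Calling `issue` with a packed word returns the two loaded register values together with the two operation selectors, which mirrors the split between address generation (first and second fields) and execution (third and fourth fields).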
Opening claim text (preview).
What is claimed is:

1. An apparatus, comprising: one or more execution units that include multiple pipelines; a fetch unit configured to fetch a multi-part instruction from an instruction stream, wherein the multi-part instruction includes a plurality of fields; and a plurality of address generator units; wherein a first address generator unit of the plurality of address generator units is configured to generate, based on a first field of the multi-part instruction, an address from which to retrieve first data to be stored in a first register for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction; wherein a second address generator unit of the plurality of address generator units is configured to generate, based on a second field of the multi-part instruction, an address from which to retrieve second data to be stored in a second register for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction; and wherein the one or more execution units are configured to: perform a first operation, operating on operands from a plurality of registers, using a first pipeline of the multiple pipelines, based on a third field of the plurality of fields; and perform a second operation, operating on operands from the plurality of registers, using at least two pipelines of the multiple pipelines in parallel, based on a fourth field of the plurality of fields, wherein performing the second operation using the at least two pipelines in parallel comprises generating, by a second pipeline of the at least two pipelines, a first set of partial products and generating, by a third pipeline of the at least two pipelines, a second set of partial products dependent upon at least one partial product of the first set of partial products.
2. The apparatus of claim 1, wherein the first and second address generator units are independently controlled.
3. The apparatus of claim 1, wherein the first and second operations are for different threads of execution.
4. The apparatus of claim 1, wherein an execution unit is configured to forward a result operand for storage in a register.
5. The apparatus of claim 1, wherein the first address generator unit is configured to generate an address at which to store a result operand generated by an execution unit.
6. The apparatus of claim 1, wherein generating the first set of partial products comprises compressing a first number of partial products to a second number of partial products, wherein the second number is less than the first number.
7. The apparatus of claim 1, wherein each pipeline of the multiple pipelines comprises a plurality of respective multiplier units and adder units.
8. A method, comprising: fetching, by a fetch unit of a processor, a multi-part instruction from an instruction stream, wherein the multi-part instruction includes a plurality of fields; generating, by a first address generator unit of the processor based on a first field of the multi-part instruction, an address from which to retrieve first data to be stored in a first register for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction; generating by a second address generator unit of the processor based on a second field of the multi-part instruction, an address from which to retrieve second data to be stored in a second register for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction; performing a first operation that operates on operands from a plurality of registers, by one or more execution units, using a first pipeline of a plurality of pipelines, based on a third field of the plurality of fields; and performing a second operation that operates on operands from a plurality of registers, by one or more execution units, using at least two pipelines of the plurality of pipelines in parallel, based on a fourth field of the plurality of fields, wherein performing the second operation using the at least two pipelines in parallel comprises generating, by a second pipeline of the at least two pipelines, a first set of partial products and generating, by a third pipeline of the at least two pipelines, a second set of partial products dependent upon at least one partial product of the first set of partial products.
9. The method of claim 8, wherein the first data and the second data are input operands for different threads of execution.
10. The method of claim 8, further comprising forwarding, by an execution unit, a result operand for storage in one of multiple datapath registers.
11. The method of claim 8, further comprising: independently performing, by two or more execution units of the processor, first and second respective math operations based on different fields of a multi-part instruction.
12. The method of claim 8, wherein generating the first set of partial products comprises compressing a first number of partial products to a second number of partial products, wherein the second number is less than the first number.
13. The method of claim 8, wherein each pipeline of the multiple pipelines comprises a plurality of respective multiplier units and adder units.
14. A non-transitory computer-readable medium having instructions stored thereon that are executable by a computing device to perform operations comprising: fetching a multi-part instruction from an instruction stream, wherein the multi-part instruction includes a plurality of fields; generating, based on a first field of the multi-part instruction, an address from which to retrieve first data to be stored in a first register for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction; generating, based on a second field of the multi-part instruction, an address from which to retrieve second data to be stored in a second register for use by an execution unit for the multi-part instruction or a subsequent multi-part instruction; performing a first operation that operates on operands from a plurality of registers, based on a third field of the plurality of fields; and performing a second operation that operates on operands from a plurality of registers using at least two pipelines of a plurality of pipelines in parallel, based on a fourth field of the plurality of fields, wherein performing the second operation using the at least two pipelines in parallel comprises generating, by a first pipeline of the at least two pipelines, a first set of partial products and generating, by a second pipeline of the at least two pipelines, a second set of partial products dependent upon at least one partial product of the first set of partial products.
15. The non-transitory computer-readable medium of claim 14, wherein the multi-part instruction indicates independent control of first and second address generator units that generate the addresses from which to retrieve the first and second data.
16. The non-transitory computer-readable medium of claim 14, wherein the first and second operations are for different threads of execution.
17. The non-transitory computer-readable medium of claim 14, wherein the first and second operations further comprise generating, based on the multi-part instruction, an address at which to store a result operand generated by an execution unit.
18. The non-transitory computer-readable medium of claim 14, wherein generating the first set of partial products comprises compressing a first number of partial products to a second number of partial products, wherein
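The independent claims recite a second operation in which one pipeline generates a first set of partial products and another pipeline generates a second set that depends on a partial product of the first set, and several dependent claims add compression of partial products to a smaller number. The sketch below is one illustrative reading of that arrangement for an unsigned multiply, not the circuit disclosed in the patent. The split of the multiplier into low and high halves, the 3:2 carry-save compressor, and all function names are assumptions.

```python
from typing import List, Tuple


def partial_products(a: int, b: int, lo: int, hi: int) -> List[int]:
    """One shifted copy of `a` per set bit of `b` in bit positions [lo, hi)."""
    return [a << i for i in range(lo, hi) if (b >> i) & 1]


def compress_3_to_2(x: int, y: int, z: int) -> Tuple[int, int]:
    """Carry-save step: three addends become two with the same total."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def compress(pps: List[int]) -> List[int]:
    """Apply 3:2 compression until at most two values remain."""
    pps = list(pps)
    while len(pps) > 2:
        s, c = compress_3_to_2(pps.pop(), pps.pop(), pps.pop())
        pps += [s, c]
    return pps


def multiply_two_pipelines(a: int, b: int, width: int = 16) -> int:
    """Unsigned a * b for 0 <= b < 2**width, split across two modeled pipelines."""
    half = width // 2
    # Pipeline A (hypothetical): partial products for the low half of b.
    first_set = compress(partial_products(a, b, 0, half))
    # Pipeline B (hypothetical): high-half partial products plus one value
    # forwarded from pipeline A, so its output depends on the first set.
    forwarded = first_set[:1]
    second_set = compress(partial_products(a, b, half, width) + forwarded)
    # Final add of whatever pipeline A did not forward and pipeline B's output.
    return sum(second_set) + sum(first_set[1:])
```

Because each compression step preserves the total of its inputs, the final sum equals the ordinary product; the two stages run one after the other here, whereas the claims describe pipelines operating in parallel.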
to perform operations for flow control · CPC title
for indirect branch instructions · CPC title
from multiple instruction streams, e.g. multistreaming · CPC title
for complex operations, e.g. multidimensional or interleaved address generators, macros · CPC title
Power or thermal control instructions · CPC title