US10908899B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10908899-B2 |
| Application number | US-201916376557-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 5, 2019 |
| Priority date | Apr 12, 2018 |
| Publication date | Feb 2, 2021 |
| Grant date | Feb 2, 2021 |
Official abstract text for this publication.
A code conversion apparatus includes a memory and a processor coupled to the memory. The memory is configured to store therein a first code including a first data definition of a plurality of arrays, a first operation for the plurality of arrays, and a second data definition of an array indicating a result of the first operation. The processor is configured to convert the first data definition and the second data definition included in the first code into a data definition of an array of structures. The processor is configured to convert the first operation included in the first code into a second operation for the array of structures. The processor is configured to generate a second code including a predetermined instruction to perform the second operation on different pieces of data of the plurality of arrays in parallel with one another.
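As a rough illustration of the conversion the abstract describes, the sketch below turns several separate arrays into one array of structures and rewrites an operation to match. It is only a sketch under stated assumptions: the array names `a`, `b`, `c`, the helper names, and the choice of element-wise addition as the "first operation" are all hypothetical, since the abstract names none of them.

```python
from typing import Dict, List


def soa_to_aos(arrays: Dict[str, List[float]]) -> List[Dict[str, float]]:
    """Convert separate arrays (one per field) into one array of structures."""
    names = list(arrays)
    length = len(arrays[names[0]])
    if any(len(arrays[n]) != length for n in names):
        raise ValueError("all arrays must have the same length")
    # Element i of every array is grouped into structure i.
    return [{n: arrays[n][i] for n in names} for i in range(length)]


def first_operation(a: List[float], b: List[float], c: List[float]) -> None:
    """Hypothetical 'first operation' on the separate arrays: c[i] = a[i] + b[i]."""
    for i in range(len(c)):
        c[i] = a[i] + b[i]


def second_operation(aos: List[Dict[str, float]]) -> None:
    """The same operation rewritten against the array of structures."""
    for s in aos:
        s["c"] = s["a"] + s["b"]


# Assumed input data; the result array c is part of the structure as well.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
c = [0.0] * 4

aos = soa_to_aos({"a": a, "b": b, "c": c})
first_operation(a, b, c)      # original form
second_operation(aos)         # converted form
result_from_aos = [s["c"] for s in aos]
```

Both forms compute the same values. The difference is layout: in the array of structures, the operands and the result for one index sit next to each other.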
Opening claim text (preview).
What is claimed is:
1. A code conversion apparatus, comprising: a memory configured to store therein a first code including a first data definition of a plurality of arrays, a first operation for the plurality of arrays, and a second data definition of an array indicating a result of the first operation; and a processor coupled to the memory and the processor configured to: convert the first data definition and the second data definition included in the first code into a data definition of an array of structures; convert the first operation included in the first code into a second operation for the array of structures; and generate a second code including a predetermined instruction to perform the second operation on different pieces of data of the plurality of arrays in parallel with one another.
2. The code conversion apparatus according to claim 1, wherein the predetermined instruction includes an instruction to perform the second operation on data of a plurality of arrays stored in a plurality of registers and write data of an array indicating a result of the second operation into a predetermined register, and the second code further includes: an instruction to read data of a plurality of arrays of structures successively stored in the memory and write data of the plurality of arrays included in each of the plurality of arrays of structures into the plurality of registers; and an instruction to read the data of the array indicating the result of the second operation from the predetermined register and write the read data to the memory at a position of the data of the array indicating the result of the second operation, wherein the data of the array indicating the result of the second operation is included in each of the plurality of arrays of structures stored in the memory.
3. The code conversion apparatus according to claim 1, wherein the predetermined instruction includes an instruction to perform the second operation on data of arrays stored in a plurality of registers and write data of an array indicating a result of the second operation into a predetermined register, and the second code further includes: an instruction to read data of a plurality of arrays of structures successively stored in the memory and successively write the read data into a first register; an instruction to specify positions of data of a same array included in data of each of the plurality of arrays of structures, to read data at the specified positions from the first register, and to successively write the read data into one of the plurality of registers; and an instruction to read the data of the array indicating the result of the second operation from the predetermined register, specify a position of the data of the array indicating the result of the second operation among data of each of the plurality of arrays of structures stored in the memory, and write the data read from the predetermined register at the specified position into the memory.
4. The code conversion apparatus according to claim 1, wherein the processor is further configured to: select, in accordance with a compiler option indicating conversion of a data definition, the plurality of arrays and the array indicating the result of the first operation from the arrays defined in the first code.
5. The code conversion apparatus according to claim 1, wherein the first code includes a control statement for specifying the plurality of arrays and the array indicating the result of the first operation, and the processor is further configured to: select, in accordance with the control statement, the plurality of arrays and the array indicating the result of the first operation from the arrays defined in the first code.
6. The code conversion apparatus according to claim 1, wherein the processor is further configured to: select, using profile information indicating an access frequency of each of the arrays defined in the first code, the plurality of arrays and the array indicating the result of the first operation from the arrays defined in the first code.
7. The code conversion apparatus according to claim 1, wherein the processor is further configured to: select the plurality of arrays from among arrays within a loop included in the first code based on a number of appearances of each array in the loop or a number of appearances of each array in a group of arrays having a same subscript in the loop.
8. The code conversion apparatus according to claim 1, wherein the first data definition included in the first code is a data definition of a structure of arrays.
9. A method for improving performance in computer operations, the method comprising: converting, by a computer, a first data definition, included in a first code, of a plurality of data arrays into a second data definition of an array of structures (AoS), wherein the plurality of data arrays include data pieces stored in a memory; converting a first operation, also included in the first code, on the plurality of data arrays into a second operation for the AoS; generating a second code to perform the second operation on the plurality of data arrays based on the second data definition of the AoS; and enabling the second code to execute a process including: successively reading the data pieces from the memory; loading the read data pieces into a plurality of registers in a vertical direction, with each successive data piece loading into a different one of the plurality of registers; performing the second operation involving data pieces loaded in at least two of plurality of registers in parallel with one another; writing resulting data indicating a result of the second operation in another one of plurality of registers that is different from the at least two registers; and storing the resulting data from the another register back in the memory at a predetermined location different from locations used to store the data pieces involved in the second operation.
10. A non-transitory computer-readable recording medium having stored therein a program that causes a computer to execute a process, the process comprising: converting a first data definition of a plurality of arrays and a second data definition of an array indicating a result of a first operation for the plurality of arrays into a data definition of an array of structures, wherein the first data definition, the second data definition, and the first operation are included in a first code; converting the first operation included in the first code into a second operation for the array of structures; and generating a second code including a predetermined instruction to perform the second operation on different pieces of data of the plurality of arrays in parallel with one another.
11. The non-transitory computer-readable recording medium according to claim 10, wherein the predetermined instruction is an instruction to perform the second operation on data of arrays stored in a plurality of registers and write data of an array indicating a result of the second operation into a predetermined register, and the second code further includes: an instruction to read data of a plurality of arrays of structures successively stored in the memory and write data of the plurality of arrays included in each of the plurality of arrays of structures into the plurality of registers; and an instruction to read the data of the array indicating the result of the second operation from the predetermined register and write the read data to the memory at a position of the data of the array indicating the result of the second operation included in each of the plurality of arrays of structures stored in the memory.
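The load, compute, and store steps recited in claims 2, 9, and 11 can be emulated in plain Python to show the data movement. This is a sketch under assumptions, not the patented implementation: registers are modelled as lists, the structure layout `(a, b, c)`, the constants `FIELDS` and `LANES`, and all function names are hypothetical, and "loading in a vertical direction" is read here as sending each successive data piece to a different register.

```python
FIELDS = 3   # hypothetical structure layout: (a, b, c)
LANES = 4    # hypothetical number of elements held by one register


def load_structures(memory, base):
    """Read LANES structures successively; piece k goes to register k % FIELDS."""
    registers = [[] for _ in range(FIELDS)]
    for k in range(FIELDS * LANES):
        registers[k % FIELDS].append(memory[base + k])
    return registers


def add_lanes(reg_x, reg_y):
    """Emulate one SIMD add: all lanes are produced by the same operation."""
    return [x + y for x, y in zip(reg_x, reg_y)]


def store_result(memory, base, result, field_index):
    """Write each lane of the result register to its field slot in memory."""
    for lane, value in enumerate(result):
        memory[base + lane * FIELDS + field_index] = value


# Array of structures stored successively in memory: a0, b0, c0, a1, b1, c1, ...
memory = [1, 10, 0, 2, 20, 0, 3, 30, 0, 4, 40, 0]

reg_a, reg_b, _ = load_structures(memory, 0)   # operands land in separate registers
reg_c = add_lanes(reg_a, reg_b)                # result goes to a different register
store_result(memory, 0, reg_c, field_index=2)  # result is written to the c slots
```

After the store, each structure in `memory` holds its own result next to its operands, which is the placement the claims describe for the result array.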
controlled by a single instruction for multiple data lanes [SIMD] · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Special purpose registers · CPC title
having multiple operands in a single register · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title