Compiler for translating between a virtual image processor instruction set architecture (ISA) and target hardware having a two-dimensional shift array structure

US10095492B2 · US · B2

Patent metadata
Publication number: US-10095492-B2
Application number: US-201715591960-A
Country: US
Kind code: B2
Filing date: May 10, 2017
Priority date: Apr 23, 2015
Publication date: Oct 9, 2018
Grant date: Oct 9, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method is described that includes translating higher level program code including higher level instructions having an instruction format that identifies pixels to be accessed from a memory with first and second coordinates from an orthogonal coordinate system into lower level instructions that target a hardware architecture having an array of execution lanes and a shift register array structure that is able to shift data along two different axes. The translating includes replacing the higher level instructions having the instruction format with lower level shift instructions that shift data within the shift register array structure.
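
The equivalence the abstract relies on can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: the function names (`load_relative`, `shift_x`, `shift_y`, `load_via_shifts`), the zero fill at the array border, and the sign convention for shifts are all assumptions made for illustration. It models one shift register per execution lane and checks that a load of the pixel at a relative (x, y) offset gives the same result as a series of unit shifts along the two axes.

```python
from typing import List

Grid = List[List[int]]


def load_relative(image: Grid, dx: int, dy: int, fill: int = 0) -> Grid:
    """Higher-level view: lane (x, y) reads pixel (x + dx, y + dy) from memory.

    Out-of-range reads return `fill` (an assumption for this sketch).
    """
    h, w = len(image), len(image[0])
    return [
        [image[y + dy][x + dx] if 0 <= x + dx < w and 0 <= y + dy < h else fill
         for x in range(w)]
        for y in range(h)
    ]


def shift_x(plane: Grid, step: int, fill: int = 0) -> Grid:
    """One unit shift along x: every lane receives the value held at x + step."""
    w = len(plane[0])
    return [[row[x + step] if 0 <= x + step < w else fill for x in range(w)]
            for row in plane]


def shift_y(plane: Grid, step: int, fill: int = 0) -> Grid:
    """One unit shift along y: every lane receives the value held at y + step."""
    h, w = len(plane), len(plane[0])
    return [list(plane[y + step]) if 0 <= y + step < h else [fill] * w
            for y in range(h)]


def load_via_shifts(image: Grid, dx: int, dy: int) -> Grid:
    """Lower-level view: the same access expressed only as unit shifts."""
    plane = [row[:] for row in image]  # image data already sits in the shift registers
    for _ in range(abs(dx)):
        plane = shift_x(plane, 1 if dx > 0 else -1)
    for _ in range(abs(dy)):
        plane = shift_y(plane, 1 if dy > 0 else -1)
    return plane


if __name__ == "__main__":
    image = [[10 * y + x for x in range(4)] for y in range(3)]
    # Both views agree for an arbitrary relative offset.
    assert load_via_shifts(image, 1, -1) == load_relative(image, 1, -1)
```

Every lane performs the same shift at the same time, which is why a single shift instruction can stand in for a per-lane memory access in this model.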

First claim

Opening claim text (preview).

The invention claimed is:

1. A computer-implemented method comprising: receiving a first sequence of virtual instructions of a virtual instruction set of a virtual instruction set architecture, wherein the first sequence of virtual instructions includes one or more load instructions having two-dimensional relative addressing, wherein each load instruction comprises a two-dimensional relative address comprising an x-offset and a y-offset, wherein the two-dimensional relative address represents a location in a region of image data relative to a location in the region of image data associated with a virtual processor; receiving, by a compiler, a request to translate the first sequence of virtual instructions of the virtual instruction set into a second sequence of object code instructions of an object code instruction set, wherein object code instructions of the object code instruction set are executable by each processing element of a processor comprising a two-dimensional array of processing elements and a two-dimensional shift-register array, wherein each processing element has a dedicated shift register of the two-dimensional shift-register array, and wherein each processing element is configured to shift data in a respective shift register dedicated to the processing element to another shift register dedicated to another processing element; and generating, by the compiler, the second sequence of object code instructions by translating one or more of the load instructions in the virtual instruction set into one or more shift instructions of the object code instruction set, including translating a respective two-dimensional relative address of each load instruction into one or more corresponding shift offsets and generating the one or more shift instructions of the object code instruction set using the one or more respective shift offsets.

2. The method of claim 1, wherein each shift offset specifies a direction within the two-dimensional shift-register array and a value representing how far data should be shifted along the specified direction within the two-dimension shift-register array.

3. The method of claim 1, wherein translating a load instruction in the virtual instruction set into one or more shift instructions of the object code instruction set comprises: generating a first shift instruction having a first shift offset corresponding to the x-offset of the two-dimensional relative address of the load instruction; and generating a second shift instruction having a second shift offset corresponding to the y-offset of the two-dimensional relative address of the load instruction.

4. The method of claim 1, wherein the first sequence of virtual instructions specifies an order in which each element of image data is accessed, and wherein translating the first sequence of virtual instructions into the second sequence of object code instructions comprises generating a sequence of shift instructions that modifies the order in which each element of image data is accessed.

5. The method of claim 4, wherein the sequence of shift instructions, when executed by the processing elements, cause the processing elements to access elements of image data sequentially multiple times in a same direction.

6. The method of claim 1, wherein the processor comprises a group of processing elements that are oriented along a same row or column in the two-dimensional array of processing elements, and wherein all processing elements in the group share a same memory unit such that only one processing element in the group is able to access the memory unit during a single clock cycle, and wherein translating the first sequence of virtual instructions into the second sequence of object code instructions comprises unrolling a single virtual memory access instruction into multiple object code memory access instructions that are executed sequentially by respective processing elements in the group.

7. The method of claim 6, wherein each object code memory access instruction includes an opcode for a memory operation that specifies which processing element should access the memory unit when executing the object code memory access instruction.

8. The method of claim 6, wherein generating, by the compiler, the second sequence of object code instructions comprises: generating an instruction having multiple instruction opcodes, wherein the instruction comprises: (1) a scalar opcode corresponding to a scalar operation to be performed by a scalar processor of the processor, and (2) a shift opcode corresponding to one or more shift operations to be performed by each of the processing elements in the two-dimensional array of processing elements.

9. The method of claim 8, wherein the instruction, when executed by the scalar processor, causes the scalar processor to broadcast the shift opcode and a shift offset to each of the processing elements in the two-dimensional array of processing elements.

10. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: receiving a first sequence of virtual instructions of a virtual instruction set of a virtual instruction set architecture, wherein the first sequence of virtual instructions includes one or more load instructions having two-dimensional relative addressing, wherein each load instruction comprises a two-dimensional relative address comprising an x-offset and a y-offset, wherein the two-dimensional relative address represents a location in a region of image data relative to a location in the region of image data associated with a virtual processor; receiving, by a compiler, a request to translate the first sequence of virtual instructions of the virtual instruction set into a second sequence of object code instructions of an object code instruction set, wherein object code instructions of the object code instruction set are executable by each processing element of a processor comprising a two-dimensional array of processing elements and a two-dimensional shift-register array, wherein each processing element has a dedicated shift register of the two-dimensional shift-register array, and wherein each processing element is configured to shift data in a respective shift register dedicated to the processing element to another shift register dedicated to another processing element; and generating, by the compiler, the second sequence of object code instructions by translating one or more of the load instructions in the virtual instruction set into one or more shift instructions of the object code instruction set, including translating a respective two-dimensional relative address of each load instruction into one or more corresponding shift offsets and generating the one or more shift instructions of the object code instruction set using the one or more respective shift offsets.

11. The system of claim 10, wherein each shift offset specifies a direction within the two-dimensional shift-register array and a value representing how far data should be shifted along the specified direction within the two-dimension shift-register array.

12. The system of claim 10, wherein translating a load instruction in the virtual instruction set into one or more shift instructions of the object code instruction set comprises: generating a first shift instruction having a first shift offset corresponding to the x-offset of the two-dimensional relative address of the load instruction; and generating a second shift instruction having a second shift offset corresponding to the y-offset of the two-dimensional relative address o
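
The translation step recited in claims 1 to 3 can be sketched as a tiny compiler pass. This is an illustrative sketch only: the instruction classes (`VirtualLoad`, `Shift`, `ReadShiftReg`), the direction labels (`"+x"`, `"-x"`, `"+y"`, `"-y"`), and the choice to track the cumulative displacement of the shift plane so that each load emits only the remaining delta are assumptions, not details taken from the patent. Each load's x-offset and y-offset becomes at most one shift per axis, and each shift carries a direction and a distance as in claim 2.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class VirtualLoad:
    """Hypothetical virtual-ISA load with a 2D relative address."""
    dst: str
    x_offset: int
    y_offset: int


@dataclass(frozen=True)
class Shift:
    """Hypothetical object-code shift: a direction plus a distance."""
    direction: str  # one of "+x", "-x", "+y", "-y" (labels are assumptions)
    amount: int


@dataclass(frozen=True)
class ReadShiftReg:
    """Hypothetical object-code op: copy the lane's shift register into dst."""
    dst: str


ObjectOp = Union[Shift, ReadShiftReg]


def axis_shift(axis: str, delta: int) -> List[Shift]:
    """Turn a signed displacement along one axis into zero or one Shift."""
    if delta == 0:
        return []
    return [Shift(("+" if delta > 0 else "-") + axis, abs(delta))]


def compile_loads(program: List[VirtualLoad]) -> List[ObjectOp]:
    """Translate 2D-relative loads into shifts followed by register reads.

    The pass remembers how far the shift plane has already moved, so a
    later load only pays for the difference from the current position.
    """
    out: List[ObjectOp] = []
    cur_x, cur_y = 0, 0
    for load in program:
        out.extend(axis_shift("x", load.x_offset - cur_x))  # first shift: x-offset
        out.extend(axis_shift("y", load.y_offset - cur_y))  # second shift: y-offset
        out.append(ReadShiftReg(load.dst))
        cur_x, cur_y = load.x_offset, load.y_offset
    return out
```

For example, compiling loads at offsets (1, 0), (1, -2), and (0, 0) in that order yields one x shift, then one y shift, then a pair of shifts that return the plane to its starting position. Reordering the loads so that consecutive accesses move in the same direction, as claims 4 and 5 describe, would reduce the total shift distance; the sketch leaves that reordering out.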

Assignees

Google LLC

Inventors

Classifications

  • G06F8/451 (Primary)

    Code distribution (considering CPU load at run-time G06F9/505; load rebalancing G06F9/5083) · CPC title

  • Reducing the execution time required by the program code · CPC title

  • Instruction analysis, e.g. decoding, instruction word fields · CPC title

  • Arrangements for executing specific machine instructions · CPC title

  • G06F8/441 (Primary)

    Register allocation; Assignment of physical memory space to logical memory space · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10095492B2 cover?
A method is described that includes translating higher level program code including higher level instructions having an instruction format that identifies pixels to be accessed from a memory with first and second coordinates from an orthogonal coordinate system into lower level instructions that target a hardware architecture having an array of execution lanes and a shift register array structu…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F8/451. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 9, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).